libocca / occa

Portable and vendor neutral framework for parallel programming on heterogeneous platforms.
https://libocca.org
MIT License

Free allocations made by initializer #77

Closed jedbrown closed 6 years ago

jedbrown commented 6 years ago

These memory leaks create valgrind noise for users that link to libocca, even if they never call into the library.

==14201== HEAP SUMMARY:
==14201==     in use at exit: 232 bytes in 7 blocks
==14201==   total heap usage: 35 allocs, 28 frees, 111,478 bytes allocated
==14201== 
==14201== 8 bytes in 1 blocks are definitely lost in loss record 2 of 7
==14201==    at 0x4C2D52F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==14201==    by 0x58B2F2F: occa::env::registerFileOpeners() (in /home/jed/src/occa/lib/libocca.so)
==14201==    by 0x58B5B9C: occa::env::initialize() (in /home/jed/src/occa/lib/libocca.so)
==14201==    by 0x400F519: call_init.part.0 (in /usr/lib/ld-2.26.so)
==14201==    by 0x400F625: _dl_init (in /usr/lib/ld-2.26.so)
==14201==    by 0x4000F69: ??? (in /usr/lib/ld-2.26.so)
==14201== 
==14201== 8 bytes in 1 blocks are definitely lost in loss record 3 of 7
==14201==    at 0x4C2D52F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==14201==    by 0x58B2F4C: occa::env::registerFileOpeners() (in /home/jed/src/occa/lib/libocca.so)
==14201==    by 0x58B5B9C: occa::env::initialize() (in /home/jed/src/occa/lib/libocca.so)
==14201==    by 0x400F519: call_init.part.0 (in /usr/lib/ld-2.26.so)
==14201==    by 0x400F625: _dl_init (in /usr/lib/ld-2.26.so)
==14201==    by 0x4000F69: ??? (in /usr/lib/ld-2.26.so)
==14201== 
==14201== 8 bytes in 1 blocks are definitely lost in loss record 4 of 7
==14201==    at 0x4C2D52F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==14201==    by 0x58B2F69: occa::env::registerFileOpeners() (in /home/jed/src/occa/lib/libocca.so)
==14201==    by 0x58B5B9C: occa::env::initialize() (in /home/jed/src/occa/lib/libocca.so)
==14201==    by 0x400F519: call_init.part.0 (in /usr/lib/ld-2.26.so)
==14201==    by 0x400F625: _dl_init (in /usr/lib/ld-2.26.so)
==14201==    by 0x4000F69: ??? (in /usr/lib/ld-2.26.so)
dmed256 commented 6 years ago

Oops, I accidentally pressed 'Close and comment'.

I'll look into it this weekend, thanks! There are also tons of memory leaks in the parser atm, which is being rewritten. Feel free to send any more memory issues :)

dmed256 commented 6 years ago

Just pushed a change that clears the fileOpeners in the global destructor

https://github.com/libocca/occa/blob/1.0/src/tools/env.cpp#L205-L212

I'll test it on the Linux machine soon, but it should be cleaning up the allocations.

Example:

Compiling example [addVectors/cpp]
clang++ -g -o /Users/dsm5/git/night/examples/addVectors/cpp/main  -Wno-deprecated-declarations  /Users/dsm5/git/night/examples/addVectors/cpp/main.cpp -L/Users/dsm5/git/night/lib -I/Users/dsm5/git/night/include    -locca -framework accelerate -framework CoreServices -framework OpenCL
==========o======================o==========================================
 CPU Info | Processor Name       | Intel(R) Core(TM) i5-4260U CPU @ 1.40GHz 
          | Cores                | 4                                        
          | Memory (RAM)         | 4 GB                                     
          | Clock Frequency      | 1.4 GHz                                  
          | SIMD Instruction Set | SSE4                                     
          | SIMD Width           | 128 bits                                 
          | L1 Cache Size (d)    |  32 KB                                   
          | L2 Cache Size        | 256 KB                                   
          | L3 Cache Size        |   3 MB                                   
==========o======================o==========================================
 OpenCL   | Device Name          | Intel(R) Core(TM) i5-4260U CPU @ 1.40GHz 
          | Driver Vendor        | Intel                                    
          | Platform ID          | 0                                        
          | Device ID            | 0                                        
          | Memory               | 4 GB                                     
          |----------------------|------------------------------------------
          | Device Name          | HD Graphics 5000                         
          | Driver Vendor        | Intel                                    
          | Platform ID          | 0                                        
          | Device ID            | 1                                        
          | Memory               | 1 GB                                     
==========o======================o==========================================
0: 1
1: 1
2: 1
3: 1
4: 1
openers.size() = 0
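
The cleanup described above (freeing initializer allocations in a global destructor) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not OCCA's actual code: `Opener`, `DefaultOpener`, `OpenerRegistry`, and `registry()` are hypothetical names, and only `registerFileOpeners` echoes a name from the valgrind trace.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a file-opener interface; not OCCA's real class.
class Opener {
public:
  virtual ~Opener() {}
};

class DefaultOpener : public Opener {};

// Owns the heap-allocated openers and frees them when it is destroyed.
class OpenerRegistry {
public:
  OpenerRegistry() = default;
  OpenerRegistry(const OpenerRegistry&) = delete;
  OpenerRegistry& operator=(const OpenerRegistry&) = delete;

  // Runs at program exit for a static instance, so nothing is left
  // "definitely lost" for valgrind to report.
  ~OpenerRegistry() { clear(); }

  void add(Opener *opener) { openers.push_back(opener); }

  // Delete every registered opener and empty the list.
  void clear() {
    for (std::size_t i = 0; i < openers.size(); ++i) {
      delete openers[i];
    }
    openers.clear();
  }

  std::size_t size() const { return openers.size(); }

private:
  std::vector<Opener*> openers;
};

// Function-local static: constructed on first use, destroyed at exit.
OpenerRegistry& registry() {
  static OpenerRegistry instance;
  return instance;
}

// Mimics an initializer that allocates openers at library load time.
void registerFileOpeners() {
  registry().add(new DefaultOpener());
  registry().add(new DefaultOpener());
}
```

Calling `registry().clear()` explicitly should leave `size()` at zero, which is the same kind of check as the `openers.size() = 0` line in the output above.
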
jedbrown commented 6 years ago

Thanks. As compared to OCCA 1.0 from a few days ago, the build fails with gcc (works with clang) and libceed serial OCCA tests now segfault (don't know if that is an OCCA or libceed problem, but they used to work).

#1  0x00007ffff70fe583 in occa::serial::kernel::runFromArguments(int, occa::kernelArg const*) const () from /home/jed/src/occa/lib/libocca.so
#2  0x00007ffff714cef3 in occa::kernel::runFromArguments() const () from /home/jed/src/occa/lib/libocca.so
#3  0x00007ffff7139b51 in occaKernelRunN () from /home/jed/src/occa/lib/libocca.so
#4  0x00007ffff7139c8d in occaKernelRun3 () from /home/jed/src/occa/lib/libocca.so
#5  0x00007ffff7bcad75 in CeedElemRestrictionApply_Occa (r=0x55555576beb0, tmode=CEED_NOTRANSPOSE, ncomp=1, lmode=CEED_NOTRANSPOSE, u=0x55555576bb40, v=0x555555ddb5a0, request=0x7ffff7dcf200 <ceed_request_immediate>) at /home/jed/src/libceed/backends/occa/ceed-occa-restrict.c:169
#6  0x00007ffff7bc641d in CeedElemRestrictionApply (r=0x55555576beb0, tmode=CEED_NOTRANSPOSE, ncomp=1, lmode=CEED_NOTRANSPOSE, u=0x55555576bb40, v=0x555555ddb5a0, request=0x7ffff7dcf200 <ceed_request_immediate>) at /home/jed/src/libceed/ceed-elemrestriction.c:116
#7  0x0000555555554df5 in main (argc=2, argv=0x7fffffffdb38) at /home/jed/src/libceed/tests/t05-elemrestriction.c:25   
dmed256 commented 6 years ago

Oh, the kernels are probably not being recompiled, which makes the binary loading faulty. I updated the parser version, which will force them to recompile, if you have time to try again.
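
One common way a version bump can force recompilation is to make the version part of the key used to look up cached kernel binaries. The sketch below is only an illustration of that idea, not OCCA's actual caching scheme; `cacheKey` and the version strings are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical cache key: the parser version is a literal prefix, so any
// version bump yields a different key and the stale binary is never reused.
std::string cacheKey(const std::string &source,
                     const std::string &parserVersion) {
  const std::size_t sourceHash = std::hash<std::string>()(source);
  return "v" + parserVersion + "-" + std::to_string(sourceHash);
}
```

With this layout, the same kernel source looked up under two different parser versions maps to two different cache entries, so the older binary is simply ignored rather than loaded.
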

jedbrown commented 6 years ago

No luck. Same error.

dmed256 commented 6 years ago

Oops again, clang hid an error and Travis CI (+ you) caught it. Fixing right now.

dmed256 commented 6 years ago

It was an error with types shadowing variable names when I was fixing styling issues :/

Fail with g++: https://travis-ci.org/libocca/occa/jobs/321051641
Pass with g++: https://travis-ci.org/libocca/occa/jobs/321057521
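
For readers unfamiliar with this class of error, here is a small sketch of how a name that shadows a type can compile with one compiler and fail with another. It is an assumption about the general pattern, not the actual OCCA code that broke; the `sketch` namespace and its `device`/`kernel` types are hypothetical.

```cpp
namespace sketch {
  struct device {
    int id;
  };

  // Problematic form, kept as a comment so this block compiles everywhere:
  //
  //   struct kernel {
  //     device device;   // member reuses the name of its own type
  //   };
  //
  // g++ typically rejects this with a "changes meaning" error, while some
  // clang versions accept it without a diagnostic.

  // Portable form: give the member a name distinct from the type.
  struct kernel {
    device dev;
    device getDevice() const { return dev; }
  };
}
```

Renaming the member (or the variable) so it no longer collides with the type name is usually enough to make both compilers agree.
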

dmed256 commented 6 years ago

I managed to get it compiling and running with g++. Let me know if you still have issues.

[~/git/ceed/examples/mfem]
> mcm
rm -f *~ ex1
rm -rf *.dSYM *.TVD.*breakpoints
/usr/local/bin/g++-6 -I../..  -O3 -I../../../mfem -I/Users/dsm5/git/night/include ex1.cpp -o ex1 -Wl,-rpath,/Users/dsm5/git/ceed -L../.. -lceed \
      -L../../../mfem -lmfem -L/Users/dsm5/git/night/lib -locca -Wl,-rpath,/Users/dsm5/git/night/lib -L/Users/dsm5/git/night/lib -locca
[~/git/ceed/examples/mfem]
> ./ex1
Options used:
   --ceed-spec /cpu/self
   --mesh ../../../mfem/data/star.mesh
   --order 1
   --visualization
Number of finite element unknowns: 20801
   Iteration :   0  (B r, r) = 0.000892612 ...
   Iteration :  18  (B r, r) = 6.77949e-16
Average reduction factor = 0.460626
L2 projection error: 7.34603e-06
jedbrown commented 6 years ago

/cpu/self is handled by the reference backend, so you'd need --ceed-spec /cpu/occa, for example.

Or just run make prove or build/t05-elemrestriction /cpu/occa.

dmed256 commented 6 years ago
> ./ex1  --ceed-spec '/cpu/occa'
Options used:
   --ceed-spec /cpu/occa
   --mesh ../../../mfem/data/star.mesh
   --order 1
   --visualization
Number of finite element unknowns: 20801
   Iteration :   0  (B r, r) = 0.000892612 ...
   Iteration :  18  (B r, r) = 6.77949e-16
Average reduction factor = 0.460626
L2 projection error: 7.34603e-06
jedbrown commented 6 years ago

Working now, thanks.

dmed256 commented 6 years ago

Awesome, thanks for the fast feedback loop!