HDFGroup / vol-cache

HDF5 Cache VOL connector for caching data on fast storage layers and moving data asynchronously to the parallel file system to hide I/O overhead.
https://vol-cache.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Valgrind reported leaks and seg fault #27

Open brtnfld opened 4 months ago

brtnfld commented 4 months ago

When running vol-log-based tests with vol-cache, Valgrind reports these leaks:

==2210== 272 bytes in 1 blocks are definitely lost in loss record 2,546 of 2,971
==2210==    at 0x483D611: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2210==    by 0x591A9C9: my_calloc (debug.c:151)
==2210==    by 0x5909A15: H5VL_cache_ext_info_copy (H5VLcache_ext.c:1184)
==2210==    by 0x4F6DA3A: H5VL_copy_connector_info (in /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000.0.0)
==2210==    by 0x4D4AF45: H5Pget_vol_info (in /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000.0.0)
==2210==    by 0x591054B: set_file_cache (H5VLcache_ext.c:4017)
==2210==    by 0x5910B7F: H5VL_cache_ext_file_create (H5VLcache_ext.c:4138)
==2210==    by 0x4F7CA57: H5VL__file_create (in /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000.0.0)
==2210==    by 0x4F7CC4F: H5VL_file_create (in /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000.0.0)
==2210==    by 0x4AA5910: H5F__create_api_common (in /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000.0.0)
==2210==    by 0x4AA5C32: H5Fcreate (in /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000.0.0)
==2210==    by 0x401125: main (attr.cpp:69)
==2210== 
==2210== 304 bytes in 1 blocks are definitely lost in loss record 2,614 of 2,971
==2210==    at 0x4838744: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2210==    by 0x591A994: my_malloc (debug.c:140)
==2210==    by 0x5918805: get_H5LS_mmap_class_t (H5LS.c:68)
==2210==    by 0x590A1E0: H5VL_cache_ext_str_to_info (H5VLcache_ext.c:1472)
==2210==    by 0x4F9A1FE: H5VL__connector_str_to_info (in /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000.0.0)
==2210==    by 0x4F94D46: H5VL__set_def_conn (in /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000.0.0)
==2210==    by 0x4F93FC1: H5VL_init_phase2 (in /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000.0.0)
==2210==    by 0x48AF85C: H5_init_library (in /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000.0.0)
==2210==    by 0x48B1369: H5open (in /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000.0.0)
==2210==    by 0x40103A: main (attr.cpp:44)
==2210== 
==2210== 352 bytes in 1 blocks are definitely lost in loss record 2,803 of 2,971
==2210==    at 0x4838744: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2210==    by 0x591A994: my_malloc (debug.c:140)
==2210==    by 0x590A178: H5VL_cache_ext_str_to_info (H5VLcache_ext.c:1468)
==2210==    by 0x4F9A1FE: H5VL__connector_str_to_info (in /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000.0.0)
==2210==    by 0x4F94D46: H5VL__set_def_conn (in /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000.0.0)
==2210==    by 0x4F93FC1: H5VL_init_phase2 (in /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000.0.0)
==2210==    by 0x48AF85C: H5_init_library (in /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000.0.0)
==2210==    by 0x48B1369: H5open (in /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000.0.0)
==2210==    by 0x40103A: main (attr.cpp:44)

I am also seeing a segfault in H5VL_cache_ext_term() when the library shuts down at exit:

#0  0x00007ffff6ef791c in H5VL_cache_ext_term () at /home/brtnfld/packages/vol-cache/src/H5VLcache_ext.c:1154
#1  0x00007ffff7dd73dc in H5VL__free_cls () from /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000
#2  0x00007ffff7a6fb28 in H5I__dec_ref () from /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000
#3  0x00007ffff7a6fce7 in H5I_dec_ref () from /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000
#4  0x00007ffff7dd8d30 in H5VL_conn_free () from /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000
#5  0x00007ffff7dd7110 in H5VL_term_package () from /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000
#6  0x00007ffff76f2aa4 in H5_term_library () from /home/brtnfld/packages/hdf5/buildc/hdf5/lib/libhdf5.so.1000
#7  0x00007ffff6bcdbd9 in __run_exit_handlers () from /lib64/libc.so.6
#8  0x00007ffff6bcdd6a in exit () from /lib64/libc.so.6
#9  0x00007ffff6bb5254 in __libc_start_main () from /lib64/libc.so.6
#10 0x0000000000400eda in _start () at ../sysdeps/x86_64/start.S:120

brtnfld commented 4 months ago

The segfault seems to originate in H5VL_cache_ext_str_to_info(). I'm not sure the following allocation needs to happen there; it could instead be done at the point where the node is actually needed:

  p->next = (H5LS_stack_t *)malloc(sizeof(H5LS_stack_t));
  p = p->next;
  p->next = NULL;

All the vol-log tests pass with this removed.
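
For illustration only, here is a minimal sketch of the "allocate at the point it is needed" idea. It assumes (this is not confirmed in the thread) that the trouble comes from a node being appended before there is anything to store in it, which leaves its other fields uninitialized after malloc. The names node_t, push_back, and free_all are hypothetical and do not reflect the real H5LS_stack_t layout or any vol-cache function.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a stack node. The real H5LS_stack_t
 * layout is not shown in this thread. */
typedef struct node {
    void        *payload; /* resource owned by the node, may be NULL */
    struct node *next;
} node_t;

/* Append a node only when the caller actually has something to store.
 * calloc zeroes every field, so no member is left indeterminate.
 * Returns the new tail, or NULL if the allocation failed. */
static node_t *push_back(node_t *tail, void *payload)
{
    node_t *n = calloc(1, sizeof *n);
    if (n == NULL)
        return NULL;
    n->payload = payload;
    if (tail != NULL)
        tail->next = n;
    return n;
}

/* Release every node and its payload. Returns the number of nodes
 * freed so callers can check that nothing was left behind. */
static int free_all(node_t *head)
{
    int freed = 0;
    while (head != NULL) {
        node_t *next = head->next;
        free(head->payload); /* free(NULL) is a no-op */
        free(head);
        head = next;
        ++freed;
    }
    return freed;
}
```

With this shape, a teardown routine can walk the list without ever touching an indeterminate pointer, and every allocation has a matching release, which is also what Valgrind checks for.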