tomeichlersmith / hdtree

columnar, ragged data with a dynamic, runtime-defined schema
https://tomeichlersmith.github.io/hdtree/

Leak during branch read #1

Closed tomeichlersmith closed 1 year ago

tomeichlersmith commented 1 year ago

The test runs and passes, but LeakSanitizer reports a leak at the end of processing.

=================================================================
==580455==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 15 byte(s) in 3 object(s) allocated from:
    #0 0x7f4fc3d75867 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x7f4fc31c750f  (/lib/x86_64-linux-gnu/libhdf5_serial.so.103+0x29d50f)

SUMMARY: AddressSanitizer: 15 byte(s) leaked in 3 allocation(s).

I have commented out various Branches, and I think the H5Objects (groups and datasets) are being leaked: the number of objects allocated and then leaked corresponds directly to the number of groups and datasets included in the read test. The good news is that the amount of memory leaked does not change with the number of entries being read, so I think it is tied to initialization rather than to the per-entry reads.

tomeichlersmith commented 1 year ago

This seems to be patched in HEAD of HighFive.

I modified one of the HighFive examples and it shows the leak on v2.6.2 but not on HEAD:

diff --git a/src/examples/read_write_dataset_string.cpp b/src/examples/read_write_dataset_string.cpp
index ec1813e..d0d046e 100644
--- a/src/examples/read_write_dataset_string.cpp
+++ b/src/examples/read_write_dataset_string.cpp
@@ -42,7 +42,14 @@ int main(void) {

         // let's write our vector of  string
         dataset.write(string_list);
+    } catch (Exception& err) {
+        // catch and print any HDF5 error
+        std::cerr << err.what() << std::endl;
+    }

+    try {
+        File file(FILE_NAME);
+        DataSet dataset = file.getDataSet(DATASET_NAME);
         // now we read it back
         std::vector<std::string> result_string_list;
         dataset.read(result_string_list);

# inside the HighFive repository
git checkout v2.6.2
# apply changes above
c++ -o read-v2.6.2 -fsanitize=address,undefined -Iinclude -I/usr/include/hdf5/serial/ src/examples/read_write_dataset_string.cpp -lhdf5_serial
git checkout master
c++ -o read-HEAD -fsanitize=address,undefined -Iinclude -I/usr/include/hdf5/serial/ src/examples/read_write_dataset_string.cpp -lhdf5_serial
./read-v2.6.2
# see leak error
./read-HEAD
# don't see leak

tomeichlersmith commented 1 year ago

Since this is a very small memory leak and seems to be patched in HighFive versions after v2.6.2, I am going to close this issue.
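
If the report gets in the way before a HighFive upgrade, one option is to suppress it. This is a minimal sketch, assuming the leak always originates in libhdf5_serial as in the trace above. The file name lsan.supp is arbitrary, and read-v2.6.2 is the binary from the reproduction. LeakSanitizer reads a suppression file passed through LSAN_OPTIONS and matches each `leak:` pattern against function, source file, or module names in the leak's stack trace.

```shell
# Write a suppression file; the pattern should match the module name in the trace.
# NOTE: this hides every leak whose stack passes through libhdf5_serial, not only this one.
cat > lsan.supp <<'EOF'
# known small leak during dataset read, seen with HighFive v2.6.2
leak:libhdf5_serial
EOF

# Run the reproduction with the suppression applied (skipped if the binary is absent).
if [ -x ./read-v2.6.2 ]; then
    LSAN_OPTIONS=suppressions=lsan.supp ./read-v2.6.2
fi
```

The suppression is deliberately broad, so it is better treated as a stopgap to be removed after moving to a HighFive version that includes the fix.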