HDFGroup / vol-cache

HDF5 Cache VOL connector for caching data on fast storage layers and moving data asynchronously to the parallel file system to hide I/O overhead.
https://vol-cache.readthedocs.io
BSD 3-Clause "New" or "Revised" License

valgrind errors when running test_dataset #16

Closed by yzanhua 9 months ago

yzanhua commented 1 year ago

Summary

When running the tests/test_dataset.cpp test case under valgrind, it reports several memory errors, including "Invalid read of size 4", "Invalid write of size 8", and memory leaks.

How to reproduce:

% echo $HDF5_VOL_CONNECTOR
cache_ext config=cache_1.cfg;under_vol=512;under_info={under_vol=0;under_info={}}

% make checkmem
HDF5_CACHE_WR=yes \
mpirun -n 2 \
valgrind --log-file=v-%p.txt --leak-check=full --track-origins=yes \
./test
****HDF5 Testing Dataset*****
=============================================
 Buf dim: 2048 x 2048
   nproc: 2
=============================================
Creating file parallel_file.h5
Creating group 0
Creating dataset dset_test
Writing dataset dset_test
------- EXT CACHE VOL DATASET Write
------- EXT CACHE VOL DATASET Write
Closing dataset dset_test
Closing group 0
Closing file parallel_file.h5
====================

Valgrind output files

Below are the valgrind output files from the two MPI processes: v-252365.txt and v-252366.txt

zhenghh04 commented 10 months ago

I found that this is caused by an HDF5_CACHE_WRITE_BUFFER_SIZE value that is too large. Please reduce it and try again.
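
For anyone hitting the same thing, here is a rough sketch of lowering the setting before rerunning `make checkmem`. It is not taken from the repository: it assumes the connector config uses `KEY: value` lines, it works on a throwaway file named `cache_demo.cfg` rather than the real `cache_1.cfg`, the starting values are made up, and the "4x the per-rank write" target is only a hypothetical choice. The per-rank size assumes the 2048 x 2048 buffer from the test output holds 4-byte ints.

```shell
# Throwaway demo config (assumed "KEY: value" format; values are illustrative only).
cfg=cache_demo.cfg
cat > "$cfg" <<'EOF'
HDF5_CACHE_STORAGE_SIZE: 137438953472
HDF5_CACHE_WRITE_BUFFER_SIZE: 17179869184
EOF

# Bytes written per rank, assuming a 2048 x 2048 buffer of 4-byte ints.
bytes_per_rank=$((2048 * 2048 * 4))

# Hypothetical target: a buffer a few times larger than one write.
new_size=$((bytes_per_rank * 4))

# Rewrite the setting in place; a backup is kept as cache_demo.cfg.bak.
sed -i.bak "s/^HDF5_CACHE_WRITE_BUFFER_SIZE:.*/HDF5_CACHE_WRITE_BUFFER_SIZE: ${new_size}/" "$cfg"

# Show the value that will be used.
grep '^HDF5_CACHE_WRITE_BUFFER_SIZE' "$cfg"
```

After applying the same change to the config named in HDF5_VOL_CONNECTOR, rerun `make checkmem` and compare the new v-*.txt logs with the earlier ones.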