Unidata / netcdf-c

Official GitHub repository for netCDF-C libraries and utilities.
BSD 3-Clause "New" or "Revised" License

SegV while opening a netcdf file twice (at the same time) and calling nc_get_var_string on a string variable #2510

Closed nvishnoiMW closed 2 years ago

nvishnoiMW commented 2 years ago

Hi everyone,

I am seeing a SegV with NetCDF 4.8.1 and HDF5 1.10.8 when opening the same file twice and reading a string variable through two different file handles. (The crash does not happen with NetCDF 4.8.1 and HDF5 1.8.12.) The attached program creates a .nc (NetCDF4) file and then tries to read data from it. I had to rename it from main.c to main.txt because otherwise I was unable to attach it.

I can see the crash on Windows and Linux. Stack trace on Windows from my application:

    [ 0] 0x00007ff83b968dcf H5F_addr_decode+00000111 at hdf5.dll+953807 (no debugging symbols found)
    [ 1] 0x00007ff83bb7f01b H5Tvlen_set_loc+00001243 at hdf5.dll+3141659 (no debugging symbols found)
    [ 2] 0x00007ff83bb72fee H5Tconv_vlen+00001454 at hdf5.dll+3092462 (no debugging symbols found)
    [ 3] 0x00007ff83bb05537 H5T_convert+00000391 at hdf5.dll+2643255 (no debugging symbols found)
    [ 4] 0x00007ff83b928826 H5D_get_create_plist+00001398 at hdf5.dll+690214 (no debugging symbols found)
    [ 5] 0x00007ff83b8f45ab H5Dget_create_plist+00000299 at hdf5.dll+476587 (no debugging symbols found)
    [ 6] 0x00007ff8470ddc14 nc4_H5Fopen+00000356 at netcdf.dll+384020 (no debugging symbols found)
    [ 7] 0x00007ff8470dbd18 NC4_put_vlen_element+00003000 at netcdf.dll+376088 (no debugging symbols found)
    [ 8] 0x00007ff8470dfd0c NC4_HDF5_inq_var_all+00000092 at netcdf.dll+392460 (no debugging symbols found)
    [ 9] 0x00007ff84708d79d nc_inq_var+00000189 at netcdf.dll+55197 (no debugging symbols found)

Can you please confirm whether this workflow is unsupported or whether this crash is indeed a bug?

Thanks, Nalini

main.txt

WardF commented 2 years ago

I can confirm that I'm able to replicate this issue with HDF5 1.10.8, using netCDF-C v4.8.1, v4.9.0, and the main development branch. I suspect this is an issue in HDF5 1.10.8, since the crash doesn't occur (as you observe) in the 1.8.x versions, nor does it occur when using HDF5 1.12.x.

DennisHeimbigner commented 2 years ago

I ran the test with valgrind on ubuntu21. Valgrind reported a read-after-free error somewhere deep in the HDF5 library.

WardF commented 2 years ago

This sounds like an issue that may need to be documented on our end, and raised with the HDF group. Thanks for reporting it @nvishnoiMW

nvishnoiMW commented 2 years ago

Thank you @WardF and @DennisHeimbigner for your super quick responses. I have raised an issue with the HDF group. Hopefully, it will be fixed soon!

Thanks, Nalini

nvishnoiMW commented 2 years ago

Hi @WardF and @DennisHeimbigner,

I just heard back from the HDF group asking us to try HDF5 1.10.9 (https://portal.hdfgroup.org/display/support/HDF5+1.10.9#files) to see if the issue could be reproduced. They suspect they have fixed something similar but can't recall the exact details. It will take a bit of time at my end to hook up NetCDF with HDF5 1.10.9 library but I wanted to mention the suggestion here in case it is easier for you to test the standalone with NetCDF and HDF5 1.10.9.

Many thanks, Nalini

DennisHeimbigner commented 2 years ago

There are some pure hdf5 test programs in netcdf-c/h5_test that you might be able to modify to create your program.

WardF commented 2 years ago

Thanks @nvishnoiMW, updating our test environments to use 1.10.9 is pretty straightforward, I'll give it a test.

WardF commented 2 years ago

So unfortunately I am seeing the same issue with 1.10.9. I've dug in a bit and run the test with AddressSanitizer enabled (via CFLAGS="-fsanitize=address -fno-omit-frame-pointer"), and I see the following:

wfisher@jellydev:~/Desktop/tmp$ ./a.out 
status after close = 0
=================================================================
==202172==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xffff8cc046f1 at pc 0xffff95b7aa8c bp 0xffffc484d4b0 sp 0xffffc484d508
WRITE of size 8 at 0xffff8cc046f1 thread T0
    #0 0xffff95b7aa88 in __interceptor_memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
    #1 0xffff9499cdc0 in H5D__scatter_mem (/home/wfisher/environments/local-ncmain-1.10.9-sanitized/lib/libhdf5.so.103+0x2ccdc0)
    #2 0xffff94978088 in H5D__fill (/home/wfisher/environments/local-ncmain-1.10.9-sanitized/lib/libhdf5.so.103+0x2a8088)
    #3 0xffff94993508 in H5D__read (/home/wfisher/environments/local-ncmain-1.10.9-sanitized/lib/libhdf5.so.103+0x2c3508)
    #4 0xffff94994728 in H5Dread (/home/wfisher/environments/loca

This at least confirms that the problem is down in the HDF5 library. Unfortunately, it also indicates that the bug is still present in the latest release of the HDF5 1.10.x line.

nvishnoiMW commented 2 years ago

Thank you @DennisHeimbigner and @WardF! @WardF - you are amazing! :) Thank you for confirming that the issue exists in HDF5 1.10.9. I have conveyed the same to the HDF group, and they have created a bug report, HDFFV-11335, for this crash.

Thank you once again for your help with this investigation. Nalini