Unidata / netcdf4-python

netcdf4-python: python/numpy interface to the netCDF C library
http://unidata.github.io/netcdf4-python
MIT License

2D string variable update causes HDF Error when rereading file #1366

Open robin-cls opened 1 month ago

robin-cls commented 1 month ago

Hello,

I encountered a problem when trying to save a string variable. The goal is to update part of a dataset, so I use a slice to select the relevant part of a 2D string table and then assign the new values. While this works well for integer and floating-point variables, the partial update of a string variable fails and raises an HDF error when the file is reread (see the image after the steps to reproduce).

The problem might be linked to the combination of the extensible dimension feature and the 2D case because:

Here are the steps to reproduce:

import netCDF4
import numpy as np

with netCDF4.Dataset('broken.nc', mode='w') as handler:
    handler.createDimension("dim_0", None)
    handler.createDimension("dim_1", 5)
    handler.createVariable('var_str', str, ('dim_0', 'dim_1'), fill_value='no_data')

    handler["var_str"][2:5, 1:4] = np.full((3, 3), fill_value='foo', dtype=object)

# Error appears when triggering a netcdf close. Something might be getting corrupted somewhere
with netCDF4.Dataset('broken.nc', mode='r') as handler:
    print(handler["var_str"][...])

[image: screenshot of the HDF error]

I work in a Conda environment installed on RHEL8 with: python=3.11, h5netcdf=1.2.0, libnetcdf=4.9.2, netcdf4=1.7.1

jswhit commented 1 month ago

since it works if you use fixed dimensions, it's likely a bug in the netcdf-c lib

pjpetersik commented 1 month ago

I recently got similar errors, too. They appeared when I attempted to upgrade the netcdf4 version to 1.7.1. Before, my code was running on version 1.5.8 without any errors.

jswhit commented 1 month ago

netcdf4-python 1.5.8 wheels used an earlier version of the C lib (nothing in the python interface for vlen str variables has changed)
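
To compare two installs, it may help to print which C library versions each one is actually linked against. The following is a minimal sketch, assuming the module-level attributes `__version__`, `__netcdf4libversion__` and `__hdf5libversion__` exposed by netCDF4; the helper name `netcdf_versions` is hypothetical and only used for illustration.

```python
import importlib


def netcdf_versions():
    """Return the netCDF4 version and the C library versions it reports.

    Hypothetical helper: every value is None if netCDF4 cannot be imported.
    """
    try:
        nc = importlib.import_module("netCDF4")
    except ImportError:
        return {"netCDF4": None, "netcdf-c": None, "hdf5": None}
    return {
        # Version of the Python package itself
        "netCDF4": getattr(nc, "__version__", None),
        # Version of the netcdf-c library the package is linked against
        "netcdf-c": getattr(nc, "__netcdf4libversion__", None),
        # Version of the HDF5 library the package is linked against
        "hdf5": getattr(nc, "__hdf5libversion__", None),
    }


if __name__ == "__main__":
    for name, version in netcdf_versions().items():
        print(f"{name}: {version}")
```

Running this in both the working and the failing environment should show whether the netcdf-c or HDF5 version differs between them.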

robin-cls commented 1 month ago

Should I reopen this issue in the netcdf-c repository instead?

jswhit commented 1 month ago

I think that would be a good idea - especially if you could translate your example into C and include that in the github issue.

jswhit commented 1 month ago

you don't have to close this issue - just link it to the one in the netcdf-c repo