Closed barbarapirscher closed 2 years ago
Thanks for the report. I can confirm the problem with your script as well as with this simple test:
import netCDF4
import numpy as np
# Create a file with an unlimited dimension 't' and write two records.
nc = netCDF4.Dataset('test_issue1166.nc','w')
t = nc.createDimension('t',None)
x = nc.createDimension('x',100)
v = nc.createVariable('v',np.float64,('t','x'))
v[0,:] = np.ones(100)
v[1,:] = 2*np.ones(100)
nc.close()
# Re-open for modification and overwrite the first record.
nc = netCDF4.Dataset('test_issue1166.nc','r+')
t = nc.dimensions['t']
v = nc.variables['v']
print(len(t))
v[0,:] = np.zeros(100)
print(len(t))
nc.close()
# Re-open read-only and check the dimension length stored in the file.
nc = netCDF4.Dataset('test_issue1166.nc')
t = nc.dimensions['t']
print(len(t))
nc.close()
2
1
2
The data in the file appears to be correct: if you close and re-open the file after modifying the variable data, the dimension length is correct. However, the dimension length is reported incorrectly immediately after the variable is modified.
This appears to be an issue with the underlying C lib. Here's a simple C test program to illustrate the bug:
#include <netcdf.h>
#include <stdio.h>
int main() {
    int i, iret, dimidx, dimidt, varid, ncid;
    int dimids[2];
    size_t start[2], count[2], dimlen;
    int data[10];
    /* Create a file with a fixed dimension x and an unlimited dimension t. */
    iret = nc_create("test_issue1166.nc", NC_NETCDF4, &ncid);
    iret = nc_def_dim(ncid, "x", 10, &dimidx);
    iret = nc_def_dim(ncid, "t", NC_UNLIMITED, &dimidt);
    dimids[0] = dimidt;
    dimids[1] = dimidx;
    iret = nc_def_var(ncid, "v", NC_INT, 2, dimids, &varid);
    /* Write the first record (all 1s). */
    start[0]=0;
    start[1]=0;
    count[0]=1;
    count[1]=10;
    for (i = 0; i < 10; i++)
        data[i] = 1;
    iret = nc_put_vara_int(ncid, varid, start, count, data);
    /* Write the second record (all 2s); t now has length 2. */
    start[0]=1;
    start[1]=0;
    count[0]=1;
    count[1]=10;
    for (i = 0; i < 10; i++)
        data[i] = 2;
    iret = nc_put_vara_int(ncid, varid, start, count, data);
    iret = nc_close(ncid);
    /* Re-open the file for writing and overwrite the first record with 0s. */
    iret = nc_open("test_issue1166.nc", NC_WRITE | NC_NOCLOBBER, &ncid);
    iret = nc_inq_varid(ncid, "v", &varid);
    iret = nc_inq_dimid(ncid, "t", &dimidt);
    start[0]=0;
    start[1]=0;
    count[0]=1;
    count[1]=10;
    for (i = 0; i < 10; i++)
        data[i] = 0;
    iret = nc_put_vara_int(ncid, varid, start, count, data);
    /* Query the unlimited dimension; the expected length is 2. */
    iret = nc_inq_dimlen(ncid, dimidt, &dimlen);
    printf("dim length after write=%zu\n", dimlen);
    iret = nc_close(ncid);
    return 0;
}
With the latest version of netcdf-c (4.8.1), running the test program yields:
dim length after write=1
while ncdump on the file shows the dimension has length 2. Running the test program with netcdf-c 4.7.4 produces the correct answer (2).
I suspect you are getting different answers with different versions of the Python interface because they are linked against different versions of the C library.
This should now be fixed in netcdf-c 4.9.0 (which the netcdf4-python 1.6.0 wheels use).
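To see which C library a given netcdf4-python install is linked against, something like the sketch below may help. It reads `netCDF4.__netcdf4libversion__` and compares it with the versions mentioned in this thread (4.7.4 correct, 4.8.1 wrong, 4.9.0 fixed). The helper names `parse_version` and `may_be_affected` are made up for illustration, and treating every release between 4.7.4 and 4.9.0 as affected is an assumption: only 4.8.1 was actually tested here.

```python
import re

# Versions reported in this thread: 4.7.4 gave the correct length,
# 4.8.1 gave the wrong one, and 4.9.0 is said to contain the fix.
LAST_GOOD = (4, 7, 4)
FIRST_FIXED = (4, 9, 0)


def parse_version(text):
    """Extract (major, minor, patch) from a string such as '4.8.1'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def may_be_affected(libversion):
    """Return True if the netcdf-c version lies strictly between the last
    known-good and the first fixed release.

    Assumption: the whole range is treated as suspect, although only
    4.8.1 was confirmed to show the bug.
    """
    return LAST_GOOD < parse_version(libversion) < FIRST_FIXED


if __name__ == "__main__":
    try:
        import netCDF4
    except ImportError:
        print("netCDF4 is not installed")
    else:
        libversion = netCDF4.__netcdf4libversion__
        print("netcdf4-python:", netCDF4.__version__)
        print("netcdf-c:", libversion)
        print("may be affected:", may_be_affected(libversion))
```

If this reports a possibly affected library, closing and re-opening the file before reading the dimension length (as in the test above) should give the value stored on disk.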
Version: netCDF-4 python, versions 1.5.7 and 1.5.8. The code works correctly for versions 1.5.4 to 1.5.6.
Environment: Python 3.9, numpy version 1.21.6
Description: Editing the content of a dataset variable changes the length of the unlimited dimension. I attached the (tarred) netCDF file in which I observed the problem.
Code:
import numpy as np
from netCDF4 import Dataset
filename = 'wrfbdy_d01__sel'
var_name = 'DUST_1_BXS'
data = Dataset(filename, 'r+')
print(data.dimensions['Time'])
# --> <class 'netCDF4._netCDF4.Dimension'> (unlimited): name = 'Time', size = 2 for all netCDF-4 python versions
modif_var = data.variables[var_name]
increment = np.ones(modif_var[0, ...].shape) * 0.2
data.variables[var_name][0, ...] = modif_var[0, ...] + increment
print(data.dimensions['Time'])
# --> <class 'netCDF4._netCDF4.Dimension'> (unlimited): name = 'Time', size = 1 for netCDF4 versions 1.5.7 and 1.5.8
# --> <class 'netCDF4._netCDF4.Dimension'> (unlimited): name = 'Time', size = 2 for netCDF4 versions 1.5.4 to 1.5.6
Attachment: wrfbdy_d01__sel.tar.gz