Unidata / netcdf-c

Official GitHub repository for netCDF-C libraries and utilities.
BSD 3-Clause "New" or "Revised" License
511 stars 263 forks source link

nc_inq_dimlen reports incorrect length of unlimited dimension after variable data modified #2357

Closed jswhit closed 2 years ago

jswhit commented 2 years ago

See https://github.com/Unidata/netcdf4-python/issues/1166 which includes a simple test program reproduced here

#include <netcdf.h>
#include <stdio.h>
int main() {
    int i, iret, dimidx, dimidt, varid, ncid;
    int dimids[2];
    size_t start[2], count[2], dimlen;
    int data[10];
    iret = nc_create("test_issue1166.nc", NC_NETCDF4, &ncid);
    iret = nc_def_dim(ncid, "x", 10, &dimidx);
    iret = nc_def_dim(ncid, "t", NC_UNLIMITED, &dimidt);
    dimids[0] = dimidt;
    dimids[1] = dimidx;
    iret = nc_def_var(ncid, "v", NC_INT, 2, dimids, &varid);
    start[0]=0;
    start[1]=0;
    count[0]=1;
    count[1]=10;
    for (i = 0; i < 10; i++)
        data[i] = 1;
    iret = nc_put_vara_int(ncid, varid, start, count, data);
    start[0]=1;
    start[1]=0;
    count[0]=1;
    count[1]=10;
    for (i = 0; i < 10; i++)
        data[i] = 2;
    iret = nc_put_vara_int(ncid, varid, start, count, data);
    iret = nc_close(ncid);
    iret = nc_open("test_issue1166.nc", NC_WRITE | NC_NOCLOBBER, &ncid);
    iret = nc_inq_varid(ncid, "v", &varid);
    iret = nc_inq_dimid(ncid, "t", &dimidt);
    start[0]=0;
    start[1]=0;
    count[0]=1;
    count[1]=10;
    for (i = 0; i < 10; i++)
        data[i] = 0;
    iret = nc_put_vara_int(ncid, varid, start, count, data);
    iret = nc_inq_dimlen(ncid, dimidt, &dimlen);
    printf("dim length after write=%lu\n", dimlen);
    iret = nc_close(ncid);
}

with netcdf-c 4.8.1 the dimension length printed is 1, but with netcdf-c 4.7.4 the correct value of 2 is reported. Note that if the last nc_put_vara is commented out (the variable data is not modified), the correct value of 2 is printed.

edwardhartnett commented 2 years ago

OK, I have found the problem here, and there are two.

Firstly, in hdf5var.c we have:

                if (!zero_count && endindex >= dim->len)
                {
                    dim->len = endindex + 1;
                    dim->extended = NC_TRUE;
                }

This is incorrect because dim->len is always 0 for unlimited dimensions. Not until nc4_find_dim_len() is called can we know the length of an unlimited dimension. So instead of dim->len we need to use fdims[d2] which is the actual length of the variable's extent in the unlimited dimension.

Secondly we have this in hdf5internal.c:

#ifndef USE_PARALLEL        
        /* Shortcut for non-parallel operation: if the dim->len is
         * non-zero, it will be set to the correct size. */
        if (dim->len)
        *lenp = dim->len;
#endif

This shortcut only works in simple cases. In more complex cases it will be wrong. For example, in the test code in this issue, the problem is we re-open the file, and write along the unlimited dimension, but we re-write the very first element, rather than adding to the end of the array. If we added to the end, this works, but when we re-write an existing element which is not the last element, then this will be wrong. There are also other cases in which this is wrong, for example if we write one var to 10 along the unlimited dim, then write a different var to 2 along the unlimited dimension, the dim->len will equal 2 not 10.

In fact, I don't think we can take this shortcut at all. We simply always have to check the actual extent of each var along the unlimited dimension, in order to find the max value.

I have a PR to put up shortly. @WardF this probably should go in the 4.9.0 release.

Thanks very much to @jswhit for finding this, reporting it, and providing test code! As so many times in the past, your efforts benefit the entire netCDF scientific community!