Unidata / netcdf-c

Official GitHub repository for netCDF-C libraries and utilities.
BSD 3-Clause "New" or "Revised" License

Segmentation violation during byte range reading of netcdf4 variable #3044

Open abhibaruah opened 3 weeks ago

abhibaruah commented 3 weeks ago

NetCDF version: v4.9.2 OS: Debian 12

I have a netCDF program that tries to read the variable 'water_temp' from a netCDF file hosted on data.hycom.org (the full URL is in the program below) using byte-range reading.

While executing the program, the process aborts (reported as a segmentation violation) during the call to 'nc_get_var_short' with the following message:

"a.out: /tmp/batserve/B3p1/glnxa64/netcdf/libsrc/httpio.c:261: httpio_get: Assertion `ncbyteslength(http->region) == extent' failed. Abort"

If I download the file to my local drive and open it from there, I do not see any errors and the variable is read correctly.

P.S.: The file is ~2 GB in size, and the variable is a 40 x 3251 x 4500 int16 variable. My guess is that the large size of the file and of the variable causes problems with byte-range reading.
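For reference, the size of the variable can be worked out from the dimensions quoted above. The following is a minimal sketch, not part of the original program; the helper name `variable_bytes` is hypothetical. It multiplies the dimension lengths by the element size and reports an error if the product would overflow `size_t`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute the in-memory size of a variable.
 * Returns 0 and stores the byte count in *out on success,
 * or -1 if the product would overflow size_t. */
int variable_bytes(const size_t *dimlen, int ndims, size_t elem_size,
                   size_t *out)
{
    size_t total = elem_size;
    for (int i = 0; i < ndims; i++) {
        /* Check before multiplying so the product never wraps around */
        if (dimlen[i] != 0 && total > SIZE_MAX / dimlen[i])
            return -1;
        total *= dimlen[i];
    }
    *out = total;
    return 0;
}
```

For dimensions 40 x 3251 x 4500 and 2-byte elements this gives 1,170,360,000 bytes (about 1.17 GB), so a single nc_get_var_short call asks for well over 1 GB of data from the remote file.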

#include <netcdf.h>
#include <stdio.h>
#include <stdlib.h>

int main() {
    //const char* filename = "GLBv0.08_53X_archMN.1994_01_ts3z.nc";
    const char* filename = "https://data.hycom.org/datasets/GLBv0.08/expt_53.X/meanstd/netcdf/GLBv0.08_53X_archMN.1994_01_ts3z.nc#mode=bytes";

    int ncid, varid;
    int retval;
    nc_type var_type;
    int ndims;
    int dimids[NC_MAX_DIMS];
    size_t dimlen[NC_MAX_DIMS];

    // Open the netCDF file for reading
    if ((retval = nc_open(filename, NC_NOWRITE, &ncid))) {
        fprintf(stderr, "Error: %s\n", nc_strerror(retval));
        return EXIT_FAILURE;
    }

    // Get the variable ID for 'water_temp'
    if ((retval = nc_inq_varid(ncid, "water_temp", &varid))) {
        fprintf(stderr, "Error: %s\n", nc_strerror(retval));
        nc_close(ncid);
        return EXIT_FAILURE;
    }

    // Get the variable type, number of dimensions, and dimension IDs
    if ((retval = nc_inq_var(ncid, varid, NULL, &var_type, &ndims, dimids, NULL))) {
        fprintf(stderr, "Error: %s\n", nc_strerror(retval));
        nc_close(ncid);
        return EXIT_FAILURE;
    }

    // Get the length of each dimension
    for (int i = 0; i < ndims; i++) {
        if ((retval = nc_inq_dimlen(ncid, dimids[i], &dimlen[i]))) {
            fprintf(stderr, "Error: %s\n", nc_strerror(retval));
            nc_close(ncid);
            return EXIT_FAILURE;
        }
    }

    // Calculate total size of the data
    size_t total_size = 1;
    for (int i = 0; i < ndims; i++) {
        total_size *= dimlen[i];
    }

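    printf("NC_SHORT! \n");
    // Allocate a buffer for the whole variable and check the allocation
    short *data = malloc(total_size * sizeof(short));
    if (data == NULL) {
        fprintf(stderr, "Error: could not allocate %zu bytes\n", total_size * sizeof(short));
        nc_close(ncid);
        return EXIT_FAILURE;
    }

    // Read the entire variable in one call
    int status = EXIT_SUCCESS;
    if ((retval = nc_get_var_short(ncid, varid, data))) {
        fprintf(stderr, "Error: %s\n", nc_strerror(retval));
        status = EXIT_FAILURE;
    }
    free(data);

    // Close the netCDF file
    if ((retval = nc_close(ncid))) {
        fprintf(stderr, "Error: %s\n", nc_strerror(retval));
        status = EXIT_FAILURE;
    }
    return status;
}

WardF commented 5 days ago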

I've been able to reproduce this and am taking a look at it now. The issue may indeed be the size of the variable. I'm experimenting with nc_get_vara_short() to try to establish consistent behavior. I will follow up.
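One way to experiment with nc_get_vara_short() is to read the variable one slab at a time along its first dimension instead of in a single call. The sketch below only builds the `start` and `count` arrays that such a call takes. The helper names `slab_request` and `slab_elements` are hypothetical, and this is an assumption about how a partial read could be arranged, not a confirmed workaround for the assertion.

```c
#include <stddef.h>

/* Hypothetical helper: describe the slab at index `slab` along dimension 0.
 * start/count must each have room for ndims entries.
 * Returns 0 on success, -1 if the slab index is out of range. */
int slab_request(const size_t *dimlen, int ndims, size_t slab,
                 size_t *start, size_t *count)
{
    if (ndims < 1 || slab >= dimlen[0])
        return -1;
    start[0] = slab;   /* one index along the first dimension */
    count[0] = 1;
    for (int i = 1; i < ndims; i++) {
        start[i] = 0;          /* full extent of every other dimension */
        count[i] = dimlen[i];
    }
    return 0;
}

/* Number of elements covered by a count array. */
size_t slab_elements(const size_t *count, int ndims)
{
    size_t n = 1;
    for (int i = 0; i < ndims; i++)
        n *= count[i];
    return n;
}

/* Intended use with netCDF (not compiled here):
 *   for (size_t k = 0; k < dimlen[0]; k++) {
 *       slab_request(dimlen, ndims, k, start, count);
 *       retval = nc_get_vara_short(ncid, varid, start, count,
 *                                  data + k * slab_elements(count, ndims));
 *   }
 */
```

With the 40 x 3251 x 4500 shape from the report, each slab holds 14,629,500 values (about 29 MB as int16), so every request is far smaller than the whole variable.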

WardF commented 4 days ago

Update: I'm seeing inconsistent behavior. Sometimes the read works using nc_get_vara_short(), and sometimes I see the issue you reported: the assertion is tripped and the program fails. Interestingly, it either fails immediately or works but takes nearly 40 minutes to complete (due to the size of the file). I wonder if there needs to be some failover error handling for this case, but I'll need to dig into the API to see what's actually happening before I know for sure what, if anything, we can do about it. @DennisHeimbigner any insight into this?
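To make the "failover error handling" idea concrete: the failing assertion compares the number of bytes received with the requested extent. A library could in principle return an error code in that situation instead of aborting. The sketch below is purely illustrative. The enum and the function `check_range_read` are hypothetical names, and this is not the code in httpio.c.

```c
#include <stddef.h>

/* Hypothetical status codes for a byte-range read. */
enum range_status {
    RANGE_OK = 0,
    RANGE_SHORT_READ = -1,  /* server returned fewer bytes than requested */
    RANGE_OVERRUN = -2      /* server returned more bytes than requested */
};

/* Compare bytes received with the requested extent and report a
 * recoverable status instead of asserting. */
enum range_status check_range_read(size_t received, size_t extent)
{
    if (received == extent)
        return RANGE_OK;
    return received < extent ? RANGE_SHORT_READ : RANGE_OVERRUN;
}
```

A caller that receives a non-OK status could then retry the request or surface an error through nc_strerror-style reporting, rather than the process aborting.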