Unidata / netcdf-c

Official GitHub repository for netCDF-C libraries and utilities.
BSD 3-Clause "New" or "Revised" License
511 stars 263 forks source link

Backward-incompatible nc_inq_var_filter() change in netCDF 4.7.4 #1693

Closed czender closed 2 years ago

czender commented 4 years ago

Hola compadres,

I am trying to identify and workaround a backward-incompatible behavior change in 4.7.4. Please correct me if I am wrong about the following, @DennisHeimbigner and others. As of 4.7.4, nc_inq_var_filter() called on a chunked and shuffled though not compressed variable in a netCDF4 dataset (CDL below) returns error -136, "NetCDF: Filter error: filter not defined for variable". Prior to 4.7.4, nc_inq_var_filter() returns NC_NOERR on this variable, and sets the filter ID to 0. This new behavior in 4.7.4 breaks the strategy that NCO uses to identify which filters are used on a variable. Currently NCO uses nc_inq_var_filter()on all chunked variables to assess whether any filters have been employed on them. If the returned filter ID != 0 then NCO delves deeper into the filter. As this example points out, that strategy no longer works on variables that are chunked and shuffled but not compressed. What is the recommended strategy to learn whether a variable has had a filter applied?

This also serves as notice of the backwards incompatibility. I am happy to modify NCO to make calls that depend on the netCDF library version if necessary, though I would prefer one strategy that works with all netCDF libraries (greater than 4.6.0, when nc_inq_var_filter() was introduced, in this particular case).

netcdf foo {
dimensions:
    lat = 2 ;
variables:
    float lat(lat) ;
        lat:units = "degrees_north" ;
        lat:_Storage = "chunked" ;
        lat:_ChunkSizes = 2 ;
        lat:_Shuffle = "true" ;
        lat:_Endianness = "little" ;

// global attributes:
        :history = "Wed Apr  8 10:17:42 2020: ncks -O -4 -L 0 --cnk_dmn lat,1 -v /lat /Users/zender/nco/data/in_grp.nc /Users/zender/foo.nc\nHistory global attribute.\n" ;
        :NCO = "netCDF Operators version 4.9.3-alpha05 (Homepage = http://nco.sf.net, Code = http://github.com/nco/nco)" ;
        :_NCProperties = "version=2,netcdf=4.7.4,hdf5=1.10.6" ;
        :_SuperblockVersion = 0 ;
        :_IsNetcdf4 = 1 ;
        :_Format = "netCDF-4" ;
data:

 lat = -90, 90 ;
}

Interrogating this file with NCO using netCDF 4.7.3:

zender@firn:~$ ncks -D 1 -C -m --trd --hdn -v /lat ~/foo.nc
lat: type NC_FLOAT, 1 dimension, 1 attribute, compressed? no, chunked? yes, packed? no
lat size (RAM) = 2*sizeof(NC_FLOAT) = 2*4 = 8 bytes
lat dimension 0: lat, size = 2 NC_FLOAT, chunksize = 2 (Coordinate is lat)
lat attribute 0: units, size = 13 NC_CHAR, value = degrees_north
lat attribute 1: _Storage, size = 7 NC_CHAR, value = chunked
lat attribute 2: _ChunkSizes, size = 1 NC_INT, value = 2
lat attribute 3: _Filter, size = 2 NC_CHAR, value = 0,
lat attribute 4: _Shuffle, size = 4 NC_CHAR, value = true
lat attribute 5: _Endianness, size = 6 NC_CHAR, value = little

Interrogate this file with NCO using netCDF 4.7.4:

zender@sastrugi:~$ ncks -D 1 -C -m --trd --hdn -v /lat ~/foo.nc
lat: type NC_FLOAT, 1 dimension, 1 attribute, compressed? no, chunked? yes, packed? no
lat size (RAM) = 2*sizeof(NC_FLOAT) = 2*4 = 8 bytes
lat dimension 0: lat, size = 2 NC_FLOAT, chunksize = 2 (Coordinate is lat)
nco_err_exit(): ERROR Short NCO-generated message (usually name of function that triggered error): nco_inq_var_filter()
nco_err_exit(): ERROR Error code is -136. Translation into English with nc_strerror(-136) is "NetCDF: Filter error: filter not defined for variable"
nco_err_exit(): ERROR NCO will now exit with system call exit(EXIT_FAILURE)
edwardhartnett commented 4 years ago

This is somewhat similar to #1691

In the case of nc_inq_var_deflate() and nc_inq_var_szip() the idea of supporting them for classic files was that zlib and szip can conceivably be used in other, future dispatch layers. Therefore it make sense to allow them to report no compression with classic files, instead of error.

Are filters the same?

I would argue yes.

I was working with the filters on the CCR project and they are actually very general. They just process a stream of bytes and yield a stream of bytes. It's easy to imagine a future dispatch layer that allowed the use of existing HDF5 filters, even if that dispatch layer doesn't use HDF5.

DennisHeimbigner commented 4 years ago

I now agree. I need to check how inq_var_filter functions operate.

edwardhartnett commented 4 years ago

I guess ideally instead of the NOTNC4 versions of the generic functions we provide for the dispacth layers, we should have a SHOWOFF version of the inq function which shows the values as off (i.e. no compression, no filter, etc.) and then dispatch layers that don't implement filters can use the SHOWIOFF versions instead of the NOTNC4 versions.

Or we could just change the NOITNC4 versions to do the same, and that would be easy.

DennisHeimbigner commented 4 years ago

Are there other features for which we should be doing this? Chunking, etc.

czender commented 4 years ago

FWIW, NCO uses wrappers that treats netCDF4 and netCDF3 datasets as similarly as possible, e.g.,

int nco_inq_var_chunking
(const int nc_id, /* [ID] netCDF ID */
 const int var_id, /* [ID] Variable ID */
 int * const srg_typ, /* [enm] Storage type */
 size_t * const cnk_sz) /* [nbr] Chunk sizes */
{
  /* Purpose: Wrapper for nc_inq_var_chunking() */
  /* NB: netCDF chunking inquire function only works on netCDF4 files
     NCO wrapper works on netCDF3 and netCDF4 files */
  int rcd;
  int fl_fmt; /* [enm] Input file format */
  rcd=nco_inq_format(nc_id,&fl_fmt);
  if(fl_fmt == NC_FORMAT_NETCDF4 || fl_fmt == NC_FORMAT_NETCDF4_CLASSIC){
    rcd=nc_inq_var_chunking(nc_id,var_id,srg_typ,cnk_sz);
  }else{ /* !netCDF4 */
    /* Defensive programming */
    *srg_typ=NC_CONTIGUOUS;
  } /* !netCDF4 */
  if(rcd != NC_NOERR) nco_err_exit(rcd,"nco_inq_var_chunking()");
  return rcd;
} /* end nco_inq_var_chunking() */

It's nice to be able to use the netCDF4 API and get reasonable answers when applied to netCDF3 files. Of course this exact approach may not be suitable for libnetcdf, just letting you know how some of userspace works around these issues.

edwardhartnett commented 4 years ago

OK, I have a test that demonstrates that nc_inq_var_chunking() works fine on all formats (returning NC_CONTIGUOUS for all classic formats, as it should).

I am going to leave nc_inq_var_filter() to @DennisHeimbigner since it is your code. This issue clearly involved nc_inq_var_filter().

In terms of implementing a SHOWOFF in addition to a NOTNC4 version of some dispatch functions, that is complicated by the fact that the classic dispatch table does not even include the netCDF-4 functions. I'm not sure how that works actually. Do we just count on the code to never call a netcdf-4 function for a classic file? I believe that is the answer.

In that case, I don't know how to handle nc_inq_var_deflate() using the dispatch table. I believe the answer is that nc3dispatch.h should be changed so classic files have a full dispatch table, instead of an abbreviated one, and that we use the NOTNC4 and SHOWOFF functions in that case. But that is beyond the scope of this issue, perhaps.

I will submit a PR which tests nc_inq_var_chunking(), demonstrating that it is working correctly.

DennisHeimbigner commented 4 years ago

Interesting. You may recall that that old issue about fixing the dispatch table had removing the netcdf4 dependency as a task. In fact I thought we had done this. I am going to create a PR to do this, then we can use the SHOWOFF (or perhaps IGNORE) dispatch table entries.

DennisHeimbigner commented 4 years ago

I take it back. Apparently this was done some time ago. So for example, the libsrc dispatch table has entries for e.g. def_var_deflate.

czender commented 4 years ago

FYI I just implmented an NCO workaround to this nc_inq_var_filter() backwards-incompatibility in netCDF 4.7.4 and in netCDF 4.8.0-development. I would be happy to remove the NCO workaround for 4.8.0-development should its behavior be reverted to the pre-4.7.4 behavior. Please LMK if/when the 4.8.0-development behavior of nc_inq_var_filter() changes.

edwardhartnett commented 4 years ago

If I'm not mistaken this behavior has been reverted in master.

edwardhartnett commented 2 years ago

I believe this issue can be closed.