Open djkirkham opened 7 years ago
After searching further I found this comment https://github.com/Unidata/netcdf4-python/issues/468#issuecomment-145873369, which suggests that variables can have a property to prevent the data being masked if the data matches the default fill value. I've discovered this line in libnetcdf which appears to indicate that in netCDF3 format this property is always enabled. Is this correct?
It does look like nc_inq_var_fill always returns no_fill=1 for NETCDF3_CLASSIC and NETCDF3_64BIT files in version 4.5.1. I'm pretty sure it wasn't always that way. Maybe @WardF or @DennisHeimbigner can shed some light. In your example,
>>> print v
<type 'netCDF4._netCDF4.Variable'>
int32 x(x)
unlimited dimensions:
current shape = (3,)
filling off
Using ds.set_fill_on() doesn't seem to help - the no_fill flag is still set. For NETCDF4 and NETCDF4_CLASSIC files, the no_fill flag is not set.
Reading the docs more closely, it seems that nc_inq_var_fill only works for NETCDF4 and NETCDF4_CLASSIC files. Although it is possible to turn filling on and off using nc_set_fill for NETCDF3 files, I don't yet see how to determine whether filling is enabled. Right now the python interface is incorrectly assuming that filling is always off for NETCDF3 files, since nc_inq_var_fill always returns no_fill=1.
Pull request #726 implements a workaround - it assumes all NETCDF3 files have filling turned on when deciding whether to use _FillValue to return masked arrays. A true fix for this should be in netcdf-c 4.5.1 - right now the workaround is wrapped with a version < 4.5.1 if test.
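The decision the workaround makes can be sketched roughly as below. This is only an illustration of the logic described in this comment, not the code in the pull request: the function name, its parameters and the FIXED_IN constant are hypothetical, and the version gate simply mirrors the "version < 4.5.1" test mentioned above.

```python
# Hypothetical sketch of the workaround's decision, not the actual PR code.
FIXED_IN = (4, 5, 1)  # assumed netcdf-c version carrying the true fix


def assume_filling_on(data_model, no_fill_flag, lib_version):
    """Return True if default fill values should be treated as missing.

    data_model   -- file format string, e.g. "NETCDF3_CLASSIC" or "NETCDF4"
    no_fill_flag -- the no_fill value reported by nc_inq_var_fill
    lib_version  -- netcdf-c version as a tuple, e.g. (4, 5, 0)
    """
    if data_model.startswith("NETCDF3") and lib_version < FIXED_IN:
        # The reported flag is not trusted for NETCDF3 files here,
        # so filling is assumed to be on.
        return True
    # Otherwise trust what the library reports.
    return not no_fill_flag


print(assume_filling_on("NETCDF3_CLASSIC", True, (4, 5, 0)))
print(assume_filling_on("NETCDF4", True, (4, 5, 0)))
```

The trade-off raised later in the thread is visible in the first branch: a NETCDF3 variable that really was written with filling off would still be treated as filled.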
@djkirkham, if this is an acceptable short-term fix I'll go ahead and merge.
@jswhit Thanks for responding so quickly to this, but personally I'd prefer to wait for the true fix in netcdf-c rather than merging this. As a bit of background, I'm working on a library which allows users to read and analyse files in netCDF format (as well as others). I discovered this issue due to failures in some of our automated tests which were writing and then reading a netCDF3 classic file, and failing because of the lack of a mask. We've now patched those tests to set the fill value explicitly as a workaround. So this issue isn't really blocking us.
Although I suspect our user base is more likely to use netCDF3 files without _NoFill than those with, I'm wary about merging a change which essentially shifts the bug elsewhere. Additionally, currently it's possible to get around the issue by setting the fill value explicitly on write. If this change is merged there will be no equivalent workaround for writing variables with _NoFill if they are to be read with netCDF4-python.
I see your point. I guess it's a matter of whether the bug we are swapping in is less onerous than the one that exists.
Accessing data from a variable with no explicit _FillValue in a netCDF3 Dataset returns a filled ndarray rather than a masked array. Is this intended behaviour? Calling ncdump on the dataset shows the points as masked.