Closed by GoogleCodeExporter 8 years ago
The netCDF C library uses a default fill value of 255 for uint8 (defined as
NC_FILL_UBYTE in netcdf.h). If you don't want a fill value, you must set the
fill_value keyword to False when you create the variable with the
createVariable Dataset method. Otherwise the netCDF C library will use the
default fill value, and you should only use the range 0-254 for scaling.
Original comment by whitaker.jeffrey@gmail.com
on 27 Nov 2013 at 4:47
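As a rough illustration of the 0-254 advice above, the sketch below computes
packing parameters that keep code 255 free for the default fill value. The
helper names (packing_params, pack) and the sample value range are made up for
this example; only scale_factor and add_offset are the conventional attribute
names.

```python
# Hypothetical helpers (not part of netCDF4-python) for packing floats into
# uint8 codes while leaving 255 (NC_FILL_UBYTE) unused.


def packing_params(vmin, vmax, max_code=254):
    """Return (scale_factor, add_offset) mapping [vmin, vmax] onto 0..max_code."""
    scale_factor = (vmax - vmin) / max_code
    add_offset = vmin
    return scale_factor, add_offset


def pack(value, scale_factor, add_offset):
    """Convert a physical value to its integer code."""
    return int(round((value - add_offset) / scale_factor))


# Assumed sample range: values between -10 and 40.
scale_factor, add_offset = packing_params(-10.0, 40.0)
codes = [pack(v, scale_factor, add_offset) for v in (-10.0, 15.0, 40.0)]
# The largest code produced is 254, so 255 never appears as valid data.
# If the variable is instead created with filling disabled, e.g.
#   ds.createVariable("temp", "u1", ("x",), fill_value=False)
# the full range 0..255 could be used by passing max_code=255.
```

Reserving one code costs very little precision and keeps the file readable by
tools that assume the default fill value.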
[deleted comment]
If you didn't create this dataset, then a workaround would be to call
var.set_auto_maskandscale(False), and then do the scaling manually.
Original comment by whitaker.jeffrey@gmail.com
on 27 Nov 2013 at 4:52
Yes, I hit the problem when reading the variable; I didn't create it.
For the moment I have implemented your workaround, but would it be possible to
modify the set_auto_maskandscale behavior so that it returns a masked array
only if the variable has a fill or missing value attribute? To me, a variable
that is scaled but has no fill value should be automatically scaled into a
plain numpy.ndarray, not into a masked array.
What do you think about that?
Original comment by DeM...@gmail.com
on 28 Nov 2013 at 9:01
Sounds reasonable, but... Technically speaking, every netcdf variable has a
_FillValue, since the library sets one by default. That is, unless
nc_def_var_fill was used to explicitly disable filling.
Ultimately, I think your data provider was wrong to provide a dataset with
valid data equal to the fill value. They should have disabled filling.
Original comment by whitaker.jeffrey@gmail.com
on 28 Nov 2013 at 3:47
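To make the "every variable has a default fill value" point concrete, here is
a small sketch listing the defaults from the NC_FILL_* macros in netcdf.h and
checking whether some data would collide with them. The dictionary name, the
NumPy-style type codes used as keys, and the helper function are illustrative
choices, not library API.

```python
# Default fill values from the NC_FILL_* macros in netcdf.h, keyed by
# NumPy-style type codes. Names below are illustrative only.
DEFAULT_FILL = {
    "i1": -127,                  # NC_FILL_BYTE
    "u1": 255,                   # NC_FILL_UBYTE
    "i2": -32767,                # NC_FILL_SHORT
    "u2": 65535,                 # NC_FILL_USHORT
    "i4": -2147483647,           # NC_FILL_INT
    "u4": 4294967295,            # NC_FILL_UINT
    "f4": 9.969209968386869e36,  # NC_FILL_FLOAT
    "f8": 9.969209968386869e36,  # NC_FILL_DOUBLE
}


def collides_with_default_fill(values, type_code):
    """Return True if any stored value equals the default fill for its type."""
    fill = DEFAULT_FILL[type_code]
    return any(v == fill for v in values)
```

A uint8 variable that stores 255 as valid data without disabling filling is
exactly the collision discussed in this thread.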
I just realized that I was not checking to see if filling was disabled before
masking data equal to the default _FillValue. This is now fixed. Can you try
updating from SVN? It's possible your data provider did disable filling, in
which case you should get the desired result now.
Original comment by whitaker.jeffrey@gmail.com
on 28 Nov 2013 at 3:50
OK, thanks for the explanation, which matches what I found in the NetCDF4
documentation; it is clearer to me now. I have been reading and writing NetCDF
files for a long time but never understood this point about the default fill
value. So a variable has a fill value even if it doesn't have an explicit
_FillValue attribute (and so by default you cannot use the full range of the
variable unless you set the fill mode). This does not seem very intuitive to
me, and I think a lot of files in the world don't follow this rule... but
anyway this is not your problem, because this is the NetCDF specification :-)
Note that it seems there is an exception for byte variables:
http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c/Fill-Values.html#Fill-Values
"If you need a fill value for a byte variable, it is recommended that you
explicitly define an appropriate _FillValue attribute, as generic utilities
such as ncdump will not assume a default fill value for byte variables."
Explained here too:
http://www.unidata.ucar.edu/software/netcdf/docs/known_problems.html#ncdump_ubyte_fill
"There should be no default fill values when reading any byte type, signed or
unsigned, because the byte ranges are too small to assume one of the values
should appear as a missing value unless a _FillValue attribute is set
explicitly."
I suppose you haven't implemented this exception, since my test was on a byte
variable?
Unfortunately we can't update all our data providers, and we have tons of
existing files that assume there is no fill value if the attribute is missing,
so for the moment I will stick to the workaround.
Thanks a lot for your great NetCDF4-Python library, it's very useful, and very
well designed!
Original comment by DeM...@gmail.com
on 29 Nov 2013 at 3:50
I forgot: is there an easy way to dump the fill mode information of each
variable with ncdump or with your ncinfo?
Original comment by DeM...@gmail.com
on 29 Nov 2013 at 3:57
I had not seen that exception for byte variables in the docs - thank you for
pointing that out. I have now implemented that exception in svn, so no default
fill_value is assumed for signed or unsigned byte dtypes.
ncdump does not print fill mode information. I just modified ncinfo so it will
print fill mode info when you do 'ncinfo -v <varname> <filename>'.
Original comment by whitaker.jeffrey@gmail.com
on 29 Nov 2013 at 4:33
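The rules that came out of this thread can be summarized as a small decision
function. This is only a sketch of the logic as described in the comments
above, not the actual netCDF4-python code; the function name, its parameters,
and the type codes are assumptions made for illustration.

```python
def should_mask_default_fill(type_code, has_explicit_fill, filling_enabled):
    """Decide whether values equal to the *default* fill value get masked.

    type_code:         NumPy-style code such as "u1", "i2", "f4"
    has_explicit_fill: True if the variable has a _FillValue attribute
    filling_enabled:   False if filling was turned off (nc_def_var_fill in C,
                       fill_value=False in createVariable)
    """
    if has_explicit_fill:
        # The explicit _FillValue is used, so the default is irrelevant.
        return False
    if not filling_enabled:
        # Filling was disabled, so no value is reserved.
        return False
    if type_code in ("i1", "u1"):
        # Byte exception from the netCDF docs: no default fill is assumed.
        return False
    return True
```

Under these rules a uint8 variable without a _FillValue attribute is never
masked on the default value 255, which is the behavior requested earlier in
the thread.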
Original comment by whitaker.jeffrey@gmail.com
on 26 Feb 2014 at 2:04
Original issue reported on code.google.com by
DeM...@gmail.com
on 27 Nov 2013 at 8:49