Open joleenf opened 3 months ago
I'm not sure I agree that there should be no _FillValue. The argument in this xarray issue (https://github.com/pydata/xarray/issues/1865), if I remember correctly, is more about matching the CF standard. In CF, a coordinate variable is a 1D variable whose name matches the name of its dimension. As we discussed on Slack, it makes sense (as mentioned in the xarray issue about the CF standard) that you can't have fill values in a 1D coordinate variable: a pixel of data can't have a "location" of (NaN, NaN). BUT I don't think our 2D lon/lats are technically coordinate variables in CF-land, at least as far as missing values are concerned. This section of the CF docs makes me think that missing values are expected:
http://cfconventions.org/cf-conventions/v1.6.0/cf-conventions.html#reduced-horizontal-grid
Storing this type of gridded data in two-dimensional arrays wastes space, and results in the presence of missing values in the 2D coordinate variables.
So on this page:
http://cfconventions.org/cf-conventions/cf-conventions.html#terminology
auxiliary coordinate variable
Any netCDF variable that contains coordinate data, but is not a coordinate variable (in the sense of that term defined by the NUG and used by this standard - see below). Unlike coordinate variables, there is no relationship between the name of an auxiliary coordinate variable and the name(s) of its dimension(s).
Which is used:
http://cfconventions.org/cf-conventions/cf-conventions.html#missing-data
Missing data is allowed in data variables and auxiliary coordinate variables. Generic applications should treat the data as missing where any auxiliary coordinate variables have missing values; special-purpose applications might be able to make use of the data. Missing data is not allowed in coordinate variables.
So if our 2D lon/lats are considered "auxiliary", then it's fine for them to have a _FillValue in CF.
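To make that distinction concrete, here is a minimal sketch of the rule as I read it from the CF text quoted above. The function names are made up for this comment and are not part of xarray or the cf_writer; the only inputs are a variable's name and its dimension names.

```python
def is_coordinate_variable(name, dims):
    """Return True for a coordinate variable in the CF/NUG sense:
    a 1D variable whose name equals the name of its single dimension."""
    return len(dims) == 1 and dims[0] == name


def may_have_missing_data(name, dims):
    """Per the CF 'missing data' section quoted above, missing values are
    allowed in data variables and auxiliary coordinate variables, but not
    in coordinate variables."""
    return not is_coordinate_variable(name, dims)


# A 1D "x" variable on dimension "x" is a coordinate variable.
print(may_have_missing_data("x", ("x",)))
# 2D lon/lat arrays on ("y", "x") are auxiliary coordinate variables.
print(may_have_missing_data("longitude", ("y", "x")))
```

Under this reading, only the 1D x/y variables are barred from carrying a _FillValue; the 2D lon/lat arrays are not.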
Anyway, my opinion is that the 2D lon/lat arrays should have a _FillValue or, at the very least, a valid_range.
The cf_writer adds a _FillValue to the final netCDF output for the lat/lon coordinates. In this case, adapted from an abi_fixed_grid, the dims are ("y", "x") while the coords are ("latitude", "longitude"). Even so, the coordinates should not contain a _FillValue when written to the netCDF file. See https://github.com/pydata/xarray/issues/1598
Reproduce:
Though the original attributes do not contain a _FillValue, the resulting netCDF does. The code above prints:
but an ncdump -h on test.nc shows the addition of the _FillValue:
However, it is possible to pass encoding to save_datasets to save the lat/lon without a _FillValue:
scn.save_datasets(filename="test.nc", encoding={"latitude": {"_FillValue": None}, "longitude": {"_FillValue": None}})
It seems that this encoding should be applied somewhere inside save_datasets when writing netCDF data, rather than having to be typed explicitly by the user.
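As a rough sketch of what I mean (this is not how the writer currently works; coord_encoding is a hypothetical helper name), the writer could build the default encoding for the coordinate names itself and then layer any user-supplied encoding on top:

```python
def coord_encoding(coord_names, user_encoding=None):
    """Build an encoding dict that disables _FillValue for each coordinate.

    coord_names: names of the coordinate variables, e.g. ["latitude", "longitude"].
    user_encoding: optional encoding passed by the caller; its settings
    take precedence over the defaults added here.
    """
    # Default: no _FillValue on any coordinate.
    encoding = {name: {"_FillValue": None} for name in coord_names}
    # Merge user settings per variable so explicit choices always win.
    for name, settings in (user_encoding or {}).items():
        encoding.setdefault(name, {}).update(settings)
    return encoding


encoding = coord_encoding(
    ["latitude", "longitude"],
    {"latitude": {"dtype": "float32"}},
)
print(encoding)
```

The resulting dict has the same shape as the encoding argument typed by hand in the workaround above, so it could be passed along unchanged; a user who does want a _FillValue on a coordinate could still set one explicitly.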