cehbrecht closed this issue 8 months ago
The error is caused by this check: https://github.com/roocs/clisops/blob/0b94f66a2628d0ee63a7be8a786264da7175a742/clisops/ops/base_operation.py#L93-L96
This code was added with the regrid operator in release 0.12, so it is not an issue with the `cf_xarray` dependencies.
@sol1105 I have added a quick fix in PR #309. It converts the exception into a warning message. What would be a meaningful solution?
In #243 I updated your xarray `_FillValue` workaround (`_remove_redundant_fill_values`, issues #198 / #224). My changes cause this problem.
Since xarray adds `NaN` as `_FillValue` when no `_FillValue` is defined:
- if the source data has no `missing_value`/`_FillValue` attribute defined: set it to `None` so `xarray` would not set it to `NaN`
- if both are defined, check if they are the same (it is basically a data quality check that I never thought would trigger for C3S data).
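The two cases above can be sketched roughly as follows. This is only an illustration, not the actual clisops code: `check_fill_values` is a hypothetical name, and a plain dict stands in for a variable's encoding mapping.

```python
def check_fill_values(encoding):
    """Hypothetical helper mirroring the two cases described above.

    `encoding` is a plain dict standing in for a variable's encoding.
    """
    fill = encoding.get("_FillValue")
    missing = encoding.get("missing_value")

    if fill is None and missing is None:
        # Neither attribute is defined: set _FillValue to None explicitly
        # so that NaN is not added as a fill value on write.
        encoding["_FillValue"] = None
    elif fill is not None and missing is not None and fill != missing:
        # Both are defined but not strictly equal: the quality check fails.
        raise ValueError(
            f"_FillValue ({fill!r}) and missing_value ({missing!r}) differ"
        )
    return encoding
```

With a strict `!=` comparison, a fill value that was stored as a float and a missing value stored as a double already count as different, which is the situation described for this dataset.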
In the case of this dataset, `_FillValue` is defined as a `float` (as it should be), while `missing_value` is defined as `double`.
This is still a flaw in the source data, but none that should prevent processing.
One could use `numpy.isclose` in the check and set both the `missing_value` and `_FillValue` attributes to the same data type as the data.
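A minimal sketch of why the strict comparison trips and how `numpy.isclose` or a cast to the data's dtype would avoid it. The value `1e20` and the `float32` data type are assumptions chosen to match the float/double mismatch described above.

```python
import numpy as np

fill_value = np.float32(1e20)     # _FillValue stored as float
missing_value = np.float64(1e20)  # missing_value stored as double

# Strict comparison after promotion to double: the float32 rounding
# of 1e20 differs from the double value.
strictly_equal = bool(np.float64(fill_value) == missing_value)

# Tolerant comparison with numpy's default rtol/atol.
close_enough = bool(np.isclose(fill_value, missing_value))

# Cast both attributes to the (assumed) dtype of the data variable.
data_dtype = np.dtype("float32")
fill_cast = data_dtype.type(fill_value)
missing_cast = data_dtype.type(missing_value)
equal_after_cast = bool(fill_cast == missing_cast)

print(strictly_equal, close_enough, equal_after_cast)
```

Either approach treats the two attributes as the same fill value; casting has the extra benefit that both attributes end up with the data's own type.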
Update: When `xarray` opens such a dataset it prints a warning:
xarray/conventions.py:543: SerializationWarning: variable 'tas' has multiple fill values {1e+20, 1e+20}, decoding all values to NaN.
After looking into the `xarray` code, when writing out the data to disk:
- `xarray` casts the `_FillValue` / `missing_value` attributes to the same data type as the data
- `xarray` will fail encoding when the values of both attributes are "not close" (rtol=1e-5, atol=1e-8)

So I guess we can just set `if _FillValue != missing_value: _FillValue = missing_value`, issue a warning, and let xarray handle the dtype, since it properly decodes the missing values.
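The suggestion above could look roughly like this. It is a sketch under assumptions, not the code merged in PR #309: `harmonize_fill_values` is a hypothetical name, and a plain dict again stands in for the variable's encoding.

```python
import warnings


def harmonize_fill_values(encoding, var_name="tas"):
    """Hypothetical helper: align _FillValue with missing_value and warn."""
    fill = encoding.get("_FillValue")
    missing = encoding.get("missing_value")

    if fill is not None and missing is not None and fill != missing:
        warnings.warn(
            f"Variable '{var_name}': _FillValue ({fill!r}) and "
            f"missing_value ({missing!r}) differ; using missing_value for both.",
            UserWarning,
            stacklevel=2,
        )
        # Make both attributes identical; the dtype cast is left to xarray.
        encoding["_FillValue"] = missing
    return encoding
```

Because both attributes hold the same value afterwards, the "not close" condition that makes encoding fail cannot be triggered by them.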
fixed by PR #309
Description
I have run a subsetting operation on a CMIP6 dataset:
c3s-cmip6.CMIP.NCAR.CESM2-WACCM.historical.r1i1p1f1.day.tas.gn.v20190227
CLISOPS failed with the following error message:
This error does not happen for all datasets. Our tests didn't cover this case. I suppose the error is caused by the newer `xarray` and/or `cf_xarray` version used in clisops v0.12.0. This error does not appear in clisops v0.10.1, which is used by rook v0.11.0.
What I Did
I have reproduced the error with the following notebook:
https://nbviewer.org/github/roocs/rooki/blob/master/notebooks/errors/cds-error-2023-11-28-subset-cmip6.ipynb
It is using `rooki`:
An example file for testing can be this one: https://data.mips.copernicus-climate.eu/thredds/fileServer/esg_c3s-cmip6/CMIP/NCAR/CESM2-WACCM/historical/r1i1p1f1/day/tas/gn/v20190227/tas_day_CESM2-WACCM_historical_r1i1p1f1_gn_20000101-20091231.nc