klindsay28 opened 4 years ago
Looking at the two files, `SFCO2_OCN_atm_piControl_00_mon.nc` has time units of `days since 0001-01-01`, and `FG_CO2_ocn_piControl_00_mon.nc` has time units of `days since 0000-01-01`. I was under the impression that cftime didn't like having a year 0, but this appears to be a red herring, as
```python
import xarray as xr
import esmlab

dir = '/glade/work/klindsay/analysis/CESM2_coup_carb_cycle_JAMES/tseries'
# fname = 'SFCO2_OCN_atm_piControl_00_mon.nc'
fname = 'FG_CO2_ocn_piControl_00_mon.nc'
path = '/'.join([dir, fname])

# open without decoding, shift the time axis back one year,
# relabel the units so there is no year 0, then decode
ds = xr.open_dataset(path, decode_times=False)
ds.time.values = ds.time.values - 365
ds.time.attrs['units'] = "days since 0001-01-01"
ds = xr.decode_cf(ds)

ds_anom = esmlab.anomaly(ds, clim_freq='mon')
```
yields the same traceback:
```
/glade/work/mlevy/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/esmlab/core.py:291: RuntimeWarning: invalid value encountered in greater
  calendar=self.time_attrs['calendar'],
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/glade/work/mlevy/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/esmlab/core.py", line 730, in anomaly
    slice_mon_clim_time=slice_mon_clim_time
  File "/glade/work/mlevy/miniconda3/envs/cesm2-marbl/lib/python3.7/contextlib.py", line 74, in inner
    return func(*args, **kwds)
  File "/glade/work/mlevy/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/esmlab/core.py", line 515, in compute_mon_anomaly
    return self.restore_dataset(computed_dset, attrs=attrs)
  File "/glade/work/mlevy/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/esmlab/core.py", line 291, in restore_dataset
    calendar=self.time_attrs['calendar'],
  File "cftime/_cftime.pyx", line 320, in cftime._cftime.num2date
  File "cftime/_cftime.pyx", line 869, in cftime._cftime.utime.num2date
  File "cftime/_cftime.pyx", line 564, in cftime._cftime.DateFromJulianDay
  File "cftime/_cftime.pyx", line 1378, in cftime._cftime.DatetimeNoLeap.__init__
  File "cftime/_cftime.pyx", line 1637, in cftime._cftime.assert_valid_date
ValueError: invalid second provided in cftime.DatetimeNoLeap(-5883517, 2, 27, 0, 0, -2147483648, -2147483648, 5, 58)
```
What's odd is that I can't find the index of `ds.time` corresponding to `cftime.DatetimeNoLeap(-5883517, 2, 27, 0, 0, -2147483648, -2147483648, 5, 58)`:
```python
for i in range(14400):
    sec = ds.time.values[i].second
    if sec != 0:
        print(f'{i}: {ds.time.values[i]}')
```

outputs

```
0: 0001-01-16 12:59:59
```
So I guess an internal esmlab computation is creating a bad time axis?

If I change `fname` to `fname = 'TOTECOSYSC_lnd_esm-piControl_00_mon.nc'`, I see the same error, and `time:units` in this file is `days since 0001-01-01`.

So, @mnlevy1981, I also think there is a problem in esmlab.
The `-2147483648` values lead me to suspect that something isn't getting initialized. I am not able to follow the esmlab code, so I don't know where the problem resides.
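For what it's worth, -2147483648 is exactly -2**31, the smallest value of a signed 32-bit integer (C's `INT_MIN`), a common sentinel left behind when a C integer field is never assigned. A quick check in plain Python (this says nothing about *where* in esmlab or cftime the field is left unset):

```python
# -2147483648 == -2**31, the smallest signed 32-bit integer (C INT_MIN).
# Seeing it in the second/microsecond slots of the DatetimeNoLeap call
# is consistent with an uninitialized 32-bit integer field.
INT32_MIN = -2**31
print(INT32_MIN)  # -2147483648
```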
I think @andersy005 is going to need to look into this one, but I just want to point out that your block of code runs fine if you do not decode time when opening the dataset:
```python
import xarray as xr
import esmlab

dir = '/glade/work/klindsay/analysis/CESM2_coup_carb_cycle_JAMES/tseries'
fname = 'SFCO2_OCN_atm_piControl_00_mon.nc'
fname = 'FG_CO2_ocn_piControl_00_mon.nc'
path = '/'.join([dir, fname])

# leave time undecoded; esmlab works with the raw numeric time axis
ds = xr.open_dataset(path, decode_times=False)
ds_anom = esmlab.anomaly(ds, clim_freq='mon')
```
That might be a useful workaround until this issue can be resolved.
Thanks for the workaround @mnlevy1981.

Another is to replace the problematic `time` coordinate with a non-problematic `time` coordinate. This assumes you have a non-problematic `time` coordinate, which fortunately I appear to have in my post-processed atm files. In particular, the following runs without error:
```python
import xarray as xr
import esmlab

def repl_coord(coordname, ds1, ds2, deep=False):
    """
    Return a copy of ds2 with the coordinate coordname replaced,
    using coordname from ds1.
    The returned Dataset is otherwise a copy of ds2.
    The copy is deep or not depending on the argument deep.
    """
    if 'bounds' in ds2[coordname].attrs:
        tb_name = ds2[coordname].attrs['bounds']
        ds_out = ds2.drop(tb_name).copy(deep)
    else:
        ds_out = ds2.copy(deep)
    ds_out[coordname] = ds1[coordname]
    if 'bounds' in ds1[coordname].attrs:
        tb_name = ds1[coordname].attrs['bounds']
        ds_out[tb_name] = ds1[tb_name]
    return ds_out

dir = '/glade/work/klindsay/analysis/CESM2_coup_carb_cycle_JAMES/tseries'

fname = 'SFCO2_OCN_atm_piControl_00_mon.nc'
path = '/'.join([dir, fname])
ds1 = xr.open_dataset(path)

fname = 'TOTECOSYSC_lnd_piControl_00_mon.nc'
fname = 'FG_CO2_ocn_piControl_00_mon.nc'
path = '/'.join([dir, fname])
ds2 = xr.open_dataset(path)
ds2 = repl_coord('time', ds1, xr.open_dataset(path))

ds2_anom = esmlab.anomaly(ds2, clim_freq='mon')
```
I got bit by this again, in a workflow where there wasn't a convenient good `time` coordinate to replace the bad one with. It's also not convenient to add `decode_times=False` to the `open_dataset`/`open_mfdataset` calls.

I found that modifying the `time` values by a small amount, less than `1.0e-10`, can avoid the problem when it arises. So I put a kludge in my wrapper to `esmlab.anomaly` to change the time values by small amounts if they are close to known problematic values. It's very kludgy.
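A minimal sketch of such a nudge, assuming you keep a hand-maintained list of raw time values known to trigger the error (the names `PROBLEM_TIMES` and `nudge_times`, and the value in the list, are illustrative, not part of esmlab):

```python
import numpy as np

# Hypothetical list of raw "days since ..." values observed to trigger
# the cftime ValueError in a particular workflow.
PROBLEM_TIMES = [380.0]

def nudge_times(times, eps=1.0e-10, tol=1.0e-6):
    """Shift any time value within tol of a known problematic value
    by eps; all other values pass through unchanged."""
    times = np.array(times, dtype=float)
    for bad in PROBLEM_TIMES:
        times[np.abs(times - bad) < tol] += eps
    return times
```

Because only values already within `tol` of a known offender are touched, and the shift is below `1.0e-10` days, the decoded dates move by well under a millisecond.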
Any chance that this will get looked at in the near future?
When I run the commands in the following code snippet, I get a ValueError exception. The traceback is below the code snippet. The values listed in the `cftime.DatetimeNoLeap` call in the traceback are fairly suspicious, as if something were uninitialized. The call to `esmlab.anomaly` works fine if I comment out the 2nd `fname` setting and use a timeseries from CESM2's atm component. I don't see a smoking-gun difference between the file that triggers the exception and the file that doesn't.

Code Snippet

Traceback

Output of `esmlab.show_versions()`