NCAR / esmlab

Earth System Model Lab (esmlab). ⚠️⚠️ ESMLab functionality has been moved into <https://github.com/NCAR/geocat-comp>. ⚠️⚠️
https://esmlab.readthedocs.io
Apache License 2.0
24 stars 8 forks source link

resample returns meaningless dates #127

Closed matt-long closed 5 years ago

matt-long commented 5 years ago

Resampling daily to monthly data using resample returns an erroneous time axis and time_bounds.

For instance:

case = 'b.e21.BWHIST.f09_g17.CMIP6-historical-WACCM.001'
droot = f'/glade/p/cgd/tss/people/oleson/{case}/cpl/hist.mon'
files = sorted(glob(f'{droot}/{case}.cpl.ha2x1d.2000-??.nc'))
ds = xr.open_mfdataset(files)
dsm = esmlab.resample(ds, freq='mon')

The original dataset was a year of daily data

ds.time

<xarray.DataArray 'time' (time: 365)>
array([cftime.DatetimeNoLeap(2000, 1, 1, 12, 0, 0, 0, 5, 1),
       cftime.DatetimeNoLeap(2000, 1, 2, 12, 0, 0, 0, 6, 2),
       cftime.DatetimeNoLeap(2000, 1, 3, 12, 0, 0, 0, 0, 3), ...,
       cftime.DatetimeNoLeap(2000, 12, 29, 12, 0, 0, 0, 3, 363),
       cftime.DatetimeNoLeap(2000, 12, 30, 12, 0, 0, 0, 4, 364),
       cftime.DatetimeNoLeap(2000, 12, 31, 12, 0, 0, 0, 5, 365)], dtype=object)
Coordinates:
  * time     (time) object 2000-01-01 12:00:00 ... 2000-12-31 12:00:00
Attributes:
    bounds:   time_bnds

The monthly dataset mysteriously begins on Jan 2 1850 and appears to be daily data, but is in fact monthly.

dsm.time

<xarray.DataArray 'time' (time: 12)>
array([cftime.DatetimeNoLeap(1850, 1, 2, 0, 0, 0, 0, 3, 2),
       cftime.DatetimeNoLeap(1850, 1, 3, 0, 0, 0, 0, 4, 3),
       cftime.DatetimeNoLeap(1850, 1, 4, 0, 0, 0, 0, 5, 4),
       cftime.DatetimeNoLeap(1850, 1, 5, 0, 0, 0, 0, 6, 5),
       cftime.DatetimeNoLeap(1850, 1, 6, 0, 0, 0, 0, 0, 6),
       cftime.DatetimeNoLeap(1850, 1, 7, 0, 0, 0, 0, 1, 7),
       cftime.DatetimeNoLeap(1850, 1, 8, 0, 0, 0, 0, 2, 8),
       cftime.DatetimeNoLeap(1850, 1, 9, 0, 0, 0, 0, 3, 9),
       cftime.DatetimeNoLeap(1850, 1, 10, 0, 0, 0, 0, 4, 10),
       cftime.DatetimeNoLeap(1850, 1, 11, 0, 0, 0, 0, 5, 11),
       cftime.DatetimeNoLeap(1850, 1, 12, 0, 0, 0, 0, 6, 12),
       cftime.DatetimeNoLeap(1850, 1, 13, 0, 0, 0, 0, 0, 13)], dtype=object)
Coordinates:
  * time     (time) object 1850-01-02 00:00:00 ... 1850-01-13 00:00:00
Attributes:
    bounds:   time_bnds

Even worse, time_bnds:

dsm.time_bnds

<xarray.DataArray 'time_bnds' (time: 12, ntb: 2)>
array([[cftime.DatetimeNoLeap(2000, 1, 1, 0, 0, 0, 0, 5, 1),
        cftime.DatetimeNoLeap(2000, 2, 1, 0, 0, 0, 0, 1, 32)],
       [cftime.DatetimeNoLeap(2000, 2, 1, 0, 0, 0, 0, 1, 32),
        cftime.DatetimeNoLeap(2000, 3, 1, 0, 0, 0, 0, 1, 60)],
       [cftime.DatetimeNoLeap(2000, 3, 1, 0, 0, 0, 0, 1, 60),
        cftime.DatetimeNoLeap(2000, 4, 1, 0, 0, 0, 0, 4, 91)],
       [cftime.DatetimeNoLeap(2000, 4, 1, 0, 0, 0, 0, 4, 91),
        cftime.DatetimeNoLeap(2000, 5, 1, 0, 0, 0, 0, 6, 121)],
       [cftime.DatetimeNoLeap(2000, 5, 1, 0, 0, 0, 0, 6, 121),
        cftime.DatetimeNoLeap(2000, 6, 1, 0, 0, 0, 0, 2, 152)],
       [cftime.DatetimeNoLeap(2000, 6, 1, 0, 0, 0, 0, 2, 152),
        cftime.DatetimeNoLeap(2000, 7, 1, 0, 0, 0, 0, 4, 182)],
       [cftime.DatetimeNoLeap(2000, 7, 1, 0, 0, 0, 0, 4, 182),
        cftime.DatetimeNoLeap(2000, 8, 1, 0, 0, 0, 0, 0, 213)],
       [cftime.DatetimeNoLeap(2000, 8, 1, 0, 0, 0, 0, 0, 213),
        cftime.DatetimeNoLeap(2000, 9, 1, 0, 0, 0, 0, 3, 244)],
       [cftime.DatetimeNoLeap(2000, 9, 1, 0, 0, 0, 0, 3, 244),
        cftime.DatetimeNoLeap(2000, 10, 1, 0, 0, 0, 0, 5, 274)],
       [cftime.DatetimeNoLeap(2000, 10, 1, 0, 0, 0, 0, 5, 274),
        cftime.DatetimeNoLeap(2000, 11, 1, 0, 0, 0, 0, 1, 305)],
       [cftime.DatetimeNoLeap(2000, 11, 1, 0, 0, 0, 0, 1, 305),
        cftime.DatetimeNoLeap(2000, 12, 1, 0, 0, 0, 0, 3, 335)],
       [cftime.DatetimeNoLeap(2000, 12, 1, 0, 0, 0, 0, 3, 335),
        cftime.DatetimeNoLeap(2001, 1, 1, 0, 0, 0, 0, 6, 1)]], dtype=object)
Coordinates:
  * time     (time) object 1850-01-02 00:00:00 ... 1850-01-13 00:00:00
Dimensions without coordinates: ntb
andersy005 commented 5 years ago

@matt-long, with #133, I added a new optional argument, method to allow users to control how they want the time values to be computed:

By default, midpoint values are computed:

dsm = esmlab.resample(ds, freq='mon')
dsm.time
<xarray.DataArray 'time' (time: 12)>
array([cftime.DatetimeNoLeap(2000, 1, 16, 12, 0, 0, 0, 6, 16),
       cftime.DatetimeNoLeap(2000, 2, 15, 0, 0, 0, 0, 1, 46),
       cftime.DatetimeNoLeap(2000, 3, 16, 12, 0, 0, 0, 2, 75),
       cftime.DatetimeNoLeap(2000, 4, 16, 0, 0, 0, 0, 5, 106),
       cftime.DatetimeNoLeap(2000, 5, 16, 12, 0, 0, 0, 0, 136),
       cftime.DatetimeNoLeap(2000, 6, 16, 0, 0, 0, 0, 3, 167),
       cftime.DatetimeNoLeap(2000, 7, 16, 12, 0, 0, 0, 5, 197),
       cftime.DatetimeNoLeap(2000, 8, 16, 12, 0, 0, 0, 1, 228),
       cftime.DatetimeNoLeap(2000, 9, 16, 0, 0, 0, 0, 4, 259),
       cftime.DatetimeNoLeap(2000, 10, 16, 12, 0, 0, 0, 6, 289),
       cftime.DatetimeNoLeap(2000, 11, 16, 0, 0, 0, 0, 2, 320),
       cftime.DatetimeNoLeap(2000, 12, 16, 12, 0, 0, 0, 4, 350)], dtype=object)
Coordinates:
  * time     (time) object 2000-01-16 12:00:00 ... 2000-12-16 12:00:00
Attributes:
    bounds:   time_bnds
dsm = esmlab.resample(ds, freq='mon', method='left')
dsm.time
<xarray.DataArray 'time' (time: 12)>
array([cftime.DatetimeNoLeap(2000, 1, 1, 12, 0, 0, 0, 5, 1),
       cftime.DatetimeNoLeap(2000, 2, 1, 12, 0, 0, 0, 1, 32),
       cftime.DatetimeNoLeap(2000, 3, 1, 12, 0, 0, 0, 1, 60),
       cftime.DatetimeNoLeap(2000, 4, 1, 12, 0, 0, 0, 4, 91),
       cftime.DatetimeNoLeap(2000, 5, 1, 12, 0, 0, 0, 6, 121),
       cftime.DatetimeNoLeap(2000, 6, 1, 12, 0, 0, 0, 2, 152),
       cftime.DatetimeNoLeap(2000, 7, 1, 12, 0, 0, 0, 4, 182),
       cftime.DatetimeNoLeap(2000, 8, 1, 12, 0, 0, 0, 0, 213),
       cftime.DatetimeNoLeap(2000, 9, 1, 12, 0, 0, 0, 3, 244),
       cftime.DatetimeNoLeap(2000, 10, 1, 12, 0, 0, 0, 5, 274),
       cftime.DatetimeNoLeap(2000, 11, 1, 12, 0, 0, 0, 1, 305),
       cftime.DatetimeNoLeap(2000, 12, 1, 12, 0, 0, 0, 3, 335)], dtype=object)
Coordinates:
  * time     (time) object 2000-01-01 12:00:00 ... 2000-12-01 12:00:00
Attributes:
    bounds:   time_bnds
dsm = esmlab.resample(ds, freq='mon', method='right')
dsm.time
<xarray.DataArray 'time' (time: 12)>
array([cftime.DatetimeNoLeap(2000, 1, 31, 12, 0, 0, 0, 0, 31),
       cftime.DatetimeNoLeap(2000, 2, 28, 12, 0, 0, 0, 0, 59),
       cftime.DatetimeNoLeap(2000, 3, 31, 12, 0, 0, 0, 3, 90),
       cftime.DatetimeNoLeap(2000, 4, 30, 12, 0, 0, 0, 5, 120),
       cftime.DatetimeNoLeap(2000, 5, 31, 12, 0, 0, 0, 1, 151),
       cftime.DatetimeNoLeap(2000, 6, 30, 12, 0, 0, 0, 3, 181),
       cftime.DatetimeNoLeap(2000, 7, 31, 12, 0, 0, 0, 6, 212),
       cftime.DatetimeNoLeap(2000, 8, 31, 12, 0, 0, 0, 2, 243),
       cftime.DatetimeNoLeap(2000, 9, 30, 12, 0, 0, 0, 4, 273),
       cftime.DatetimeNoLeap(2000, 10, 31, 12, 0, 0, 0, 0, 304),
       cftime.DatetimeNoLeap(2000, 11, 30, 12, 0, 0, 0, 2, 334),
       cftime.DatetimeNoLeap(2000, 12, 31, 12, 0, 0, 0, 5, 365)], dtype=object)
Coordinates:
  * time     (time) object 2000-01-31 12:00:00 ... 2000-12-31 12:00:00
Attributes:
    bounds:   time_bnds