pydata / xarray

Error time slicing for some CMIP6 models #3627

Closed: mikebyrne6 closed this issue 4 years ago

mikebyrne6 commented 4 years ago

Hi there,

I'm using xarray's open_zarr() function to analyse CMIP6 data:

import gcsfs
import xarray as xr

# Function to load data: df_data has the catalogue of the variable of interest
def load_data(df_data, source_id, expt_id):
    """
    Load data for given variable, source and expt ids.
    """
    uri = df_data[(df_data.source_id == source_id) &
                  (df_data.experiment_id == expt_id)].zstore.values[0]

    gcs = gcsfs.GCSFileSystem(token='anon')
    ds = xr.open_zarr(gcs.get_mapper(uri), consolidated=True)
    return ds

# Just test with 1 model for now:
source_ids_tmp = ['CESM2']

for model_name in source_ids_tmp:
    print('\n\nStarting ' + model_name + '\n')
    ds_hist = load_data(df_mon_tas, model_name, experiment_ids[0]).sel(time=slice('1976', '2005'))

Problem Description

However, the time slicing fails with the following error:

TypeError: cannot compare netcdftime._netcdftime.DatetimeNoLeap(1932, 7, 15, 12, 0, 0, 0, 1, 196) and '1976'

Looking at the Dataset metadata, the time variable is described as follows:

  • time (time) object 1850-01-15 12:00:00 ... 2014-12-15 12:00:00
    time_bnds (time, nbnd) object dask.array<chunksize=(1980, 2)>

For another CMIP6 model (MIROC6), the time slicing works fine. That model has the following metadata for time:

  • time (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12-16T12:00:00
    time_bnds (time, bnds) datetime64[ns] dask.array<chunksize=(1980, 2)>

When the time variable is converted to a datetime64[ns] object, time slicing seems to work. Any idea what the problem is?

Thanks,

Mike

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
libhdf5: 1.10.1
libnetcdf: 4.4.1.1
xarray: 0.14.0
pandas: 0.25.1
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.3.1
pydap: installed
h5netcdf: 0.5.0
h5py: 2.7.0
Nio: None
zarr: 2.3.2
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 0.15.3
distributed: 1.19.1
matplotlib: 3.0.0
cartopy: 0.16.0
seaborn: 0.9.0
numbagg: None
setuptools: 36.5.0.post20170921
pip: 19.0.3
conda: 4.4.6
pytest: 3.3.0
IPython: 6.1.0
sphinx: 1.6.3

spencerkclark commented 4 years ago

Hi @mikebyrne6 -- it looks like cftime is not installed on your system (your show_versions output lists cftime: None). Could you install it and try again? Time indexing for non-standard calendars goes through a CFTimeIndex, which requires cftime.

$ conda install -c conda-forge cftime
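
Before installing anything, it can help to confirm which optional packages the active environment can actually import. The snippet below is a minimal sketch using only the standard library. The helper name `is_installed` is hypothetical and is not part of xarray or cftime.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be imported in this environment.

    `is_installed` is a hypothetical helper for illustration only.
    """
    return importlib.util.find_spec(module_name) is not None


# Report on the two packages discussed in this thread.
for name in ("xarray", "cftime"):
    status = "available" if is_installed(name) else "MISSING"
    print(f"{name}: {status}")
```

If cftime is reported as missing, that matches the "cftime: None" entry in the show_versions output above.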
mikebyrne6 commented 4 years ago

Hi @spencerkclark,

Thanks a lot for the quick response! Indeed I did not have cftime installed. After installing, the time-slicing error has evolved:

~/anaconda3/lib/python3.6/site-packages/xarray/coding/cftimeindex.py in _parse_iso8601_with_reso(date_type, timestr)
    114     # 1.0.3.4.
    115     replace["dayofwk"] = -1
--> 116     return default.replace(**replace), resolution
    117
    118

cftime/_cftime.pyx in cftime._cftime.datetime.replace()

ValueError: Replacing the dayofyr or dayofwk of a datetime is not supported.

Any ideas here?

Cheers,

Mike

spencerkclark commented 4 years ago

Ah yes, there were some changes in the latest version of cftime that we needed to accommodate in xarray (see https://github.com/pydata/xarray/pull/3430). Try upgrading xarray to version 0.14.1 and I think you should be good:

$ conda upgrade -c conda-forge xarray

mikebyrne6 commented 4 years ago

Thanks again @spencerkclark! I upgraded xarray as suggested but I'm still getting the error directly above...

spencerkclark commented 4 years ago

Could you add a few more details to your example above so that I can try reproducing the issue? I think example values of df_mon_tas and experiment_ids would be all I need.

mikebyrne6 commented 4 years ago

The data I'm trying to work with are for CESM2 ('historical' simulation) and are stored in the Google Cloud at:

gs://cmip6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Amon/tas/gn/

spencerkclark commented 4 years ago

Hmm...I can't seem to reproduce the issue. Are you sure you are using the latest release of xarray?

In [1]: import gcsfs; import xarray as xr

In [2]: gcs = gcsfs.GCSFileSystem(token='anon')

In [3]: mapper = gcs.get_mapper('gs://cmip6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Amon/tas/gn/')

In [4]: ds = xr.open_zarr(mapper, consolidated=True)

In [5]: ds.sel(time=slice('1975', '2005'))
Out[5]:
<xarray.Dataset>
Dimensions:    (lat: 192, lon: 288, nbnd: 2, time: 372)
Coordinates:
  * lat        (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
    lat_bnds   (lat, nbnd) float32 dask.array<chunksize=(192, 2), meta=np.ndarray>
  * lon        (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
    lon_bnds   (lon, nbnd) float32 dask.array<chunksize=(288, 2), meta=np.ndarray>
  * time       (time) object 1975-01-15 12:00:00 ... 2005-12-15 12:00:00
    time_bnds  (time, nbnd) object dask.array<chunksize=(372, 2), meta=np.ndarray>
Dimensions without coordinates: nbnd
Data variables:
    tas        (time, lat, lon) float32 dask.array<chunksize=(300, 192, 288), meta=np.ndarray>
Attributes:
    Conventions:            CF-1.7 CMIP-6.2
    activity_id:            CMIP
    branch_method:          standard
    branch_time_in_child:   674885.0
    branch_time_in_parent:  219000.0
    case_id:                15
    cesm_casename:          b.e21.BHIST.f09_g17.CMIP6-historical.001
    contact:                cesm_cmip6@ucar.edu
    creation_date:          2019-01-16T23:34:05Z
    data_specs_version:     01.00.29
    experiment:             all-forcing simulation of the recent past
    experiment_id:          historical
    external_variables:     areacella
    forcing_index:          1
    frequency:              mon
    further_info_url:       https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2.h...
    grid:                   native 0.9x1.25 finite volume grid (192x288 latxlon)
    grid_label:             gn
    initialization_index:   1
    institution:            National Center for Atmospheric Research, Climate...
    institution_id:         NCAR
    license:                CMIP6 model data produced by <The National Center...
    mip_era:                CMIP6
    model_doi_url:          https://doi.org/10.5065/D67H1H0V
    nominal_resolution:     100 km
    parent_activity_id:     CMIP
    parent_experiment_id:   piControl
    parent_mip_era:         CMIP6
    parent_source_id:       CESM2
    parent_time_units:      days since 0001-01-01 00:00:00
    parent_variant_label:   r1i1p1f1
    physics_index:          1
    product:                model-output
    realization_index:      1
    realm:                  atmos
    source:                 CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite v...
    source_id:              CESM2
    source_type:            AOGCM BGC
    sub_experiment:         none
    sub_experiment_id:      none
    table_id:               Amon
    tracking_id:            hdl:21.14100/d9a7225a-49c3-4470-b7ab-a8180926f839
    variable_id:            tas
    variant_info:           CMIP6 20th century experiments (1850-2014) with C...
    variant_label:          r1i1p1f1
    status:                 2019-10-25;created;by nhn2@columbia.edu

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 14:38:56) [Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 19.0.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: None
xarray: 0.14.1
pandas: 0.25.0
numpy: 1.17.0
scipy: 1.3.1
netCDF4: None
pydap: installed
h5netcdf: 0.7.4
h5py: 2.9.0
Nio: None
zarr: 2.3.2
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.0.25
cfgrib: 0.9.7.1
iris: None
bottleneck: 1.2.1
dask: 2.9.0+2.gd0daa5bc
distributed: 2.9.0
matplotlib: 3.1.1
cartopy: None
seaborn: 0.9.0
numbagg: installed
setuptools: 42.0.2.post20191201
pip: 19.2.2
conda: None
pytest: 5.0.1
IPython: 7.10.1
sphinx: None

mikebyrne6 commented 4 years ago

You're right, I'm still running version 0.14.0... Somehow the upgrade to 0.14.1 did not work, will try again...
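
Since the upgrade silently failed here, a quick check that the running interpreter really sees xarray 0.14.1 or later can save a round trip. The following is a rough sketch, assuming Python 3.8+ for `importlib.metadata`. The helpers `version_tuple`, `meets_minimum` and `installed_version` are hypothetical names, and the comparison only looks at leading numeric components; it is not a full PEP 440 parser.

```python
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Convert a dotted version such as '0.14.1' into (0, 14, 1).

    Only the leading digits of each component are used; parsing stops at
    the first component that has no leading digits.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at least `minimum` (numeric parts only)."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_version(package: str):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# 0.14.1 is the xarray release suggested earlier in this thread.
found = installed_version("xarray")
if found is None:
    print("xarray is not installed in this environment")
elif meets_minimum(found, "0.14.1"):
    print(f"xarray {found} is recent enough")
else:
    print(f"xarray {found} is older than 0.14.1; the upgrade did not take effect")
```

Running this in the same environment (and the same kernel, if using Jupyter) as the failing code is what matters, since an upgrade applied to a different conda environment would not show up.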

mikebyrne6 commented 4 years ago

All sorted @spencerkclark, many thanks for so generously helping out an xarray beginner!

rabernat commented 4 years ago

I hope you enjoy drinking our kool-aid @mikebyrne6! Thanks for a useful bug report. I predict we will be seeing more of you around here. 😉