pydata / xarray

N-D labeled arrays and datasets in Python

cftime resampling error #9108

Open dcherian opened 3 weeks ago

dcherian commented 3 weeks ago

What happened?

Something is very wrong with CFTime resampling for some inputs: the example below raises a ValueError.

What did you expect to happen?

No error

Minimal Complete Verifiable Example

import dask.array
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"pr": ("time", dask.array.random.random((10,), chunks=(10,)))},
    coords={"time": xr.date_range("0001-01-01", periods=10, freq="D")},
)
# Resample call taken from the reproduction attempt later in this thread.
ds.resample(time="ME")

ValueError: Data shape (9,) must match shape of object (10,)

spencerkclark commented 3 weeks ago

Can you show the output of xr.show_versions()? I am actually not able to reproduce this in the two environments I've tried (one has xarray main installed):

>>> xr.show_versions()

commit: None
python: 3.12.3 | packaged by conda-forge | (main, Apr 15 2024, 18:50:49) [Clang 16.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 23.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2

xarray: 2023.4.3.dev863+gce196d56
pandas: 2.2.2
numpy: 1.26.4
scipy: 1.13.1
netCDF4: 1.6.5
pydap: installed
h5netcdf: 1.3.0
h5py: 3.11.0
zarr: 2.18.2
cftime: 1.6.3
nc_time_axis: 1.4.1
iris: 3.9.0
bottleneck: 1.3.8
dask: 2024.5.2
distributed: 2024.5.2
matplotlib: 3.8.4
cartopy: 0.23.0
seaborn: 0.13.2
numbagg: 0.8.1
fsspec: 2024.6.0
cupy: None
pint: None
sparse: 0.15.4
flox: 0.9.8
numpy_groupies: 0.11.1
setuptools: 70.0.0
pip: 24.0
conda: None
pytest: 8.2.2
mypy: None
IPython: None
sphinx: None

dcherian commented 3 weeks ago

Here are the versions:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:34:54) [Clang 16.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 23.2.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2

xarray: 2024.5.0
pandas: 2.2.2
numpy: 1.26.4
scipy: 1.13.1
netCDF4: 1.6.5
pydap: None
h5netcdf: 1.3.0
h5py: 3.11.0
zarr: 2.18.0
cftime: 1.6.4
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.5.2
distributed: 2024.5.2
matplotlib: 3.8.4
cartopy: 0.23.0
seaborn: 0.13.2
numbagg: 0.8.1
fsspec: 2024.6.0
cupy: None
pint: 0.23
sparse: 0.15.4
flox: 0.9.8
numpy_groupies: 0.11.1
setuptools: 70.0.0
pip: 24.0
conda: None
pytest: 8.2.2
mypy: None
IPython: 8.25.0
sphinx: 7.3.7
```

I get a bunch of these warnings too:

/Users/deepak/miniforge3/envs/xarray-release/lib/python3.11/site-packages/xarray/coding/ CFWarning: year=0 was specified - this date/calendar/year zero convention is not supported by CF
  reference = type(date)(year, month, 1)

spencerkclark commented 3 weeks ago

Yeah, I get those warnings too. We may decide to do something to silence those, but I think that's a separate issue (not that it necessarily excuses it, but I think they have existed for a while for this case).
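
Until that is decided, the warnings can be hidden locally. The following is a minimal sketch that uses only the standard library `warnings` module and filters on the message text quoted above. The helper name `quiet_year_zero_warnings` and the stand-in function `emit_sample_warning` are hypothetical and are not part of xarray or cftime.

```python
import warnings
from contextlib import contextmanager

# Message fragment copied from the warnings quoted in this thread.
# warnings.filterwarnings treats it as a regex matched from the start
# of the warning message, hence the leading ".*".
YEAR_ZERO_PATTERN = r".*year zero convention is not supported by CF"


@contextmanager
def quiet_year_zero_warnings():
    """Hypothetical helper: hide only the year-zero CF warnings in the block."""
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", message=YEAR_ZERO_PATTERN)
        yield


def emit_sample_warning():
    """Stand-in for the xarray/cftime call that emits the warning."""
    warnings.warn(
        "this date/calendar/year zero convention is not supported by CF",
        UserWarning,
    )


if __name__ == "__main__":
    with quiet_year_zero_warnings():
        emit_sample_warning()  # suppressed inside the block
    emit_sample_warning()  # shown again outside the block
```

In a real session the `with quiet_year_zero_warnings():` block would wrap the `ds.resample(time="ME")` call. Note that `warnings.catch_warnings` is not thread-safe, so this may not cover warnings raised from worker threads during a dask compute.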

Weirdly I still cannot reproduce the ValueError with an environment built to be just like yours:

```
$ python
Python 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:45:13) [Clang 16.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dask.array; import numpy as np; import xarray as xr
>>> ds = xr.Dataset({"pr": ("time", dask.array.random.random((10,), chunks=(10,)))},coords={"time": xr.date_range("0001-01-01", periods=10, freq="D")},)
>>> ds.resample(time="ME")
/Users/spencer/mambaforge/envs/2024-06-13-cftime-resample-env/lib/python3.11/site-packages/xarray/coding/ CFWarning: year=0 was specified - this date/calendar/year zero convention is not supported by CF
  reference = type(date)(year, month, 1)
/Users/spencer/mambaforge/envs/2024-06-13-cftime-resample-env/lib/python3.11/site-packages/xarray/coding/ CFWarning: this date/calendar/year zero convention is not supported by CF
  reference = type(date)(year, month, 1)
/Users/spencer/mambaforge/envs/2024-06-13-cftime-resample-env/lib/python3.11/site-packages/xarray/coding/ CFWarning: this date/calendar/year zero convention is not supported by CF
  return (reference - timedelta(days=1)).day
/Users/spencer/mambaforge/envs/2024-06-13-cftime-resample-env/lib/python3.11/site-packages/xarray/coding/ CFWarning: this date/calendar/year zero convention is not supported by CF
  return date.replace(year=year, month=month, day=day)
/Users/spencer/mambaforge/envs/2024-06-13-cftime-resample-env/lib/python3.11/site-packages/xarray/coding/ CFWarning: this date/calendar/year zero convention is not supported by CF
  return (reference - timedelta(days=1)).day
/Users/spencer/mambaforge/envs/2024-06-13-cftime-resample-env/lib/python3.11/site-packages/xarray/coding/ CFWarning: this date/calendar/year zero convention is not supported by CF
  return CFTimeIndex(np.array(self) + other)
DatasetResample, grouped over '__resample_dim__'
1 groups with labels 0001-01-31, 00:00:00.
>>> ds.resample(time="ME").mean().compute()
/Users/spencer/mambaforge/envs/2024-06-13-cftime-resample-env/lib/python3.11/site-packages/xarray/coding/ CFWarning: year=0 was specified - this date/calendar/year zero convention is not supported by CF
  reference = type(date)(year, month, 1)
/Users/spencer/mambaforge/envs/2024-06-13-cftime-resample-env/lib/python3.11/site-packages/xarray/coding/ CFWarning: this date/calendar/year zero convention is not supported by CF
  reference = type(date)(year, month, 1)
/Users/spencer/mambaforge/envs/2024-06-13-cftime-resample-env/lib/python3.11/site-packages/xarray/coding/ CFWarning: this date/calendar/year zero convention is not supported by CF
  return (reference - timedelta(days=1)).day
/Users/spencer/mambaforge/envs/2024-06-13-cftime-resample-env/lib/python3.11/site-packages/xarray/coding/ CFWarning: this date/calendar/year zero convention is not supported by CF
  return date.replace(year=year, month=month, day=day)
/Users/spencer/mambaforge/envs/2024-06-13-cftime-resample-env/lib/python3.11/site-packages/xarray/coding/ CFWarning: this date/calendar/year zero convention is not supported by CF
  return (reference - timedelta(days=1)).day
/Users/spencer/mambaforge/envs/2024-06-13-cftime-resample-env/lib/python3.11/site-packages/xarray/coding/ CFWarning: this date/calendar/year zero convention is not supported by CF
  return CFTimeIndex(np.array(self) + other)
OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
<xarray.Dataset> Size: 16B
Dimensions:  (time: 1)
Coordinates:
  * time     (time) object 8B 0001-01-31 00:00:00
Data variables:
    pr       (time) float64 8B 0.4763
>>> xr.show_versions()
/Users/spencer/mambaforge/envs/2024-06-13-cftime-resample-env/lib/python3.11/site-packages/_distutils_hack/ UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")

INSTALLED VERSIONS
------------------
commit: None
python: 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:45:13) [Clang 16.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 23.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2

xarray: 2024.5.0
pandas: 2.2.2
numpy: 1.26.4
scipy: 1.13.1
netCDF4: 1.6.5
pydap: None
h5netcdf: 1.3.0
h5py: 3.11.0
zarr: 2.18.0
cftime: 1.6.4
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.5.2
distributed: 2024.5.2
matplotlib: 3.8.4
cartopy: 0.23.0
seaborn: 0.13.2
numbagg: 0.8.1
fsspec: 2024.6.0
cupy: None
pint: 0.23
sparse: 0.15.4
flox: 0.9.8
numpy_groupies: 0.11.1
setuptools: 70.0.0
pip: 24.0
conda: None
pytest: 8.2.2
mypy: None
IPython: 8.25.0
sphinx: 7.3.7
```

This is the only diff in versions:

$ diff Deepak Spencer
< python: 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:34:54) [Clang 16.0.6 ]
> python: 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:45:13) [Clang 16.0.6 ]
< OS-release: 23.2.0
< machine: arm64
< processor: arm
> OS-release: 23.5.0
> machine: x86_64
> processor: i386
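
For anyone else comparing environments, the same check can be done in Python rather than with `diff`. This is only a sketch: the helper names `parse_versions` and `differing_keys` are hypothetical, and the two sample reports are shortened to a few fields copied from the outputs above.

```python
def parse_versions(report: str) -> dict[str, str]:
    """Turn the 'name: value' lines of a versions report into a dict.

    Lines without a colon (headers, rules, blanks) are skipped.
    """
    versions = {}
    for line in report.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip():
            versions[name.strip()] = value.strip()
    return versions


def differing_keys(report_a: str, report_b: str) -> list[str]:
    """Return the sorted names whose values differ between two reports."""
    left, right = parse_versions(report_a), parse_versions(report_b)
    return sorted(
        key for key in left.keys() | right.keys() if left.get(key) != right.get(key)
    )


# Shortened sample reports using values quoted in this thread.
DEEPAK_REPORT = """INSTALLED VERSIONS
------------------
OS-release: 23.2.0
machine: arm64
processor: arm
xarray: 2024.5.0
cftime: 1.6.4
"""

SPENCER_REPORT = """INSTALLED VERSIONS
------------------
OS-release: 23.5.0
machine: x86_64
processor: i386
xarray: 2024.5.0
cftime: 1.6.4
"""

if __name__ == "__main__":
    print(differing_keys(DEEPAK_REPORT, SPENCER_REPORT))
```

With the full reports pasted in, the `python` line would also show up because of the differing build timestamp.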