pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0

nightly failure with h5netcdf indexing #7154

Closed dcherian closed 2 years ago

dcherian commented 2 years ago

What happened?

From upstream-dev CI: Workflow Run URL

Python 3.10 Test Summary

```
xarray/tests/test_backends.py::TestH5NetCDFData::test_orthogonal_indexing: AssertionError: Left and right Dataset objects are not identical
Differing coordinates:
L   numbers  (dim3) int64 0 1 2 0 0
R   numbers  (dim3) int64 ...
L * dim3     (dim3)
```

cc @benbovy @kmuehlbauer

Environment

INSTALLED VERSIONS
------------------
commit: 8eea8bb67bad0b5ac367c082125dd2b2519d4f52
python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:35:26) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-1020-azure
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1
xarray: 2022.9.1.dev12+g8eea8bb6
pandas: 1.6.0.dev0+297.g55dc32437e
numpy: 1.24.0.dev0+896.g5ecaf36cd
scipy: 1.10.0.dev0+2012.5be8bc4
netCDF4: 1.6.0
pydap: installed
h5netcdf: 1.1.0.dev5+g1168b4f
h5py: 3.7.0
Nio: None
zarr: 2.13.4.dev1
cftime: 1.6.2
nc_time_axis: 1.3.1.dev34+g0999938
PseudoNetCDF: 3.2.2
rasterio: 1.4dev
cfgrib: 0.9.10.2
iris: 3.3.1
bottleneck: 1.3.5
dask: 2022.9.2+17.g5ba240b9
distributed: 2022.9.2+19.g07e22593
matplotlib: 3.7.0.dev320+g834c89c512
cartopy: 0.21.0
seaborn: 0.12.0
numbagg: None
fsspec: 2022.8.2+14.g3969aaf
cupy: None
pint: 0.19.3.dev87+g052a920
sparse: None
flox: 0.5.11.dev3+g031979d
numpy_groupies: 0.9.19
setuptools: 65.4.1
pip: 22.2.2
conda: None
pytest: 7.1.3
IPython: None
sphinx: None

kmuehlbauer commented 2 years ago

Thanks @dcherian for the ping. This is most likely due to an upstream change (a recent merge) in h5netcdf. I'll investigate over the next few days.

kmuehlbauer commented 2 years ago

OMG, that's already been failing for 12 days. How could we miss this?

dcherian commented 2 years ago

Our GitHub Action updates any existing open "nightly test failed" issue, and we forgot to close the last one...
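
The update-or-create behaviour described here can be sketched roughly as follows. This is a minimal illustration in plain Python, not the actual GitHub Action: the `Issue` class, the `report_failure` function, and the log strings are hypothetical names made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    # Hypothetical stand-in for a tracker issue; not a real GitHub API object.
    title: str
    body: str
    state: str = "open"


def report_failure(issues, title, log):
    """Update the open issue with this title if one exists, else create one.

    Returns the issue and a flag telling whether it was newly created.
    """
    for issue in issues:
        if issue.state == "open" and issue.title == title:
            # An open issue already exists: overwrite its body
            # instead of opening a new issue.
            issue.body = log
            return issue, False
    new_issue = Issue(title=title, body=log)
    issues.append(new_issue)
    return new_issue, True


# A stale report that was never closed absorbs the next failure.
tracker = [Issue("nightly test failed", "old failure log")]
issue, created = report_failure(
    tracker, "nightly test failed", "h5netcdf indexing failure"
)
print(created, len(tracker))
```

Under this model, a report that is left open keeps being rewritten in place, so an unrelated later failure never shows up as a fresh issue.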

kmuehlbauer commented 2 years ago

I'll add two runs to the h5netcdf CI that run xarray's h5netcdf-related tests against the latest release and the latest commit, to get an early warning if something breaks xarray.

What would be the correct pytest incantation to run all h5netcdf-related tests?

dcherian commented 2 years ago

Thanks! This seems to work (see https://docs.pytest.org/en/7.1.x/how-to/usage.html#specifying-which-tests-to-run).

> pytest -k "H5NetCDF"
==================================================================================================================================================== test session starts =====================================================================================================================================================
platform darwin -- Python 3.10.6, pytest-7.1.3, pluggy-1.0.0
rootdir: /Users/dcherian/work/python/xarray, configfile: setup.cfg, testpaths: xarray/tests, properties
plugins: xdist-2.5.0, forked-1.4.0, env-0.6.2, hypothesis-6.56.1, cov-4.0.0
collected 15998 items / 15723 deselected / 1 skipped / 275 selected

xarray/tests/test_backends.py .......................X.......x.......................................ss...................................X.......x.......................................ss...................................X.......x.......................................ss..............................        [ 99%]
xarray/tests/test_distributed.py ..                                                                                                                                                                                                                                                                                    [100%]

====================================================================================================================================================== warnings summary ======================================================================================================================================================
xarray/tests/test_backends.py::TestH5NetCDFData::test_zero_dimensional_variable
  /Users/dcherian/mambaforge/envs/xarray-release/lib/python3.10/site-packages/cfgrib/xarray_plugin.py:11: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
    if LooseVersion(xr.__version__) <= "0.17.0":

xarray/tests/test_backends.py::TestH5NetCDFData::test_zero_dimensional_variable
  /Users/dcherian/mambaforge/envs/xarray-release/lib/python3.10/site-packages/setuptools/_distutils/version.py:346: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
    other = LooseVersion(other)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=============================================================================================================== 263 passed, 7 skipped, 15723 deselected, 3 xfailed, 3 xpassed, 2 warnings in 86.98s (0:01:26) ================================================================================================================
/Users/dcherian/mambaforge/envs/xarray-release/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 4 leaked semaphore objects to clean up at shutdown

And I do something similar in flox with the groupby tests. It works really well!
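
For readers unfamiliar with `-k`, here is a deliberately simplified model of the selection in plain Python. It assumes only a case-insensitive substring match against the test ID; real pytest also accepts `and`/`or`/`not` expressions and matches markers and other keywords, none of which is modelled. The function name `select_by_keyword` and the last two test IDs are made up for illustration.

```python
def select_by_keyword(node_ids, keyword):
    """Simplified model of `pytest -k`: case-insensitive substring match.

    Real pytest also accepts `and`/`or`/`not` expressions and matches
    markers and other keywords; none of that is modelled here.
    """
    needle = keyword.lower()
    return [node_id for node_id in node_ids if needle in node_id.lower()]


node_ids = [
    # The first two IDs appear in this thread; the last two are made up.
    "xarray/tests/test_backends.py::TestH5NetCDFData::test_orthogonal_indexing",
    "xarray/tests/test_backends.py::TestH5NetCDFData::test_zero_dimensional_variable",
    "xarray/tests/test_example.py::test_roundtrip[h5netcdf]",
    "xarray/tests/test_example.py::TestOtherBackend::test_roundtrip",
]

selected = select_by_keyword(node_ids, "H5NetCDF")
print(len(selected))
```

Because the match ignores case, the keyword picks up both the `TestH5NetCDF...` classes and tests whose parametrization ID contains `h5netcdf`.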

kmuehlbauer commented 2 years ago

Thanks a lot @dcherian, much appreciated.

kmuehlbauer commented 2 years ago

@dcherian As assumed, the source was a regression in the __getitem__ function over at h5netcdf. A patch has already been merged. I've added a CI run against xarray's GitHub main to be on the safe side, now and in the future. Thanks for your help.
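
The thread does not say what exactly the regression was, so the following is only background on the name of the failing test, `test_orthogonal_indexing`. It is a pure-Python sketch of what orthogonal (outer) indexing means, contrasted with pointwise indexing; the helper names and the sample data are hypothetical and unrelated to h5netcdf's code.

```python
def orthogonal_index(rows, row_idx, col_idx):
    """Apply each indexer independently: every selected row x every selected column."""
    return [[rows[r][c] for c in col_idx] for r in row_idx]


def pointwise_index(rows, row_idx, col_idx):
    """Pair the indexers element by element instead."""
    return [rows[r][c] for r, c in zip(row_idx, col_idx)]


data = [
    [0, 1, 2],
    [10, 11, 12],
    [20, 21, 22],
]

outer = orthogonal_index(data, [0, 2], [1, 2])
paired = pointwise_index(data, [0, 2], [1, 2])
print(outer)
print(paired)
```

With the same two indexers, the orthogonal form yields a 2x2 block while the pointwise form yields two elements, which is why a backend's `__getitem__` has to be clear about which convention it implements.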

dcherian commented 2 years ago

Thanks for the quick fix, @kmuehlbauer!