euroargodev / argopy

A python library for Argo data beginners and experts
https://argopy.readthedocs.io
European Union Public License 1.2

Allow xarray>2024.03 in environment #390

Open scottyhq opened 2 months ago

scottyhq commented 2 months ago

We're trying to support argopy alongside other packages in the pangeo docker image (https://github.com/pangeo-data/pangeo-docker-images/pull/577)

But v0.1.16 pins xarray to <2024.03 https://github.com/euroargodev/argopy/issues/373#issuecomment-2312421966

My understanding of the linked upstream issue https://github.com/pydata/xarray/issues/8909 is that the 'ScipyArrayWrapper' object has no attribute 'oindex' error arises in the specific case of overwriting a netCDF file in an environment with scipy installed but without netCDF4. But argopy does list netCDF4 as a dependency. So @gmaze, could you please provide a bit more context around where the problem is seen for argopy? Cheers!
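For reference, a minimal sketch of how one could check which netCDF readers are importable in a given environment. It uses only the standard library; the list of module names is an assumption for illustration, and it says nothing about which engine xarray will actually pick for a given file.

```python
from importlib.util import find_spec

# Import names of the usual netCDF readers (illustrative list, adjust as needed).
NETCDF_BACKENDS = ("netCDF4", "h5netcdf", "scipy")


def importable_backends(names=NETCDF_BACKENDS):
    """Map each module name to True if it can be imported in this environment."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in importable_backends().items():
        print(f"{name:10s} {'found' if found else 'missing'}")
```

If scipy shows up as found and netCDF4 as missing, the environment matches the case described in the upstream issue.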

gmaze commented 2 months ago

Thanks for reporting this! Indeed, this xarray pinning is not satisfactory.

The error was raised in CI tests and later reported by some users, which is why we pinned xarray to <2024.03 as a temporary solution. We did not really check netCDF4 availability in the environment; it is a priori there in CI tests.
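As an illustration of what the pin means in practice, here is a minimal sketch that tests whether a version string falls under `<2024.03`. The helper names are hypothetical, and the parsing is a simplification of the full version-specifier rules; `packaging.version.Version` is the more robust choice for anything beyond plain `X.Y.Z` releases.

```python
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string):
    """Turn '2024.7.0' into (2024, 7, 0); stop at the first non-numeric part."""
    parts = []
    for chunk in version_string.split("."):
        if not chunk.isdigit():
            break
        parts.append(int(chunk))
    return tuple(parts)


def satisfies_upper_pin(version_string, upper=(2024, 3)):
    """True if the version is strictly below the bound, i.e. xarray<2024.03."""
    return release_tuple(version_string) < upper


if __name__ == "__main__":
    try:
        installed = version("xarray")
    except PackageNotFoundError:
        print("xarray is not installed")
    else:
        print(installed, "satisfies the pin:", satisfies_upper_pin(installed))
```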

let me double check if I can reproduce the error in an env where I'm sure netCDF4 is available

We also have the next release (v0.1.17) coming up soon, surely before the end of September; the xarray pin should be removed by then.

gmaze commented 2 months ago

@scottyhq

I can reproduce the error using the following environment file:

```yaml
name: argopy-xarray-unpinned
channels:
  - conda-forge
dependencies:
  - python = 3.10.14

# CORE:
  - aiohttp
  - decorator
  - erddapy
  - fsspec
  - netCDF4
  - packaging
  - requests
  - scipy
  - toolz
  - xarray

# EXT.UTIL:
  - boto3
  - gsw
  - s3fs > 2023.12.12
  - tqdm
  - zarr

# EXT.PERF:
  - dask
  - distributed
  - h5netcdf
  - pyarrow

# EXT.PLOT:
  - IPython
  - cartopy
  - ipykernel
  - ipywidgets
  - matplotlib
  - pyproj
  - seaborn

# DEV:
  - aiofiles
  - black
  - bottleneck
  - cfgrib
  - cftime
  - codespell
  - flake8
  - numpy
  - pandas
  - pip
  - pytest
  - pytest-cov
  - pytest-env
  - pytest-localftpserver
  - setuptools
#  - sphinx

# PIP:
  - pip:
    - pytest-reportlog
```

That leads to these libraries:

```python
>>> argopy.show_versions()

SYSTEM
------
commit: None
python: 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:53:34) [Clang 16.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 21.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.14.3
libnetcdf: 4.9.2

INSTALLED VERSIONS: CORE
------------------------
aiohttp : 3.10.5
argopy : 0.1.16
decorator : 5.1.1
erddapy : 2.2.0
fsspec : 2024.9.0
netCDF4 : 1.7.1
packaging : 24.1
requests : 2.32.3
scipy : 1.14.1
toolz : 0.12.1
xarray : 2024.7.0

INSTALLED VERSIONS: EXT.UTIL
----------------------------
boto3 : 1.35.7
gsw : 3.6.19
s3fs : 2024.9.0
tqdm : 4.66.5
zarr : 2.18.3

INSTALLED VERSIONS: EXT.PERF
----------------------------
dask : 2024.8.2
distributed : 2024.8.2
h5netcdf : 1.3.0
pyarrow : 17.0.0

INSTALLED VERSIONS: EXT.PLOT
----------------------------
IPython : 8.27.0
cartopy : 0.23.0
ipykernel : 6.29.5
ipywidgets : 8.1.5
matplotlib : 3.9.2
pyproj : 3.6.1
seaborn : 0.13.2

INSTALLED VERSIONS: DEV
-----------------------
aiofiles : 24.1.0
black : 24.8.0
bottleneck : 1.4.0
cfgrib : 0.9.14.0
cftime : 1.6.4
codespell : 2.3.0
flake8 : 7.1.1
numpy : 2.1.1
pandas : 2.2.2
pip : 24.2
pytest : 8.3.3
pytest_cov : 5.0.0
pytest_env : 1.1.4
pytest_localftpserver: -
setuptools : 73.0.1
sphinx : -

INSTALLED VERSIONS: PIP
-----------------------
pytest-reportlog: 0.4.0
```

hence including xarray 2024.7.0, scipy 1.14.1 and netCDF4 1.7.1

The classic argopy data fetching command:

import argopy
argopy.DataFetcher().float(6902746).to_xarray()

Returns the following traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/gmaze/git/github/euroargodev/argopy/argopy/fetchers.py", line 490, in to_xarray
    xds = self.postproccessor(xds)
  File "/Users/gmaze/git/github/euroargodev/argopy/argopy/fetchers.py", line 356, in postprocessing
    xds = self.fetcher.filter_data_mode(xds)
  File "/Users/gmaze/git/github/euroargodev/argopy/argopy/data_fetchers/erddap_data.py", line 834, in filter_data_mode
    ds = ds.argo.filter_data_mode(errors="ignore", **kwargs)
  File "/Users/gmaze/git/github/euroargodev/argopy/argopy/xarray.py", line 754, in filter_data_mode
    argo_r, argo_a, argo_d = ds_split_datamode(ds)
  File "/Users/gmaze/git/github/euroargodev/argopy/argopy/xarray.py", line 646, in ds_split_datamode
    argo_r = safe_where_eq(xds, "DATA_MODE", "R")
  File "/Users/gmaze/git/github/euroargodev/argopy/argopy/xarray.py", line 617, in safe_where_eq
    return xds.where(xds[key] == value, drop=True)
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/core/common.py", line 1212, in where
    return ops.where_method(self, cond, other)
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/core/ops.py", line 179, in where_method
    return apply_ufunc(
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/core/computation.py", line 1255, in apply_ufunc
    return apply_dataset_vfunc(
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/core/computation.py", line 523, in apply_dataset_vfunc
    list_of_coords, list_of_indexes = build_output_coords_and_indexes(
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/core/computation.py", line 252, in build_output_coords_and_indexes
    merged_vars, merged_indexes = merge_coordinates_without_align(
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/core/merge.py", line 413, in merge_coordinates_without_align
    merged_coords, merged_indexes = merge_collected(
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/core/merge.py", line 290, in merge_collected
    merged_vars[name] = unique_variable(
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/core/merge.py", line 123, in unique_variable
    out = out.set_dims(dim_lengths)
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/util/deprecation_helpers.py", line 140, in wrapper
    return func(*args, **kwargs)
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/core/variable.py", line 1377, in set_dims
    expanded_data = self.data
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/core/variable.py", line 449, in data
    return self._data.get_duck_array()
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/core/indexing.py", line 837, in get_duck_array
    self._ensure_cached()
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/core/indexing.py", line 831, in _ensure_cached
    self.array = as_indexable(self.array.get_duck_array())
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/core/indexing.py", line 788, in get_duck_array
    return self.array.get_duck_array()
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/core/indexing.py", line 647, in get_duck_array
    array = apply_indexer(self.array, self.key)
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/core/indexing.py", line 1028, in apply_indexer
    return indexable.oindex[indexer]
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/core/indexing.py", line 368, in __getitem__
    return self.getter(key)
  File "/Users/gmaze/miniconda3/envs/argopy-xarray-unpinned/lib/python3.10/site-packages/xarray/coding/variables.py", line 72, in _oindex_get
    return type(self)(self.array.oindex[key], self.func, self.dtype)
AttributeError: 'ScipyArrayWrapper' object has no attribute 'oindex'

It's an xarray.Dataset.where call that raises the error.
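To check quickly whether a given environment is affected, the fetch command above can be wrapped in a small probe. This is a minimal sketch: `hits_oindex_error` is a hypothetical helper, not part of argopy or xarray, and it only looks for the word 'oindex' in the error message.

```python
def hits_oindex_error(fetch):
    """Run `fetch()` and report whether it fails with the 'oindex' AttributeError."""
    try:
        fetch()
    except AttributeError as err:
        if "oindex" in str(err):
            return True
        raise  # a different AttributeError: do not hide it
    return False


# Usage (requires argopy and network access):
# import argopy
# hits_oindex_error(lambda: argopy.DataFetcher().float(6902746).to_xarray())
```

Any other exception is re-raised, so an unrelated failure (network, missing package) is not mistaken for this bug.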

gmaze commented 4 weeks ago

I added a nightly test to monitor this issue: https://github.com/euroargodev/argopy/actions/workflows/pytests-upstream-xarray.yml