ocraft opened this issue 8 months ago
Thanks for opening your first issue here at xarray! Be sure to follow the issue template! If you have an idea for a solution, we would really welcome a Pull Request with proposed changes. See the Contributing Guide for more. It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better. Thank you!
It's important to note that the example works on xarray==2024.2.0; the problem first appears in 2024.3.0.
Yes sorry about that.
@andersy005 we should probably roll back the changes to coding/*.py
and bundle them in the backends feature branch
FWIW I can't reproduce even when forcing it to write a netcdf3 file with engine="scipy"
I was able to reproduce the issue from a fresh environment:
mamba create -n test 'python=3.12' xarray scipy ipython distributed
In [6]: ds2 = xr.open_dataset('/tmp/test.nc')
In [7]: ds2.sel(y=[1])
Out[7]:
<xarray.Dataset> Size: 16B
Dimensions: (x: 2, y: 1)
Dimensions without coordinates: x, y
Data variables:
A (x, y) object 16B ...
In [8]: ds2.sel(y=[1]).to_netcdf('/tmp/ttest.nc')
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[8], line 1
----> 1 ds2.sel(y=[1]).to_netcdf('/tmp/ttest.nc')
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/dataset.py:2298, in Dataset.to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf)
2295 encoding = {}
2296 from xarray.backends.api import to_netcdf
-> 2298 return to_netcdf( # type: ignore # mypy cannot resolve the overloads:(
2299 self,
2300 path,
2301 mode=mode,
2302 format=format,
2303 group=group,
2304 engine=engine,
2305 encoding=encoding,
2306 unlimited_dims=unlimited_dims,
2307 compute=compute,
2308 multifile=False,
2309 invalid_netcdf=invalid_netcdf,
2310 )
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/backends/api.py:1339, in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf)
1334 # TODO: figure out how to refactor this logic (here and in save_mfdataset)
1335 # to avoid this mess of conditionals
1336 try:
1337 # TODO: allow this work (setting up the file for writing array data)
1338 # to be parallelized with dask
-> 1339 dump_to_store(
1340 dataset, store, writer, encoding=encoding, unlimited_dims=unlimited_dims
1341 )
1342 if autoclose:
1343 store.close()
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/backends/api.py:1386, in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
1383 if encoder:
1384 variables, attrs = encoder(variables, attrs)
-> 1386 store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/backends/common.py:393, in AbstractWritableDataStore.store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
390 if writer is None:
391 writer = ArrayWriter()
--> 393 variables, attributes = self.encode(variables, attributes)
395 self.set_attributes(attributes)
396 self.set_dimensions(variables, unlimited_dims=unlimited_dims)
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/backends/common.py:482, in WritableCFDataStore.encode(self, variables, attributes)
479 def encode(self, variables, attributes):
480 # All NetCDF files get CF encoded by default, without this attempting
481 # to write times, for example, would fail.
--> 482 variables, attributes = cf_encoder(variables, attributes)
483 variables = {k: self.encode_variable(v) for k, v in variables.items()}
484 attributes = {k: self.encode_attribute(v) for k, v in attributes.items()}
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/conventions.py:795, in cf_encoder(variables, attributes)
792 # add encoding for time bounds variables if present.
793 _update_bounds_encoding(variables)
--> 795 new_vars = {k: encode_cf_variable(v, name=k) for k, v in variables.items()}
797 # Remove attrs from bounds variables (issue #2921)
798 for var in new_vars.values():
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/conventions.py:196, in encode_cf_variable(var, needs_copy, name)
183 ensure_not_multiindex(var, name=name)
185 for coder in [
186 times.CFDatetimeCoder(),
187 times.CFTimedeltaCoder(),
(...)
194 variables.BooleanCoder(),
195 ]:
--> 196 var = coder.encode(var, name=name)
198 # TODO(kmuehlbauer): check if ensure_dtype_not_object can be moved to backends:
199 var = ensure_dtype_not_object(var, name=name)
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/coding/times.py:972, in CFDatetimeCoder.encode(self, variable, name)
970 def encode(self, variable: Variable, name: T_Name = None) -> Variable:
971 if np.issubdtype(
--> 972 variable.data.dtype, np.datetime64
973 ) or contains_cftime_datetimes(variable):
974 dims, data, attrs, encoding = unpack_for_encoding(variable)
976 units = encoding.pop("units", None)
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/variable.py:433, in Variable.data(self)
431 return self._data
432 elif isinstance(self._data, indexing.ExplicitlyIndexed):
--> 433 return self._data.get_duck_array()
434 else:
435 return self.values
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/indexing.py:809, in MemoryCachedArray.get_duck_array(self)
808 def get_duck_array(self):
--> 809 self._ensure_cached()
810 return self.array.get_duck_array()
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/indexing.py:803, in MemoryCachedArray._ensure_cached(self)
802 def _ensure_cached(self):
--> 803 self.array = as_indexable(self.array.get_duck_array())
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/indexing.py:760, in CopyOnWriteArray.get_duck_array(self)
759 def get_duck_array(self):
--> 760 return self.array.get_duck_array()
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/indexing.py:619, in LazilyIndexedArray.get_duck_array(self)
617 def get_duck_array(self):
618 if isinstance(self.array, ExplicitlyIndexedNDArrayMixin):
--> 619 array = apply_indexer(self.array, self.key)
620 else:
621 # If the array is not an ExplicitlyIndexedNDArrayMixin,
622 # it may wrap a BackendArray so use its __getitem__
623 array = self.array[self.key]
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/indexing.py:1000, in apply_indexer(indexable, indexer)
998 return indexable.vindex[indexer]
999 elif isinstance(indexer, OuterIndexer):
-> 1000 return indexable.oindex[indexer]
1001 else:
1002 return indexable[indexer]
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/indexing.py:342, in IndexCallable.__getitem__(self, key)
341 def __getitem__(self, key: Any) -> Any:
--> 342 return self.getter(key)
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/coding/variables.py:72, in _ElementwiseFunctionArray._oindex_get(self, key)
71 def _oindex_get(self, key):
---> 72 return type(self)(self.array.oindex[key], self.func, self.dtype)
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/core/indexing.py:342, in IndexCallable.__getitem__(self, key)
341 def __getitem__(self, key: Any) -> Any:
--> 342 return self.getter(key)
File ~/mambaforge/envs/test/lib/python3.12/site-packages/xarray/coding/strings.py:256, in StackedBytesArray._oindex_get(self, key)
255 def _oindex_get(self, key):
--> 256 return _numpy_char_to_bytes(self.array.oindex[key])
AttributeError: 'ScipyArrayWrapper' object has no attribute 'oindex'
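For context, the `.oindex` property in the final frames of the traceback exposes outer (orthogonal) indexing, which xarray's backend array wrappers are expected to provide; the error says `ScipyArrayWrapper` does not implement it in 2024.3.0. A rough stdlib-only toy (simplified names, not xarray's real classes) of the dispatch pattern visible in `IndexCallable.__getitem__` and `StackedBytesArray._oindex_get` above, showing how a wrapper missing the attribute surfaces as exactly this `AttributeError`:

```python
# Toy model of xarray's indexing dispatch -- illustrative only, not xarray's code.
class IndexCallable:
    """Mimics xarray.core.indexing.IndexCallable: obj.oindex[key] calls getter(key)."""
    def __init__(self, getter):
        self.getter = getter

    def __getitem__(self, key):
        return self.getter(key)


class GoodWrapper:
    """Backend wrapper that implements outer indexing via an .oindex property."""
    def __init__(self, data):
        self.data = data

    @property
    def oindex(self):
        return IndexCallable(lambda key: self.data[key])


class LegacyWrapper:
    """Backend wrapper without .oindex -- the role ScipyArrayWrapper plays here."""
    def __init__(self, data):
        self.data = data


def outer_select(wrapper, key):
    # Mirrors StackedBytesArray._oindex_get: assumes the wrapped array has .oindex.
    return wrapper.oindex[key]


print(outer_select(GoodWrapper([10, 20, 30]), 1))  # 20
try:
    outer_select(LegacyWrapper([10, 20, 30]), 1)
except AttributeError as e:
    print(e)  # 'LegacyWrapper' object has no attribute 'oindex'
```

Any code path that routes an `OuterIndexer` to such a wrapper fails the same way, which is why the error only shows up once `.sel(...)` has attached a lazy outer-indexing key to the variable.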
In [9]: xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.12.2 | packaged by conda-forge | (main, Feb 16 2024, 20:54:21) [Clang 16.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 23.4.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 2024.3.0
pandas: 2.2.1
numpy: 1.26.4
scipy: 1.13.0
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.4.1
distributed: 2024.4.1
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2024.3.1
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 69.2.0
pip: 24.0
conda: None
pytest: None
mypy: None
IPython: 8.22.2
sphinx: None
I tried "pip install xarray==0.20.1 scipy==1.7.1" and the error went away.
Having the same error over here.
In the meantime, installing netCDF4 should make things work. I'll work on fixing this.
Note that I can reproduce this error even with scipy 1.14.1 and netCDF4 1.7.1 in the environment: https://github.com/euroargodev/argopy/issues/390
What happened?
Exception
'ScipyArrayWrapper' object has no attribute 'oindex'
when trying to save a dataset into a netCDF file after selecting a subset from a dataset previously loaded from another netCDF file.
What did you expect to happen?
No response
Minimal Complete Verifiable Example
MVCE confirmation
Relevant log output
Anything else we need to know?
No response
Environment