pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0
3.59k stars 1.08k forks source link

Interpolation with multiple mutlidimensional arrays sharing dims fails #4463

Open aulemahal opened 4 years ago

aulemahal commented 4 years ago

What happened: When trying to interpolate a N-D array with 2 other arrays sharing a common (new) dimension and with one (at least) being multidimensional fails. Kinda a complex edge case I agree. Here's a MWE:

da = xr.DataArray([[[1, 2, 3], [2, 3, 4]], [[1, 2, 3], [2, 3, 4]]], dims=('t', 'x', 'y'), coords={'x': [1, 2], 'y': [1, 2, 3], 't': [10, 12]})
dy = xr.DataArray([1.5, 2.5], dims=('u',), coords={'u': [45, 55]})
dx = xr.DataArray([[1.5, 1.5], [1.5, 1.5]], dims=('t', 'u'), coords={'u': [45, 55], 't': [10, 12]})

So we have da a 3D array with dims (t, x, y). We have dy, containing the values of y along new dimension u. And dx containing the values of x along both u and t. We want to interpolate with:

out = da.interp(y=dy, x=dx, method='linear')

As so to have a new array over dims t and u.

What you expected to happen: I expected (with the dummy data I gave):

xr.DataArray([[2, 3], [2, 3]], dims=('t', 'u'), coords={'u': [45, 55], 't': [10, 12]})

But instead it fails with ValueError: axes don't match array.

Full traceback:

```python3 --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in ----> 1 a.interp(y=y, x=x, method='linear') ~/Python/xarray/xarray/core/dataarray.py in interp(self, coords, method, assume_sorted, kwargs, **coords_kwargs) 1473 "Given {}.".format(self.dtype) 1474 ) -> 1475 ds = self._to_temp_dataset().interp( 1476 coords, 1477 method=method, ~/Python/xarray/xarray/core/dataset.py in interp(self, coords, method, assume_sorted, kwargs, **coords_kwargs) 2691 if k in var.dims 2692 } -> 2693 variables[name] = missing.interp(var, var_indexers, method, **kwargs) 2694 elif all(d not in indexers for d in var.dims): 2695 # keep unrelated object array ~/Python/xarray/xarray/core/missing.py in interp(var, indexes_coords, method, **kwargs) 652 else: 653 out_dims.add(d) --> 654 result = result.transpose(*tuple(out_dims)) 655 return result 656 ~/Python/xarray/xarray/core/variable.py in transpose(self, *dims) 1395 return self.copy(deep=False) 1396 -> 1397 data = as_indexable(self._data).transpose(axes) 1398 return type(self)(dims, data, self._attrs, self._encoding, fastpath=True) 1399 ~/Python/xarray/xarray/core/indexing.py in transpose(self, order) 1288 1289 def transpose(self, order): -> 1290 return self.array.transpose(order) 1291 1292 def __getitem__(self, key): ValueError: axes don't match array ```

Anything else we need to know?: It works if dx doesn't vary along t. I .e.: da.interp(y=dy, x=dx.isel(t=0, drop=True), method='linear') works.

Environment:

Output of xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.8.5 | packaged by conda-forge | (default, Jul 31 2020, 02:39:48) [GCC 7.5.0] python-bits: 64 OS: Linux OS-release: 3.10.0-514.2.2.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_CA.UTF-8 LOCALE: en_CA.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.7.4 xarray: 0.16.2.dev9+gc0399d3 pandas: 1.0.3 numpy: 1.18.4 scipy: 1.4.1 netCDF4: 1.5.3 pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: None cftime: 1.1.3 nc_time_axis: 1.2.0 PseudoNetCDF: None rasterio: 1.1.4 cfgrib: None iris: None bottleneck: 1.3.2 dask: 2.17.2 distributed: 2.23.0 matplotlib: 3.3.1 cartopy: 0.18.0 seaborn: None numbagg: None pint: 0.11 setuptools: 49.6.0.post20200814 pip: 20.2.2 conda: None pytest: None IPython: 7.17.0 sphinx: None
fujiisoup commented 4 years ago

Hi @aulemahal

I think you want to interpolate along tas well as x, and y. If so, you can do

In [8]: da.interp(t=dx['t'], y=dy, x=dx, method='linear')                       
Out[8]: 
<xarray.DataArray (t: 2, u: 2)>
array([[2., 3.],
       [2., 3.]])
Coordinates:
  * t        (t) int64 10 12
    y        (u) float64 1.5 2.5
    x        (t, u) float64 1.5 1.5 1.5 1.5
  * u        (u) int64 45 55

If not, this fails as dx['t'] and da['t'] do not match each other. The error message can be improved. A contribution is welcome ;)

shoyer commented 4 years ago

I think this gives the correct result?

>>> da.interp(t=da['t'], y=dy, x=dx, method='linear')
<xarray.DataArray (t: 2, u: 2)>
array([[2., 3.],
       [2., 3.]])
Coordinates:
  * t        (t) int64 10 12
    y        (u) float64 1.5 2.5
    x        (t, u) float64 1.5 1.5 1.5 1.5
  * u        (u) int64 45 55

My general thought is that if an axis (like t in this case) is omitted, then should be equivalent to indexing with the existing coordinate. That is how normal indexing in xarray worked, so interpolation should work the same.

aulemahal commented 4 years ago

Oh! I should have thought of this. The error message is indeed unclear... I'll close the issue as the problem is solved, but I'll note that:

da.sel(y=dy, x=dx, method='nearest')

does work, without the need to explicitly pass the shared dimension t. This is why I expected that interp would work the same.

dcherian commented 4 years ago

The error message is indeed unclear.

reopening so that we can fix this.

shoyer commented 4 years ago

I'll note that:

da.sel(y=dy, x=dx, method='nearest')

does work, without the need to explicitly pass the shared dimension t. This is why I expected that interp would work the same.

I think interp should work in this case, too!

Let's keep this issue open to track that.