pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0

Adding non-dimension string coordinate gives different interpolation results #8456

Open RogierFloors opened 11 months ago

RogierFloors commented 11 months ago

What happened?

When I interpolate a DataArray along y at positions obtained by taking the mean along the y dimension, the presence of a non-dimension string coordinate changes the result. I would expect this non-dimension coordinate to be passive.

What did you expect to happen?

I would expect the result to be the same whether or not the non-dimension coordinate string_coord is present. However, when string_coord is present I get:

<xarray.DataArray (x: 2, y: 2)>
array([[0. , 0.1],
       [0. , 0.1]])
Coordinates:
  * x             (x) int64 0 1
  * y             (x) float64 0.05 0.05
    string_coord  (x) <U21 '0' '1'

whereas after I comment out the line where I add the non-dimension string coordinate, I get the expected result:

<xarray.DataArray (x: 2)>
array([0.005, 0.005])
Coordinates:
    y        (x) float64 0.05 0.05
  * x        (x) int64 0 1
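
The expected value can be checked by hand with plain NumPy, independent of xarray. This is only a small sketch using the values from the example in this issue: the mean along y is 0.05 for each x, and linear interpolation between y=0 and y=1 at that position should give approximately 0.005. The names values, y, y_new and expected are illustrative and do not come from the report.

```python
import numpy as np

# Values from the example: rows are x, columns are y = 0 and y = 1.
values = np.array([[0.0, 0.1], [0.0, 0.1]])
y = np.array([0.0, 1.0])

# Mean along y for each x; this is the y position to interpolate at.
y_new = values.mean(axis=1)

# Linear interpolation along y, one row (one x) at a time.
expected = np.array(
    [np.interp(y_new[i], y, values[i]) for i in range(len(values))]
)
print(expected)
```

Both entries come out close to 0.005, which matches the result obtained without string_coord.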

Minimal Complete Verifiable Example

import numpy as np
import xarray as xr

data = np.array([[0.0, 0.1], [0.0, 0.1]])
data = xr.DataArray(data, coords=dict(x=[0, 1], y=[0, 1]))
# Add a non-dimension string coordinate along x (the line that changes the result).
data["string_coord"] = data.x.astype("str")
# Mean along y, one value per x; used as the new y positions.
data_mean = data.mean(dim="y")
interp_data = data.interp(y=data_mean, method="linear")
print(interp_data)
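
Until this is resolved, one possible workaround is to set the string coordinate aside before interpolating and reattach it afterwards. This is an untested-by-maintainers sketch, not a confirmed fix: it relies only on the observation above that the result is as expected when string_coord is absent, and it assumes xarray and SciPy are installed. The variable name numeric is hypothetical.

```python
import numpy as np
import xarray as xr

data = xr.DataArray(
    np.array([[0.0, 0.1], [0.0, 0.1]]),
    dims=("x", "y"),
    coords=dict(x=[0, 1], y=[0, 1]),
)
data["string_coord"] = data.x.astype("str")

# Assumed workaround: drop the string coordinate before interpolating.
numeric = data.drop_vars("string_coord")
interp_data = numeric.interp(y=numeric.mean(dim="y"), method="linear")

# Reattach the coordinate along x afterwards.
interp_data = interp_data.assign_coords(string_coord=data["string_coord"])
print(interp_data)
```

The interpolated values then have a single x dimension, and string_coord is carried along as a passive coordinate.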

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.11.0 | packaged by conda-forge | (main, Jan 14 2023, 12:27:40) [GCC 11.3.0]
python-bits: 64
OS: Linux
OS-release: 5.15.133.1-microsoft-standard-WSL2
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1

xarray: 2023.10.1
pandas: 2.1.2
numpy: 1.24.4
scipy: 1.11.3
netCDF4: 1.6.2
pydap: None
h5netcdf: 1.2.0
h5py: 3.8.0
Nio: None
zarr: None
cftime: 1.6.3
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.11.0
distributed: 2023.11.0
matplotlib: 3.8.1
cartopy: 0.22.0
seaborn: 0.13.0
numbagg: None
fsspec: 2023.10.0
cupy: None
pint: 0.22
sparse: None
flox: None
numpy_groupies: None
setuptools: 68.2.2
pip: 23.3.1
conda: 22.9.0
pytest: 7.4.3
mypy: None
IPython: 8.17.2
sphinx: 3.5.3

welcome[bot] commented 11 months ago

Thanks for opening your first issue here at xarray! Be sure to follow the issue template! If you have an idea for a solution, we would really welcome a Pull Request with proposed changes. See the Contributing Guide for more. It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better. Thank you!

dcherian commented 10 months ago

@Illviljan do you have time to take a look please?