pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0
3.62k stars 1.08k forks source link

interpolation of xarray do not preserve attributes #6423

Open lanougue opened 2 years ago

lanougue commented 2 years ago

What is your issue?

Hi all, Interpolation over an xarray variable do not preserve its attributes: Here is a minimal example

x=np.arange(100)
x=xr.DataArray(x, dims='x', coords={'x':x})
t=np.arange(40,60,0.5)
t=xr.DataArray(t, dims='t', coords={'t':t}, attrs={'units':'s'})
result = x.interp(x=t)

t coordinate in result do not have the t unit anymore

max-sixty commented 2 years ago

This does retain units on x:

In [9]: x=np.arange(100)
   ...: x=xr.DataArray(x, dims='x', coords={'x':x}, attrs={'units':'m'})
   ...: t=np.arange(40,60,0.5)
   ...: t=xr.DataArray(t, dims='t', coords={'t':t}, attrs={'units':'s'})
   ...: result = x.interp(x=t)

In [10]: result
Out[10]:
<xarray.DataArray (t: 40)>
array([40. , 40.5, 41. , 41.5, 42. , 42.5, 43. , 43.5, 44. , 44.5, 45. ,
       45.5, 46. , 46.5, 47. , 47.5, 48. , 48.5, 49. , 49.5, 50. , 50.5,
       51. , 51.5, 52. , 52.5, 53. , 53.5, 54. , 54.5, 55. , 55.5, 56. ,
       56.5, 57. , 57.5, 58. , 58.5, 59. , 59.5])
Coordinates:
    x        (t) float64 40.0 40.5 41.0 41.5 42.0 ... 57.5 58.0 58.5 59.0 59.5
  * t        (t) float64 40.0 40.5 41.0 41.5 42.0 ... 57.5 58.0 58.5 59.0 59.5
Attributes:
    units:    m

Which would you think we should retain?

headtr1ck commented 1 month ago

I would say that this is not a huge issue, but agree that it might be a nice feature to have.

Does it still happen with a current xarray?

lanougue commented 2 weeks ago

@max-sixty , in your example, what is kept is the attribute of the variable, not the attribute of the coordinate. A better explicit example is:

x = np.arange(100)
v = xr.DataArray(x, dims='x', coords={'x':x}, attrs={'units':'degree'})
v['x'].attrs.update({'units':'m'})
t = np.arange(40,60,0.5)
t = xr.DataArray(t, dims='t', coords={'t':t}, attrs={'units':'s'})
result = v.interp(x=t)

and result is:

xarray.DataArray

    t: 40

    array([40. , 40.5, 41. , 41.5, 42. , 42.5, 43. , 43.5, 44. , 44.5, 45. ,
           45.5, 46. , 46.5, 47. , 47.5, 48. , 48.5, 49. , 49.5, 50. , 50.5,
           51. , 51.5, 52. , 52.5, 53. , 53.5, 54. , 54.5, 55. , 55.5, 56. ,
           56.5, 57. , 57.5, 58. , 58.5, 59. , 59.5])

    Coordinates:
        x
        (t)
        float64
        40.0 40.5 41.0 ... 58.5 59.0 59.5

        units :
            s

        t
        (t)
        float64
        40.0 40.5 41.0 ... 58.5 59.0 59.5
    Indexes: (1)
    Attributes:

    units :
        degree

In the example above, we interp v defined over x coordinate (units being 'm') with t (units being 'm'). This, of course, do not have physical sense, but, from a numerical point of view, I think we should recover at least x or t unit in result. My example is even worse than expected since in result, we see that 't' looses its units and 'x' is now having unit of 't'...

keewis commented 2 weeks ago

I think that's to be expected: we interpolate x to the values of t, so if copying over the attributes does not make sense the entire operation does not make sense (or at least that should be the case for units, and is what we've been trying to enforce with pint-xarray – which reminds me to try and release that).

Also, the t in the result is the coordinate t from the variable t, which did not have any attrs to begin with.