pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0

Multidimensional interpolate_na not working #9392

Open f930139 opened 2 months ago

f930139 commented 2 months ago

What is your issue?

Hi everyone. Following the xarray documentation, I tried to fill NaNs by interpolation using a multidimensional interpolate_na call:

u_interp = u.interpolate_na(dim=None, method='linear')

but this fails with NotImplementedError: dim is a required argument. I also tried

u_interp = u.interpolate_na(dim=['level', 'lat', 'lon'], method='linear')

but this fails with KeyError: ['level', 'lat', 'lon'].

The xarray versions I have tried are: v0.19.0 (Jul 2021), v2023.6.0, v2024.5.0

Could anyone provide some thoughts on this? Thank you!

keewis commented 2 months ago

You can only interpolate along a single dimension with that method: u.interpolate_na(dim="level", method="linear") should work. If you absolutely need multi-dimensional missing-value interpolation on a geoscience dataset, you might want to try pyinterp.
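
A minimal sketch of the single-dimension usage described above, on a made-up 2-D array (the values, the dimension names `lat`/`lon`, and the variable names `along_lon`/`along_both` are illustrative assumptions, not taken from the issue). It also shows that passes along different dimensions can be chained. This is a sequence of independent 1-D interpolations, not a true multidimensional interpolation, and the result can depend on the order of the passes.

```python
import numpy as np
import xarray as xr

# Toy 2-D field with two gaps (illustrative values only).
u = xr.DataArray(
    [[1.0, np.nan, 3.0],
     [np.nan, 5.0, 6.0],
     [7.0, 8.0, 9.0]],
    dims=("lat", "lon"),
    coords={"lat": [10.0, 20.0, 30.0], "lon": [0.0, 1.0, 2.0]},
)

# One pass along "lon": fills the interior gap in the first row.
# The leading NaN in the second row has no neighbor on its left,
# so linear interpolation leaves it as NaN.
along_lon = u.interpolate_na(dim="lon", method="linear")

# A second, separate pass along "lat" fills the remaining gap.
along_both = along_lon.interpolate_na(dim="lat", method="linear")
```

Coordinates along the interpolated dimension are kept monotonically increasing here, since interpolate_na uses them as the x-values by default.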

f930139 commented 2 months ago

Hi keewis, Thanks for your quick response.

According to the xarray interpolate_na documentation (https://docs.xarray.dev/en/stable/generated/xarray.DataArray.interpolate_na.html), dim is optional and can be None. That is the reason for my question. If dim=None raises an error, why does the documentation say this?

keewis commented 2 months ago

Right, that looks like a documentation bug, and we might want to consider making dim a required parameter.

f930139 commented 2 months ago

I see. Thank you!

mchoblet commented 3 days ago

I encountered a similar problem. I often want to extrapolate NaNs at the border of a domain (nearest neighbor). While interpolate_na works in one dimension, I was wondering whether scipy.interpolate.NearestNDInterpolator, which looks up the nearest neighbor in higher dimensions, could be added to xarray's interpolate_na.

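To use it with DataArrays, one currently has to do something like this:

    import numpy as np
    from scipy.interpolate import NearestNDInterpolator

    data = da.values  # plain NumPy array; da[mask] on a DataArray would index orthogonally
    mask = np.where(~np.isnan(data))  # index arrays of the valid (non-NaN) points
    interp = NearestNDInterpolator(np.transpose(mask), data[mask])
    filled_data = interp(*np.indices(data.shape))  # nearest valid value at every grid index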

(Sources: https://stackoverflow.com/questions/68197762/fill-nan-with-nearest-neighbor-in-numpy-array/68197821#68197821, https://stackoverflow.com/questions/65300732/how-to-fill-nan-with-nearest-non-nan-value-in-2-dimensions)
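
As a rough sketch of how the approach above could be wrapped so that the result stays a DataArray: the helper name `fill_nearest_nd` is hypothetical and not part of xarray. Like the snippet above, it measures distance in index space rather than in coordinate units, which is an assumption that only makes sense for roughly regular grids.

```python
import numpy as np
import xarray as xr
from scipy.interpolate import NearestNDInterpolator


def fill_nearest_nd(da: xr.DataArray) -> xr.DataArray:
    """Fill NaNs with the nearest valid value, measured in index space.

    Hypothetical helper for illustration; not an xarray API.
    """
    data = da.values
    valid = ~np.isnan(data)
    # Nothing to fill, or nothing to fill from.
    if valid.all() or not valid.any():
        return da.copy()

    # Index positions of valid points, shape (n_valid, ndim).
    points = np.argwhere(valid)
    interp = NearestNDInterpolator(points, data[valid])

    # Query only the missing positions and write them into a copy.
    filled = data.copy()
    filled[~valid] = interp(np.argwhere(~valid))

    # Keep dims, coords and attrs of the input.
    return da.copy(data=filled)


# Example: NaNs along the border of a small 2-D domain.
da = xr.DataArray(
    [[np.nan, np.nan, np.nan],
     [np.nan, 1.0, 2.0],
     [np.nan, 3.0, 4.0]],
    dims=("lat", "lon"),
)
filled = fill_nearest_nd(da)
```

Only the missing cells are queried, so the original valid values are passed through unchanged.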