Open mathisc opened 3 years ago
Thanks @mathisc we use scipy by default and that's what it does...
In [6]: from scipy.interpolate import interp1d
In [7]: f = sp.interpolate.interp1d([0, 1, 2], [np.nan, 0, 1])
In [8]: f([0, 1, 2])
Out[8]: array([nan, nan, 1.])
Can you file an issue over at scipy?
EDIT: I guess we could skip interpolating when output coordinate values are a subset of the input coordinate values.
There does not seem a way to force the numpy interpolator for the 1D case:
(that could be a work around)
Thanks, Here is the ticket on Scipy's bug tracker : https://github.com/scipy/scipy/issues/14531
FWI this is what I ended up doing (which feels like work in both cases, interpolation needed or not):
interpolated_ds = ds.interp(time=ds.times)
reindexed_ds = ds.reindex(time=ds.times)
ds = interpolated_ds.fillna(reindexed_ds)
It is probably not the most efficient in terms of operation but it gets the job done.
Thanks for your feedback ;)
Interestingly this was fixed upstream according to the linked issue above, but the behavior persists in Xarray:
ds.interp(time=ds.time)
# array([[nan, nan, 1.]]) # (unexpected)
ds.interpolate_na(time=ds.time)
# array([nan, 0., 1.])
from scipy.interpolate import interp1d
f = interp1d([0, 1, 2], [np.nan, 0, 1])
f([0, 1, 2])
# array([nan, 0., 1.])
xarray: 2024.2.0
pandas: 2.2.1
numpy: 1.26.3
scipy: 1.12.0
Hey all, When running
dataset.interp(time=dataset.time)
fills withnp.nan
if one of the neighbor is anp.nan
even when interpolation is not actually needed.Here is the sample code to reproduce the issue :
Please note that numpy does not have the same behavior :
Is that an intended behaviour for xarray? If so, does this mean that I first have to check if an interpolation is needed instead of doing it no matter what (and use
reindex
instead ofinterp
if it is not needed) ?(this will be kind of tricky if interpolation is needed for certain values and some not...)
Thanks for your help ;)
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS ------------------ commit: None python: 3.8.5 (default, Jul 28 2020, 12:59:40) [GCC 9.3.0] python-bits: 64 OS: Linux OS-release: 5.8.0-7642-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.0 libnetcdf: 4.7.4 xarray: 0.18.2 pandas: 1.2.4 numpy: 1.19.4 scipy: 1.6.0 netCDF4: 1.5.6 pydap: None h5netcdf: 0.8.1 h5py: 3.1.0 Nio: None zarr: None cftime: 1.3.0 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2021.01.0 distributed: 2021.01.0 matplotlib: 3.4.2 cartopy: None seaborn: None numbagg: None pint: None setuptools: 57.4.0 pip: 20.2.4 conda: None pytest: None IPython: 7.19.0 sphinx: None