espdev / csaps

Cubic spline approximation (smoothing)
https://csaps.readthedocs.io
MIT License

Interpolate on irregular grid #69

Closed sroener closed 6 months ago

sroener commented 1 year ago

Hi,

As far as I understand, CSAPS can work with univariate and multivariate data (both with a one-dimensional x) and with ND-gridded data.

In my use case, I have an irregular grid with missing values in random positions. Is it possible to use CSAPS on that kind of data?

As an example, would CSAPS work on data generated like in the scipy.interpolate.griddata example?

If irregular grids don't work, would it be feasible to use NaNs to fill the grid?

espdev commented 1 year ago

Hello, @sroener

You probably need to first apply scipy.interpolate.griddata with the linear or nearest method to your irregular data, and then use csaps to smooth the interpolated data on the regular grid (the same grid that you pass to griddata).

sroener commented 1 year ago

Hi @espdev ,

thanks for the quick reply!

I might try that. Can you give me some guidance on how to handle "extrapolated" values?

From my understanding, griddata fills the regular grid with interpolated values inside the boundaries of known values and with a fixed value (NaN by default) outside the boundaries. Is csaps able to handle this/ignore NaN values?

If not, would it be possible to create a weight mask for the gridded data, setting all NaNs to a specific value (e.g. 0) and reducing their weights to 0? Can this be done by providing the weights as vectors with a shape similar to X?

Edit: Is weighting only possible for a whole vector and not for a single value?

espdev commented 1 year ago

Currently csaps cannot work with NaN values. If the input data contains NaN values, all values in the output data will be NaN. csaps can only extrapolate values outside the regular grid; it cannot fill NaN values inside the grid.

You can use the nearest interpolation method in the griddata function to produce data without NaN values, or use fill_value to fill the NaN values in the interpolated data. I understand that using fill_value is not very flexible and may not be applicable in some cases.
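As a sketch of how the NaN values could be removed before smoothing, the snippet below interpolates linearly and then falls back to nearest-neighbour values wherever the linear result is NaN. Combining the two methods this way is an assumption made for illustration, not something csaps or griddata does on its own. The data and grid are synthetic.

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(1)

# Scattered samples that deliberately do not reach the edges of the grid.
points = rng.uniform(0.2, 0.8, size=(200, 2))
values = points[:, 0] + points[:, 1]

x = np.linspace(0.0, 1.0, 30)
y = np.linspace(0.0, 1.0, 30)
xx, yy = np.meshgrid(x, y, indexing="ij")

# Linear interpolation yields NaN outside the convex hull of the samples.
z_linear = griddata(points, values, (xx, yy), method="linear")

# Nearest-neighbour interpolation is defined everywhere on the grid.
z_nearest = griddata(points, values, (xx, yy), method="nearest")

# Keep the linear values where they exist, use nearest values elsewhere.
outside = np.isnan(z_linear)
z_filled = np.where(outside, z_nearest, z_linear)

# Alternative from the comment above: a constant fill, e.g.
# griddata(points, values, (xx, yy), method="linear", fill_value=0.0)
```

The resulting `z_filled` array contains no NaN values and can be passed to csaps together with the `x` and `y` axes.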

sroener commented 1 year ago

Ok, thanks for your explanation.

I still have a hard time understanding how the weights work and why it is not possible to attach weights to individual data points.

You mention in #24 that for the ND gridded interpolation, the algorithm performs univariate smoothing using the weight vector. In the tutorial, you show an example of weighting a univariate interpolation using weights for individual points.

Wouldn't it be possible to provide the weights as an ND-array of the same shape as y, look up the corresponding weight vectors in the same way as for Y, and fit the respective univariate splines with them? Everything downstream of that should still work as before.

espdev commented 1 year ago

In my comment I provide a code fragment showing the use of weight vectors in ND-grid smoothing. We cannot use a weight surface because we apply the weights to univariate splines along each dimension.

You can try changing the algorithm for the ND-grid case to make it possible to use a weight surface instead of weight vectors. PRs are always welcome. :)

sroener commented 1 year ago

Thanks for the feedback. Might actually try that.

Do you have a preferred way of reaching out if questions regarding the code arise?

espdev commented 1 year ago

> Do you have a preferred way of reaching out if questions regarding the code arise?

You can just ask here in this issue.