boostorg / histogram

Fast multi-dimensional generalized histogram with convenient interface for C++14
Boost Software License 1.0
315 stars 73 forks source link

add interpolator functor #149

Open HDembinski opened 5 years ago

HDembinski commented 5 years ago

People often want to interpolate histograms. Interpolating profiles is even more useful. Interpolation should be done via a functor, so that the interpolation can have state.

auto h = ... // histogram
auto interp = interpolator(h, interpolator::linear);
HDembinski commented 5 years ago

State may be needed for spline interpolation.

NAThompson commented 5 years ago

@HDembinski : There are tons of splines in Boost.Math. Presumably one of any of the functions in boost/math/interpolators would do?

HDembinski commented 5 years ago

@NAThompson Good point, I see that there were several added in the last two releases. Catmull-Rom splines and the vector-valued barycentric rational interpolation are candidates. Other interpolators only work for 1D data and are not suitable here.

The interpolators should be added in a way that at least linear interpolation can be used without requiring Boost.Math as a dependency.

HDembinski commented 5 years ago

On closer look, it seems like neither works. Both interpolate a multi-dimensional point as a function of the 1D variable. A histogram is the opposite. The interpolated value is 1D and the dependent variable is an ND point. So none of the interpolators in Boost.Math can be used.

HDembinski commented 5 years ago

Basically, interpolation for Histograms is R^n -> R instead of R -> R^n.

NAThompson commented 5 years ago

R^n->R interpolation is more difficult and it's been in the back of my head to add it for some time. Do you have an arbitrary pointset in R^n or will a uniform grid on compact subsets of R^n be sufficient?

Also, what are reasonable values of n for this problem?

HDembinski commented 5 years ago

Depending on the axis type for the histogram, it can be uniform, but in general will be non-uniform. Reasonable values of n are 1 to 6, but can be arbitrarily large in principle.

HDembinski commented 5 years ago

I agree, R^n -> R is way more difficult. Only linear interpolation is easy in any dimension, so that's definitely the first algorithm that will be implemented. I spend a lot of time reading up on fast interpolation a few years back for another project. I can recommend the famous Numerical Recipes http://numerical.recipes on the subject. The references in the SciPy docs are also helpful. https://docs.scipy.org/doc/scipy/reference/interpolate.html

NAThompson commented 4 years ago

I think that the Makima spline might be the way to go for this:

https://blogs.mathworks.com/cleve/2019/04/29/makima-piecewise-cubic-interpolation/

HDembinski commented 4 years ago

Thank you for this interesting blog post, I didn't know about the pchip and the makima methods, which both look very interesting.

HDembinski commented 4 years ago

pchip is interesting, because we often do not want the interpolation to leave the value range of the data, so example, interpolated values in a histogram should not become less than zero.

NAThompson commented 4 years ago

The main reason this is a good method is that it has a natural generalization to higher dimensions; see Akima's 1974 paper: https://doi.org/10.1145/360767.360779

HDembinski commented 4 years ago

Yes, that is definitely interesting.