JuliaMath / Interpolations.jl

Fast, continuous interpolation of discrete datasets in Julia
http://juliamath.github.io/Interpolations.jl/
Other
523 stars 110 forks source link

Incorporation of functionality from GridInterpolations.jl #28

Open mykelk opened 9 years ago

mykelk commented 9 years ago

GridInterpolations.jl provides multivariate interpolation on a rectilinear grid with support for both multilinear and simplex interpolation. My lab does interpolation in an inner-inner loop for a dynamic programming application, and so quite a bit of effort was put into making it fast. It would be neat if this functionality can be migrated to Interpolations.jl without compromising on speed or the simplicity of the interface.

tomasaschan commented 9 years ago

Hi @mykelk! I'm sorry I haven't replied to this. I'm sure I at least started writing a reply to this somewhere, but now I can't find it and it obviously didn't get posted.

I tried to skim the paper you referred to from the readme, as well as one of its references, to better understand the math behind your implementation technique, but it was a little too opaque for me to get it without spending more time on it. I hope to be able to do so over the weekend, but if you have easy-to-digest resources that would make this a little easier they would be most welcome.

I've also taken a quick look at the code, and while I do appreciate the effort to make everything really fast, I feel that a little too much of the generic nature of Julia's type system has been sacrificed in the process. I think we could preserve most of, if not all, the speed while making the code much more general, with quite a small effort - it would mostly be stuff like removing type arguments from function definitions to allow other types of data than Float64.

I would also like to harmonize the API in your code with the API in Interpolations.jl before we merge them into a single package. This might require some changes to Interpolations.jl to make room for your functionality, but that's completely OK with me. Basically, I think the following is probably a good pattern to follow:

itp = interpolate(data, interpolation_type_definition)

where data is the array of y/z coordinates in 1/2D (and corresponding quantity for higher dimensions) and interpolation_type_definition is a type argument that defines the type of interpolation that should be performed (for the existing linear B-spline interpolation it could be e.g. BSpline(Linear(OnGrid()))). We can then wrap this in another method/type to scale the coordinate axes, and maybe add a separate argument for extrapolation.

What do you think?

mykelk commented 9 years ago
  1. Yes, I agree that the references are pretty hard to understand and implement. It actually took quite a bit of effort to figure out how to implement simplex interpolation. The output is similar to that produced by scipy, but our implementation is much faster and does not rely on the qhull library. @etotheipluspi can provide additional details.
  2. I agree that it would be nice to make it more generic and support data other than Float64 (without a performance penalty).
  3. I like your API. It might be nice to also support something like what we have in GridInterpolations.jl for non-uniform gridpoints.
  4. Another thing that is nice about GridInterpolations.jl is that you can quickly interpolate over new data. For our applications (dynamic programming for Markov decision processes) the data gets updated incrementally in a tight loop based on interpolations.
  5. I really like how you handle extrapolation.
tomasaschan commented 9 years ago
  1. After the fact, I didn't have time to do more reading on this. We'll see when I get to it :)
  2. As I said, I don't think it would take a huge effort - probably leaving out most of the type specs on function arguments will be a good start.
  3. I hope to support something like e.g. interpolate(x, y, data, interp_type) for 2D data, where the arguments before the interpolation type are N vectors or ranges of coordinates, and data<:Array{T,N}. This should be able to handle the non-uniform gridpoints scenario too. However, for any uniformly spaced set of coordinates (i.e. where all steps in a dimension are of equal length, even if steps in different dimensions are of different lenghts), it is quite trivial to re-scale the coordinate for interpolation outside of the actual interpolation code (like CoordInterpGrid in Grid.jl does today), which helps keeping the interpolation code simpler.
  4. This is a cool feature I wasn't aware of, but it's actually very easy to get that in Interpolations.jl as well, at least for constant and linear interpolation (which require no preprocessing, and thus won't suffer any performance loss if the data is changed). Today, we explicitly copy the input data to avoid that effect, but we could easily provide a function interpolate!(data, interp_type) that skips that copy operation and thus provides the same feature. (In fact, we might want to do that anyway, to allow for memory re-use for people who don't need to use the original array afterwards...)
  5. Thank you! =) I'm thinking that it could be done even more generally outside of the interpolation routines entirely; with a wrapper type similar to CoordInterpGrid we could separate extrapolation from interpolation which would make it a lot easier to re-use the extrapolation code for various types of interpolation schemes. It won't be implemented anytime soon (there are more pressing stuff to get to first) but I'll definitely make a mental note about it.
tomasaschan commented 9 years ago

Actually, now that I think about it, I think that your updating of the grid and a possible interpolate! function that works directly on the data will do different things - you seem to update the coordinates of the data points while keeping the data itself unchanged, while interpolate! would only react to changes on the data itself (since the coordinates are implicitly 1:size(data,dim)). Is this understanding correct?

mykelk commented 8 years ago

@tlycken, sorry, I just realized I didn't respond to your question. We change the data but keep the knots constant.

tomasaschan commented 8 years ago

@mykelk Check - that makes it quite intuitive to make it work as interpolate! currently does for constant and linear b-splines. (It won't work for b-splines of cubic or higher order, because of the need for prefiltering the data before evaluation).

Quite a lot has happened in this package since we had this discussion - for example, scaling (of uniform grids) and extrapolation has both been implemented as wrappers such as the ones I suggested above. There's also a Gridded interpolation type that solves basically the same problem as GridInterpolations.jl (I think) but in a different way (and maybe with stronger constraints; Gridded requires the knots to be on a rectangular grids, I don't know if you require that in GridInterpolations.jl). If you have a dataset where it makes sense to test, it would be interesting to see a benchmark between the two approaches.

EDIT: Of course, a PR which migrates functionality into Interpolations.jl is always welcome :)

timholy commented 8 years ago

This figure from the README suggests an advantage for Interpolations (I believe GI=GridInterpolations).

tomasaschan commented 8 years ago

This figure from the README suggests an advantage for Interpolations (I believe GI=GridInterpolations).

Ah, great! I wasn't aware that we already had these included in our benchmarks. @timholy is ahead of me as usual :)

@mykelk Given that benchmark, which suggests that Interpolations.jl already surpasses GridInterpolations, do you feel that there is still a reason to try to work your code into this library? If not, perhaps it's time to close this issue.

mykelk commented 8 years ago

@tlycken I think the only piece that is missing is simplex interpolation on a grid. To find the interpolated value at a particular point, regular multi-linear interpolation in d dimensions requires looking at 2^d values in your grid---but simplex only requires d + 1. I think it would involve adding a "Simplexed" type, kind of similar to "Gridded":

A = rand(8,20)
knots = ([x^2 for x = 1:8], [0.2y for y = 1:20])
itp = interpolate(knots, A, Simplexed(Linear()))
itp[4,1.2]
tomasaschan commented 8 years ago

Yes, that does seem valuable! A PR is most definitely welcome :)