JuliaMath / Interpolations.jl

Fast, continuous interpolation of discrete datasets in Julia
http://juliamath.github.io/Interpolations.jl/
Other
523 stars 111 forks source link

why is BSpline() only working for regular grids? #131

Open floswald opened 7 years ago

floswald commented 7 years ago

I was wondering how hard it would be to support irregular grids for BSpline interpolation. I am aware that you have Gridded for this case, however, I often end up missing features of Gridded which BSpline supports. for example, I can't linearly extrapolate a 2D gridded interpolation, whereas I can do that with a BSpline(Quadratic(Free())). For many of my applications, irregular grids are necessary. If performance is worse on irregular grids, so be it.

julia> xmax, ymax = 8,8
(8,8)

julia> g(x, y) = (x^2 + 3x - 8) * (-2y^2 + y + 1)
g (generic function with 1 method)

# extrapolate from a 2D BSpline on regular grid

itp2g = interpolate(Float64[g(x,y) for x in 1:xmax, y in 1:ymax], (BSpline(Quadratic(Free())), BSpline(Linear())), OnGrid())
8×8 Interpolations.BSplineInterpolation{Float64,2,Array{Float64,2},Tuple{Interpolations.BSpline{Interpolations.Quadratic{Interpolations.Free}},Interpolations.BSpline{Interpolations.Linear}},Interpolations.OnGrid,(1,0)}:
 0.0    20.0     56.0    108.0    176.0    260.0    360.0    476.0
 0.0   -10.0    -28.0    -54.0    -88.0   -130.0   -180.0   -238.0
 0.0   -50.0   -140.0   -270.0   -440.0   -650.0   -900.0  -1190.0
 0.0  -100.0   -280.0   -540.0   -880.0  -1300.0  -1800.0  -2380.0
 0.0  -160.0   -448.0   -864.0  -1408.0  -2080.0  -2880.0  -3808.0
 0.0  -230.0   -644.0  -1242.0  -2024.0  -2990.0  -4140.0  -5474.0
 0.0  -310.0   -868.0  -1674.0  -2728.0  -4030.0  -5580.0  -7378.0
 0.0  -400.0  -1120.0  -2160.0  -3520.0  -5200.0  -7200.0  -9520.0

julia> ext = extrapolate(itp2g,(Linear(),Flat()))
8×8 Interpolations.Extrapolation{Float64,2,Interpolations.BSplineInterpolation{Float64,2,Array{Float64,2},Tuple{Interpolations.BSpline{Interpolations.Quadratic{Interpolations.Free}},Interpolations.BSpline{Interpolations.Linear}},Interpolations.OnGrid,(1,0)},Tuple{Interpolations.BSpline{Interpolations.Quadratic{Interpolations.Free}},Interpolations.BSpline{Interpolations.Linear}},Interpolations.OnGrid,Tuple{Interpolations.Linear,Interpolations.Flat}}:
 0.0    20.0     56.0    108.0    176.0    260.0    360.0    476.0
 0.0   -10.0    -28.0    -54.0    -88.0   -130.0   -180.0   -238.0
 0.0   -50.0   -140.0   -270.0   -440.0   -650.0   -900.0  -1190.0
 0.0  -100.0   -280.0   -540.0   -880.0  -1300.0  -1800.0  -2380.0
 0.0  -160.0   -448.0   -864.0  -1408.0  -2080.0  -2880.0  -3808.0
 0.0  -230.0   -644.0  -1242.0  -2024.0  -2990.0  -4140.0  -5474.0
 0.0  -310.0   -868.0  -1674.0  -2728.0  -4030.0  -5580.0  -7378.0
 0.0  -400.0  -1120.0  -2160.0  -3520.0  -5200.0  -7200.0  -9520.0

julia> ext[8,9]
-9520.000000000002

julia> ext[9,8]
-11781.0

# extrapolate from a 2D gridded interpolation (misses gradient implementation

itp2g = interpolate((1:xmax,1:ymax),Float64[g(x,y) for x in 1:xmax, y in 1:ymax], (Gridded(Linear()),Gridded(Linear())))
8×8 Interpolations.GriddedInterpolation{Float64,2,Float64,Tuple{Interpolations.Gridded{Interpolations.Linear},Interpolations.Gridded{Interpolations.Linear}},Tuple{Array{Int64,1},Array{Int64,1}},0}:
 0.0    20.0     56.0    108.0    176.0    260.0    360.0    476.0
 0.0   -10.0    -28.0    -54.0    -88.0   -130.0   -180.0   -238.0
 0.0   -50.0   -140.0   -270.0   -440.0   -650.0   -900.0  -1190.0
 0.0  -100.0   -280.0   -540.0   -880.0  -1300.0  -1800.0  -2380.0
 0.0  -160.0   -448.0   -864.0  -1408.0  -2080.0  -2880.0  -3808.0
 0.0  -230.0   -644.0  -1242.0  -2024.0  -2990.0  -4140.0  -5474.0
 0.0  -310.0   -868.0  -1674.0  -2728.0  -4030.0  -5580.0  -7378.0
 0.0  -400.0  -1120.0  -2160.0  -3520.0  -5200.0  -7200.0  -9520.0

julia> e = extrapolate(itp2g,(Linear(),Linear())
       )
WARNING: imported binding for e overwritten in module Main
8×8 Interpolations.Extrapolation{Float64,2,Interpolations.GriddedInterpolation{Float64,2,Float64,Tuple{Interpolations.Gridded{Interpolations.Linear},Interpolations.Gridded{Interpolations.Linear}},Tuple{Array{Int64,1},Array{Int64,1}},0},Tuple{Interpolations.Gridded{Interpolations.Linear},Interpolations.Gridded{Interpolations.Linear}},Interpolations.OnGrid,Tuple{Interpolations.Linear,Interpolations.Linear}}:
 0.0    20.0     56.0    108.0    176.0    260.0    360.0    476.0
 0.0   -10.0    -28.0    -54.0    -88.0   -130.0   -180.0   -238.0
 0.0   -50.0   -140.0   -270.0   -440.0   -650.0   -900.0  -1190.0
 0.0  -100.0   -280.0   -540.0   -880.0  -1300.0  -1800.0  -2380.0
 0.0  -160.0   -448.0   -864.0  -1408.0  -2080.0  -2880.0  -3808.0
 0.0  -230.0   -644.0  -1242.0  -2024.0  -2990.0  -4140.0  -5474.0
 0.0  -310.0   -868.0  -1674.0  -2728.0  -4030.0  -5580.0  -7378.0
 0.0  -400.0  -1120.0  -2160.0  -3520.0  -5200.0  -7200.0  -9520.0

julia> e[-1,1]
ERROR: MethodError: no method matching gradient_coefficients(::Type{Tuple{Interpolations.Gridded{Interpolations.Linear},Interpolations.Gridded{Interpolations.Linear}}}, ::Int64, ::Int64)

# create an invalid BSpline (on irregular grid)

itp2g = interpolate(Float64[g(x,y) for x in vcat(0.0,linspace(4.0,xmax,xmax-1)), y in 1:ymax], (BSpline(Quadratic(Free())), BSpline(Linear())), OnGrid())
8×8 Interpolations.BSplineInterpolation{Float64,2,Array{Float64,2},Tuple{Interpolations.BSpline{Interpolations.Quadratic{Interpolations.Free}},Interpolations.BSpline{Interpolations.Linear}},Interpolations.OnGrid,(1,0)}:
 0.0    40.0      112.0      216.0    352.0     520.0     720.0    952.0 
 0.0  -100.0     -280.0     -540.0   -880.0   -1300.0   -1800.0  -2380.0 
 0.0  -138.889   -388.889   -750.0  -1222.22  -1805.56  -2500.0  -3305.56
 0.0  -182.222   -510.222   -984.0  -1603.56  -2368.89  -3280.0  -4336.89
 0.0  -230.0     -644.0    -1242.0  -2024.0   -2990.0   -4140.0  -5474.0 
 0.0  -282.222   -790.222  -1524.0  -2483.56  -3668.89  -5080.0  -6716.89
 0.0  -338.889   -948.889  -1830.0  -2982.22  -4405.56  -6100.0  -8065.56
 0.0  -400.0    -1120.0    -2160.0  -3520.0   -5200.0   -7200.0  -9520.0 

julia> vcat(0.0,linspace(4.0,xmax,xmax-1))
8-element Array{Float64,1}:
 0.0    
 4.0    
 4.66667
 5.33333
 6.0    
 6.66667
 7.33333
 8.0    

julia> itp2g[1,1]
0.0

julia> itp2g[8,2]
-400.0
sglyon commented 7 years ago

Hey @floswald I would also really like this, but it would require fairly substantial changes.

The theoretical derivation of the equations for the BSpline family of routines rely heavily on the assumption that the grids are uniformly spaced with spacing equal 1. We (@timholy and @tlycken) have re-derived the general form of these equations with irregular grids, but only for constant and linear interpolation cases. In order to implement higher order interpolation on irregular grids someone would need to derive the filtering formulas and then implement them.

(if anyone knows that I've just given bad information please correct me)

timholy commented 7 years ago

Yes, that's right: someone who wants this will have to sit down and work out the mathematics.

wupeifan commented 7 years ago

Here's a systematic solution for this, with symbolic lib derived math and (bad) Julia implementation: Irregular grid + Any order BSplines. I put the code for one dimension here, but multi dimension is just a tensor product for BSplines right? I translated my code in Matlab, and from my previous experience related to my own work, these functions work smoothly. It might help, or might not, but if you guys are interested to put it in the library, I'd be happy to help.

module poly_BSplines

using SymEngine CDenseMatrix = SymEngine.CDenseMatrix

export GenBSplines, EvalBSplines

function GenBSplines(order, grid) grid_len = length(grid) x = symbols("x") ext_grid = [grid[1] ones(order); grid; grid[end] ones(order)] K = length(ext_grid) int_coef = CDenseMatrix(zeros(K, K)) int_coef[1:K - 1, 1:K - 1] = CDenseMatrix(eye(K - 1)) for i in 1:order int_coef_next = CDenseMatrix(zeros(K, K)) for j in 1:K - i - 1 if (ext_grid[j + i] != ext_grid[j]) int_coef_next[j, :] = (x - ext_grid[j]) / (ext_grid[j + i] - ext_grid[j]) int_coef[j, :] end if (ext_grid[j + i + 1] != ext_grid[j + 1]) int_coef_next[j, :] = int_coef_next[j, :] + (ext_grid[j + i + 1] - x) / (ext_grid[j + i + 1] - ext_grid[j + 1]) . int_coef[j + 1, :] end end int_coef = copy(int_coef_next) end

int_coef = int_coef[1:grid_len + order - 1, order + 1:grid_len + order].'
int_coef[end, end] = 1
res = zeros(grid_len * (grid_len + order - 1), order + 1)

for i in 1:grid_len - 1
    for j in 1:grid_len + order - 1
        if (int_coef[i, j] != 0)
            tmp = expand(int_coef[i, j])
            idx = (i - 1) * (grid_len + order - 1) + j
            for k in 1:order + 1
                res[idx, k] = N(subs(tmp, (x, 0)))
                tmp = expand((tmp - res[idx, k]) / x)
            end
        end
    end
end

res[end, 1] = 1
jac = zeros(grid_len * (grid_len + order - 1), order + 1)
for i in 1:order
    jac[:, i] = res[:, i + 1] * i
end

return res, jac

end

function EvalBSplines(x, order, grid, BSplines) K = length(x) T = length(grid) + order - 1

ind = zeros(Int64, K * T)
assi = linspace(1, T, T)
for i in 1:K
    ind[(i - 1) * T + 1: i * T] = (searchsortedlast(grid, x[i]) - 1) * T + assi
end

# Generate power series
xx = repmat(x, 1, T)
xx = repmat(reshape(xx.', K * T, 1), 1, order + 1)
xx[:, 1] = 1
xx = cumprod(xx, 2)
res = reshape(sum(xx .* BSplines[ind, :], 2), T, K)
res = res.'

return res

end

end

tomasaschan commented 7 years ago

@wupeifan If you'd like to share the underlying mathematics too, that would be really helpful for putting a solid implementation together :) We have some math docs here but (as so often in open source...) they're definitely not complete. Most useful to you might be the Mathematica notebook Spencer put together. Maybe you could put together something similar for your derivations?

Multidimensional spline interpolation should just be a tensor product of 1-dimensional ones, but it would be really helpful (to me, at least) to have some explicit formulation of how this works for irregular grids, as our current implementation pretty heavily assumes a regular grid.

And of course, if you'd like to tackle this yourself, a PR is most welcome! You definitely don't have to write a full-fledged implementation with all kinds of boundary conditions etc, but do take care with the interface (function and method names, etc) so it matches well with what already exists in the package, so that it is easy to keep building upon.

wupeifan commented 7 years ago

@tlycken Thanks! I think I already tackled it myself - the thing is my implementation in Julia is not that efficient. I'm willing to share the math behind. A PDF file is enough. Currently it is with natural BSplines - I guess you also know that boundary conditions can be altered through putting different construction points of BSpliens. Multidimensional spline is a tensor product of 1 dimensional ones. What I implemented above is for 1 dimensional but any order of irregular BSplines. In reality, I do tensor product myself every time (because I sometimes have other interpolations like Chebyshev to allow some freedom). Should I put the PDF here?

tomasaschan commented 7 years ago

Sure, you can either attach it in the issue, (if Github allows it) or put it on e.g. dropbox or google drive and link to it here.

The main idea behind Interpolations.jl is to do the tensor product stuff for you, and to generate code that does this very efficiently, while maintaining a flexible interface that can cover many use cases. For regular B-splines we've come quite far, but I've struggled (though admittedly not for very long) with formulating a precise mathematical formulation for how to do the same things for irregular splines.

I guess there are two parts of the mathematics that I would need to see to be able to help creating an efficient implementation:

wupeifan commented 7 years ago

How to do the implementation in 1D, including examples for at least one BC.

I'll show you that in the PDF file. What does BC mean? (bad English sorry)

`How to handle grid irregularity when evaluating the tensor product.

I did it by a dense matrix multiplication in the code above, which means that this implementation is about 100 times slower than its real complexity. Will show you when I have my file ready too.

tomasaschan commented 7 years ago

What does BC mean?

Boundary Condition. (Don't worry about it - English is not my first language either 😃 )

I did it by a dense matrix multiplication in the code above

What I'm really interested in is how to formulate the mathematics of multidimensional, irregular splines in a way that lets us do it efficiently. As an example, consider linear splines in 2D. It's quite easy to see that the interpolated value at a point (x,y) could be obtained as follows:

  1. Find the x_i and y_j such that x_i <= x < x_(i+1) and y_j <= y < y_(j+1) (i.e. find the "box" inside which the point must lie.
  2. Linearly interpolate in one dimension between e.g. (x_i, y_j) and (x_i, y_(j+1)) as well as (x_(i+1), y_j) and (x_(i+1), y_(j+1)). That yields interpolated values at two new points, (x_i, y) and (x_(i+1), y).
  3. Linearly interpolate again in one dimension between the two new points to obtain the value at (x, y).

I'm sure this can be generalized first to arbitrarily-dimensional splines, and then to arbitrarily-dimensional grids, to obtain an algorithm similar to what we do for regular grids with the BSplineInterpolation type today.

wupeifan commented 7 years ago

@tlycken Here it is, Please tell me whether it is clear or not... I have a bad habit of writing things in an abstract way. I couldn't put tex file here. Don't know how to circulate that.

B-Spline.pdf

For multi dimensional, I think it is a direct tensor product of all the interpolants right?

wupeifan commented 7 years ago

@tlycken Let me try to explain better for multidimensional irregular spline case. No matter the grid is evenly spaced or not, spline interpolation is always a weighted average of function values nearby, say 3 or 4 of them. Therefore, we still only need to find where does the variable lies in the box, then tensor product the weights across different dimensions. I don't know whether I clearly put it or not.

wupeifan commented 7 years ago

@tlycken If I am not explaining myself well I was wondering whether I can make a pull request? Just want to contribute to this.

tomasaschan commented 7 years ago

@wupeifan Pull requests are always welcome :)

If you do submit one, do try to make your implementation match the style of the APIs of the other interpolation methods (regular-grid BSplines being the most feature-complete example), to make the public API of the library as consistent and usable as possible. If you want help with this, we can probably continue the discussion around specific details in the PR itself.

wupeifan commented 7 years ago

@tlycken Cool thanks! I'll do that. I also noticed your latest open issue here, so just let me know who I can talk to when I implement this.

@fabgrei I guess you are at the same place as I am, but actually I don't know who you are. If you are interested in this, we can talk to see whether there's some synergy.

greimel commented 7 years ago

The math for gridded cubic splines can be found in the Numerical Recipes for Fortran book. The relevant pages are here.

@wupeifan I am afraid that I have other priorities at the moment. In my project, I call the Fortran code from numerical recipes. That works well enough for my purposes.

sglyon commented 7 years ago

Thank you @fabgrei -- that link is helpful. I think it should give us the info we need to implement the cubic splines here.

If anyone wants to tackle this (cc @wupeifan/@cc7768/anyone else who is interested) leave a message here and I'll give you some tips/pointers on how to get started.

sp94 commented 6 years ago

Hi @fabgrei and @sglyon,

I would like to contribute some cubic spline code. I have written cubic splines in other languages using the same numerical recipe that @fabgrei has linked, and I also have some Julia code for cubic Akima splines (http://www.alglib.net/interpolation/spline3.php#header5) whose behaviour I generally prefer.

I don't have experience contributing to open source projects yet, but I would like to try making a pull request to create Gridded(Cubic()) and Gridded(CubicAkima()). If you have any tips before I get started, @sglyon, they would be welcome.

sglyon commented 6 years ago

Thank you, @sp94, for your interest in contributing here! We'd love to have your help.

One tricky aspect of this library is that it heavily utilizes meta programming. If you are not familiar with Julia's meta programming facilities (e.g. Exprs and @generated functions), then I would suggest three things:

  1. Write a simple 1-dimensional implementation of your routines without trying to fit it in to the structure we have here. This will help make sure the implementation is correct and that you understand all the pieces.
  2. Read up on the Julia docs about metaprogramming
  3. Ask us for help/tips once you have the math/algorithm worked out and we can help you fit it into the library

On the other hand, if you are already familiar with meta programming then I'd recommend skipping step (2) (or only lightly brushing up on the docs) and starting step (3) a little sooner. I still think it is good idea to implement the routine outside of Interpolations.jl before hand to work through any mathematical uncertainties.

BenWhetton commented 6 years ago

Hi all, I was wondering if there has been any progress on this feature? I need it for one of my projects and I’d be happy to work on it. I have spent some time reading around Julia metaprogramming and trying to understand the existing code-base. It’s an interesting puzzle working out how it all hangs together! I've just found the math docs and I'm digging into them and the linked fortran recipe. If anyone has any other references that are relevant to this issue, could you point me in the right direction?

I been examining some of the generated code using macroexpand and either I'm failing to understand the code or there may be inconsistencies between the docstring and generated code variable names. The docstring for Cubic in b-splines/cubic.jl gives the formula for calculating y_i(x):

y_i(x) = cm p(x-i) + c q(x-i) + cp q(1- (x-i)) + cpp p(1 – (x-i)) ( I call this equation 1)

where

    p(δx) = 1/6 * (1-δx)^3
    q(δx) = 2/3 - δx^2 + 1/2 δx^3

and the values `cX` for `X ∈ {m, _, p, pp}` are the pre-filtered coefficients.

It seems to me that the generated getindex_impl code uses a different naming convention (see the bottom of the macroexpand() results at the end of this comment):

Have I got my wires totally crossed? If not, this makes it very difficult to interpret the code. I’d like to clear up my understanding before I start trying to understand higher dimensional cases.

P.S. I’m new to contributing to OS projects so any advice would be welcome.
Running the following macroexpand for the 1D Cubic BSpline indexing function

itp = interpolate([1.0, 2.0, 3.0], BSpline(Cubic(Line())), OnGrid())
macroexpand(Interpolations.getindex_impl(typeof(itp)))

Gives the following generated code:

begin 
    $(Expr(:meta, :inline)) 
    begin 
        x_1 = xs[1]
    end 
    inds_itp = indices(itp) 
    begin 
        begin  
            ix_1 = clamp(floor(Int, x_1), first(inds_itp[1]) + 0, last(inds_itp[1]) + -1) 
            fx_1 = x_1 - ix_1 
            ix_1 += 1 
            ixm_1 = ix_1 - 1 
            ixp_1 = ix_1 + 1 
            ixpp_1 = ixp_1 + 1
        end
    end 
    begin 
        begin  
            fx_cub_1 = cub(fx_1) 
            one_m_fx_cub_1 = cub(1 - fx_1) 
            cm_1 = SimpleRatio(1, 6) * one_m_fx_cub_1 
            c_1 = (SimpleRatio(2, 3) - sqr(fx_1)) + SimpleRatio(1, 2) * fx_cub_1 
            cp_1 = (SimpleRatio(2, 3) - sqr(1 - fx_1)) + SimpleRatio(1, 2) * one_m_fx_cub_1 
            cpp_1 = SimpleRatio(1, 6) * fx_cub_1
        end
    end 
    begin 
        $(Expr(:inbounds, true))
        # My comment: the naming convention here seems different from the docs
        ret = cm_1 * itp.coefs[ixm_1] + c_1 * itp.coefs[ix_1] + cp_1 * itp.coefs[ixp_1] + cpp_1 * itp.coefs[ixpp_1]
        $(Expr(:inbounds, :pop))
    end 
    ret
end
timholy commented 6 years ago

@BenWhetton, many thanks for your interest!

the pre-filtered coefficients cm, c, cp and cpp from equation 1 are itp.coefs[] in the generated code

Correct. Keep in mind the mathematics focus on just the interval of support (between two adjacent knots of the spline), but the underlying object has to support the entire domain of the array. So cm refers to "coefficient at current point minus 1," i.e., shifted one grid unit (on itp.coefs) to the left.

the p() and q() expressions from equation 1 are called cm, c, cp and cpp in the generated code

Wow. Looks like you are right. I can't explain how that happened. Yikes.

I’d like to clear up my understanding before I start trying to understand higher dimensional cases.

Really good idea. Best is to test all your ideas in 1d without the fancy apparatus of Interpolations and then generalize it to arbitrary dimensions and orders.

timholy commented 5 years ago

To all those who expressed interest, this should be much simpler now with the rewrite in #226. No need to figure out all the metaprogramming stuff.

btmit commented 4 years ago

To all those who expressed interest, this should be much simpler now with the rewrite in #226. No need to figure out all the metaprogramming stuff.

Checking in to see if anyone had a chance to look at this one since #226 was completed.

icweaver commented 4 years ago

^bump

As a code-illiterate astronomer, I just wanted to thank everyone who has been working on this =D

sp94 commented 4 years ago

Hi, I tried to address this several years ago but I wasn't able to get my head around the metaprogramming at the time. I ended up writing my own snippet which may be helpful to some people landing on this thread

https://github.com/sp94/CubicSplines.jl

icweaver commented 4 years ago

Hi, I tried to address this several years ago but I wasn't able to get my head around the metaprogramming at the time. I ended up writing my own snippet which may be helpful to some people landing on this thread

https://github.com/sp94/CubicSplines.jl

Hot dang this community is awesome. Thank you, can't wait to try it out!

icweaver commented 4 years ago

Works like a charm, thanks @sp94!

For the particular use case I have (essentially trying to replicate this), it would be amazing if we could also do cubic B-splines on irregular grids in Julia. I just noticed that you already discussed this earlier in the thread, so here's hoping!

jmsull commented 2 years ago

Bumping this to signal there is still interest. I have been using cubic B-splines for a while and recently needed to use an irregular grid but was unable to do so.

mkitti commented 2 years ago

Bumping this to signal there is still interest. I have been using cubic B-splines for a while and recently needed to use an irregular grid but was unable to do so.

The main reason is that no one has implemented it yet.

Essentially the scaling would need to be adjusted so that the derivatives match at the knots for BSplines of higher order than linear. A question is how to do this in a performant manner since the basis for this package is to optimize the performance when applied to unit grids or scalings thereof.

Another reason is that that there are many methods for implementing irregular grids in multiple dimensions.

There is also https://github.com/eljungsk/ScatteredInterpolation.jl