Open timholy opened 6 years ago
Realized that @jlperla and @chiyahn are not watching this package, but their advocacy for the function-call syntax is relevant for the discussion here too.
I like it.
I'm not sure that the ScaledInterpolation
was originally intended to fit as well in with the others as you express it here, though. My intention with scaling (and extrapolation) as separate concepts from B-spline (and other) implementation schemes, was to make sure that as much as possible of that infrastructure can be used if/when e.g. Hermite splines (#6) are added. However, as you say it seems to fit very well in with the framework.
I've also been mulling secretly on a generic "tensorization scheme", which would allow for (admittedly probably much slower, but still) N-dimentional interpolations with any scheme for which there is a 1-dimensional implementation, where you basically create the N
th dimension of length n
by supplying n
N-1
-dimensional interpolation objects and a knot vector, also of length n
, specifying where along the N
th axis the inner interpolation objects belong. Interpolation in this scheme would work much the same as our current B-splines do, but with less performance magic: figure out one dimension at a time, gradually building up the point in N-dimensional space we're looking for. This is a big undertaking, though, so it's remained an abstract idea in my head for a long time...
However, I think it would be nice if we could build the APIs around such functionality (scaling, extrapolation, perhaps one day N-dimensionality) in a way that you can just build a dumb 1D interpolation scheme over a unit range, and get all the rest "for free", and if your interpolation scheme makes it easy, or someone wants it badly enough for performance reasons, you can override select parts of the functionality with something more optimized. Multiple dispatch as a paradigm should excel at this.
Actually, I think expand
in #226 basically does that.
https://github.com/JuliaMath/Interpolations.jl/pull/226#discussion_r209846461 seems to apply in more than one place, then.
itp = interpolate(A, BSpline(Linear(), 1:5), BSpline(Quadratic(Flat()), 0.5:4.5))
Would that imply that in order to create an interpolation one would have to explicitly specify the range (to know the knot vectors)?
Good question. I personally would favor some defaults, e.g., the equivalent of OnGrid
for odd orders (Linear
and Cubic
) and OnCell
for even orders (Constant
and Quadratic
). We could implement BSpline
as
struct BSpline{Order,V<:Union{AbstractVector,Nothing}}
order::Order
axis::V
end
and then replace a nothing
axis
field with the default by dispatching on it when running interpolate
. (You need to have the array object available to construct a range, so you know what the bounds are. So BSpline
can't provide the default on its own.)
This makes perfect sense to me, Tim :)
our current GriddedInterpolation is effectively a BSplineInterpolation{K} where K<:AbstractVector (the knots must be sorted)
One important distinction between the Gridded
and other interpolation types was up until now that
K
must be sorted in a Gridded
interpolation, but they may represent an irregular grid, i.e. grid = [1;2;5.1;5.4;6]
is valid.We talked several times about the difficulty of extending irregular grids to non-Gridded
types. Would that change in any way with your new proposal? I guess if you start focusing on indices, the importance of the value behind the index vanishes? Wild guess, really.
Plugging away at #226 has been the occasion for reconsidering the overall structure of the package. I've been slowly evolving towards one significant internal API change (introducing
WeightedIndex
) which hopefully you'll see soon as part of #226, so I won't bother describing it here. Here, I want to focus on a much more visible form of restructuring that may not be part of #226 but which I think is worth considering. This proposal stems from our accumulated experience thinking about indexing, arrays, and other container types elsewhere in the Julia ecosystem (CCing @mbauman and @andyferris, who have helped shape these views).We currently have 3 interpolation types:
BSpline
,Scaled
, andGridded
. (We also have extrapolation types, but that's orthogonal to the content of this proposal.) The core point of this proposal is that I believe we can:BSpline
)OnCell
vsOnGrid
distinctionby centralizing everything around the idea of
knot
vectors.As you all know, the knots are the places where we shift the parametrization of the underlying polynomial: when you cross a knot you end up accessing different elements of the underlying coefficient array. For an array axis spanning
1:n
, currently we have the following:OnGrid
: the knots are effectively positioned at1:n
OnCell
: the knots are effectively positioned atUnitRange(0.5, n+0.5)
Another important component is padding, which is currently expressed through a type-parameter of the appropriate
AbstractInterpolation
type. Now that we haveOffsetArrays
, a much cleaner approach (which you'll see in #226) is to use the coefficient array's axes to encode the padding: for example, quadratic interpolation requires an extra padding element on either end, and so the natural approach would be to index this axis of the coefficient array with integers that range from 0 ton+1
, where the 0th andn+1
st elements are the padding. By using an OffsetArray it's easy to keep everything in alignment.Let's examine some implicit elements of the scheme above a bit more deeply. Note that the knot-vector actually has two properties: each element has a position, but we also address it via an index. That is to say, the position of the
k
th knot isknotvec[k]
wherek
is the index. A helpful analogy is to a Dict, which has key/value pairs; here, the key is the index and the value is the position. If you think about it a bit, the important thing is that it's the indices of the knot vector that specify the corresponding indices of the coefficient array: if the positionx
at which you're trying to evaluate the interpolation lies betweenknotvec[k]
andknotvec[k+1]
, then for linear interpolation I'm going to be accessing the coefficient array at indicesk
andk+1
. (x
and the values ofknotvec
affect the weights in a continuous fashion, but the indices are determined discontinuously by thek
satisfyingknotvec[k] <= x <= knotvec[k+1]
.)Realizing that the knot vector's indices, rather than values, encode the alignment with the coefficient array, we come to notice that we're not really making good use of the knot-vector's values, at least for the core
BSpline
type. We do, however, formalize these more for theScaled
andGridded
types. Pursuing this just a bit, we come up with the following scheme:BSplineInterpolation
is really aBSplineInterpolation{K} where K<:AbstractUnitRange
ScaledInterpolation
is effectively aBSplineInterpolation{K} where K<:Union{StepRange,StepRangeLen}
GriddedInterpolation
is effectively aBSplineInterpolation{K} where K<:AbstractVector
(the knots must be sorted)Consequently, all three "top level types" become variants of one overall type with different knot vectors. Unification has some significant advantages, among them being that it lets me mix-and-match: I could have an interpolation object which has
p
axes withAbstractRange
-type knot vectors andq
axes withAbstractVector
-type knot vectors. Currently, if you have just oneAbstractVector
knot-vector, you have to useGriddedInterpolation
and use them for all of your axes. Looking up the index in a sortedAbstractVector
is anO(logn)
process, but it'sO(1)
forAbstractRange
, so being forced to use slow lookup for all of them is a bummer.The full (aka, most general offering the greatest degree of control) user-level API might look something like this:
As a concrete example, for a
5x4
arrayA
, our currentwould be replaced by
But we could also do fun things like
and access values as
itp(0.83mm, 5.2s)
. (This particular example represents a series of measurements taken at regular spatial positions but sampled non-uniformly in time; if the 2nd "time axis" had been anAbstractRange
it would have represented regular temporal sampling.) This makes more sense than ever, now that we're moving towards using()
rather than[]
for indexing at positions. We can still use[]
for looking upInteger
locations (and my upcomingWeightedIndex
type) that are divorced from the positional meaning of the knot vectors.That's pretty much it. Thoughts?