kb-press / ndsplines

Multi-dimensional B-splines
https://ndsplines.readthedocs.io
Other
28 stars 9 forks source link

Unstructured data fitting #33

Open sixpearls opened 5 years ago

sixpearls commented 5 years ago

The SciPy docs have a few of Dierckx's surface fit wrappers (SmoothBivariateSpline, LSQBivariateSpline, and their spherical equivalents) listed under the "unstructured data" sub-heading for 2-D Splines. It looks like Dierckx's general API for the 2D problems is to take 3 1-D arrays for the X, Y, Z sample data. The rectangular (structured data) version seems like it does something similar to #31.

For the methods that can handle unstructured data, it seems it is using a somewhat heuristic algorithm that starts with a least squares spline with knots only on the boundaries of the sample data, then iteratively:

  1. Add a knot to a dimension (location and dimension based on heuristic criteria of the residuals of current spline).
  2. Construct the least squares spline of the data for the new set of knots.
  3. If the fit improves by a certain threshold, keep the knot. If not, remove it and mark this region not to be tested again.

I say this method is heuristic because I don't believe there are any theoretical claims about the results of this algorithm, and about tuning another parameter the author suggests "the user should examine it graphically, e.g. by looking at its contour map."

When I first saw that SciPy was listing this for "unstructured data" I thought we should include similar functionality. I didn't realize it was a fairly specific algorithm. It is also fairly tightly coupled to the "smoothing" formulation used throughout his FITPACK work (i.e., the 2-D splines linked here and 1-D class-based and functional wrappers.)

At this point, I am leaning towards saying this is out of scope, at least for "version 1.0" or whatever we want to call the first submission/release. This means we aren't able to fully replace all of SciPy's spline functionality but we do get very close with a much more cohesive approach. We could, perhaps, show an example with similar functionality with user-provided knots.

It is possible we could add similar functionality to this project or spin out a new one for a future publication(s). Just extending this heuristic to N-D might be a publication. We may also want to compare this approach to other adaptive knot selection, like MARS (the scikit-learn package inexplicably renamed the algorithm to EARTH).

sixpearls commented 5 years ago

@ixjlyons What do you think about the scope of this for the first release/publication?

ixjlyons commented 5 years ago

I agree it seems out of scope and/or a good candidate for future work.

the scikit-learn package inexplicably renamed the algorithm to EARTH

Curiosity got the best of me. Apparently MARS is trademarked so they chose the nearest planet to it1 as an alternative. Seems like the original was an R implementation.

1Made that part up but seems plausible