Closed ecurtiss closed 2 months ago
CatRom needs to support one-off reparametrizations as well as bulk reparametrizations. Newton's method still performs best for one-off reparametrizations, but I have tried a variety of methods for bulk reparametrizations.
1. Newton-bisection: This is the existing reparametrization method. It uses a hybrid of Newton's method and the bisection method. It has no precompute, so it initially outperforms the rest of the methods, but it is eventually surpassed given enough calls.
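A minimal sketch of this kind of Newton-bisection hybrid (in Python for illustration; the function and parameter names are hypothetical, not CatRom's Luau API). The idea is to take Newton steps while maintaining a bracket, falling back to bisection whenever a Newton step would leave the bracket:

```python
def invert_newton_bisection(s, ds, target, a=0.0, b=1.0, tol=1e-10, max_iter=50):
    # s: monotone arc length function on [a, b]; ds: its derivative (speed).
    # Finds t such that s(t) == target, safeguarding Newton with a bracket.
    t = 0.5 * (a + b)
    for _ in range(max_iter):
        err = s(t) - target
        if abs(err) < tol:
            return t
        # Shrink the bracket using the sign of the residual (s is increasing)
        if err > 0:
            b = t
        else:
            a = t
        deriv = ds(t)
        step = t - err / deriv if deriv != 0 else None
        # Fall back to bisection when Newton stalls or leaves the bracket
        if step is None or not (a < step < b):
            step = 0.5 * (a + b)
        t = step
    return t
```

The bracket update makes the method robust even where the speed `ds` is small, at the cost of a few extra bisection steps.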
2. Cheb (Newton): This uses a Chebyshev polynomial to interpolate the arc length function. The cheb is inverted by sampling points on its inverse via Newton-bisection and constructing a second cheb from the samples. For Newton, I use the derivative of the original function, not of the cheb. Once the inverted cheb has been computed, reparametrizing amounts to evaluating the polynomial.
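For reference, Chebyshev interpolation on [0, 1] can be sketched as follows (Python, illustrative only; I am assuming Chebyshev points of the second kind with barycentric evaluation, which may differ from the exact grid and evaluation scheme used in CatRom):

```python
import math

def cheb_nodes(n, a=0.0, b=1.0):
    # Chebyshev points of the second kind, mapped from [-1, 1] to [a, b]
    return [a + (b - a) * 0.5 * (1 - math.cos(math.pi * k / (n - 1)))
            for k in range(n)]

def barycentric(nodes, values, x):
    # Barycentric interpolation; for second-kind Chebyshev points the
    # weights are (-1)^k, halved at the two endpoints.
    num = den = 0.0
    for k, (xk, yk) in enumerate(zip(nodes, values)):
        if x == xk:
            return yk
        w = (-1.0) ** k
        if k == 0 or k == len(nodes) - 1:
            w *= 0.5
        c = w / (x - xk)
        num += c * yk
        den += c
    return num / den
```

Evaluating the interpolant is a single O(n) pass, which is why reparametrization via a precomputed cheb is so cheap compared to iterative root finding.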
3. Cheb (regula falsi): Same as option 2, but uses regula falsi to invert the cheb.
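Regula falsi replaces the bisection midpoint with the secant-line root, which converges faster on smooth monotone functions like an arc length cheb. A minimal Python sketch (illustrative, not CatRom's implementation):

```python
def regula_falsi(f, a, b, tol=1e-12, max_iter=100):
    # Finds a root of f in [a, b], assuming f(a) and f(b) have opposite signs.
    fa, fb = f(a), f(b)
    x = a
    for _ in range(max_iter):
        # Root of the secant line through (a, fa) and (b, fb)
        x = (a * fb - b * fa) / (fb - fa)
        fx = f(x)
        if abs(fx) < tol:
            return x
        # Keep the endpoint that still brackets the root
        if fa * fx < 0:
            b, fb = x, fx
        else:
            a, fa = x, fx
    return x
```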
4. Cheb (ITP): Same as option 2, but uses the ITP method to invert the cheb.
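ITP (Interpolate, Truncate, Project) combines a regula falsi estimate with a guaranteed bisection-like worst case. A hedged Python sketch of the textbook algorithm (the tuning constants `k1`, `k2`, `n0` are conventional defaults, and this assumes f(a) < 0 < f(b); it is not CatRom's code):

```python
import math

def itp(f, a, b, eps=1e-10, k1=0.1, k2=2.0, n0=1):
    # ITP root finder; assumes f(a) < 0 < f(b).
    fa, fb = f(a), f(b)
    n_max = math.ceil(math.log2((b - a) / (2 * eps))) + n0
    k = 0
    while b - a > 2 * eps:
        x_half = 0.5 * (a + b)
        r = eps * 2.0 ** (n_max - k) - 0.5 * (b - a)
        # Interpolate: regula falsi point
        x_f = (a * fb - b * fa) / (fb - fa)
        # Truncate: nudge the interpolation toward the midpoint
        sigma = 1.0 if x_half >= x_f else -1.0
        delta = k1 * (b - a) ** k2
        x_t = x_f + sigma * delta if delta <= abs(x_half - x_f) else x_half
        # Project: keep the estimate within the minmax radius r of the midpoint
        x_itp = x_t if abs(x_t - x_half) <= r else x_half - sigma * r
        y = f(x_itp)
        if y > 0.0:
            b, fb = x_itp, y
        elif y < 0.0:
            a, fa = x_itp, y
        else:
            return x_itp
        k += 1
    return 0.5 * (a + b)
```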
5. LUT (Newton): This is the existing PrecomputeArcLengthParams() method. It uses option 1 to precompute a lookup table that maps arc lengths uniformly spaced in [0, 1] to their times. Once the lookup table has been computed, the values are linearly interpolated.
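The uniform lookup table scheme can be sketched as follows (Python, illustrative; `invert` stands in for whichever root finder builds the table, and the names are hypothetical):

```python
def build_lut(invert, n):
    # Map n uniformly spaced arc lengths in [0, 1] to times via a root finder
    return [invert(i / (n - 1)) for i in range(n)]

def lut_lookup(lut, s):
    # Uniform spacing means the bracketing index is direct arithmetic,
    # so a lookup is O(1): index, then linearly interpolate.
    n = len(lut)
    x = s * (n - 1)
    i = min(int(x), n - 2)
    frac = x - i
    return lut[i] * (1 - frac) + lut[i + 1] * frac
```

The O(1) index computation is what makes the uniform LUT roughly an order of magnitude faster per call than evaluating a cheb.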
6. LUT (Cheb uniform): Precomputes a lookup table with uniformly spaced points like option 5, but interpolates the arc length function with a cheb and then samples points on its inverse via regula falsi.
7. LUT (Cheb non-uniform): Precomputes a lookup table with non-uniformly spaced points. In particular, we use option 3 and then recycle the values of the inverted cheb on the Chebyshev grid as the lookup table. Since the Chebyshev grid clusters points near 0 and 1, the resulting lookup table has greater accuracy near the boundaries, which is likely where a spline will have its greatest curvature (provided the default constructor arguments are used).
This benchmark measures the time it takes to reparametrize an arc length after all precomputes have been performed. I created a CatRom with 50 splines and reparametrized 50 arc lengths per spline.
Here, the Chebyshev polynomials interpolated 8 grid points, and the lookup tables had 8 samples. We see that the lookup tables are about an order of magnitude faster than evaluating a cheb, which is an order of magnitude faster than Newton-bisection.
Here, I increased the cheb grid points to 16 and the LUT samples to 16. As expected, only the cheb evaluation got slower, but it is still very fast (< 1 µs).
This benchmark measures how long the precomputes take for options 2-7. I ran the precomputes on 20 CatRoms, each having 20 splines.
Here, the chebs had 8 grid points and the LUTs had 8 samples.
Here, the chebs had 16 grid points and the LUTs had 16 samples. I had to remove 4. Cheb (ITP) from this test because it would have taken too much time.
Options 3, 6, and 7 all sample a cheb's inverse via regula falsi, so regula falsi is clearly the fastest way to invert the cheb during precompute. The choice between these three will come down to accuracy.
This benchmark measures the full performance of precomputing (if necessary) and then bulk reparametrizing. I created one CatRom with 50 splines, precomputed, and then did some number of reparametrizations per spline.
5 reparametrizations per spline, 8 grid points per cheb, 8 samples per LUT
10 reparametrizations per spline, 8 grid points per cheb, 8 samples per LUT
20 reparametrizations per spline, 8 grid points per cheb, 8 samples per LUT
5 reparametrizations per spline, 16 grid points per cheb, 16 samples per LUT
10 reparametrizations per spline, 16 grid points per cheb, 16 samples per LUT
20 reparametrizations per spline, 16 grid points per cheb, 16 samples per LUT
With a little more testing, the tipping point where it becomes worth your time to precompute a lookup table is about 8 reparametrizations per spline given 8 grid points/samples, and about 15 reparametrizations per spline given 16 grid points/samples.
It remains to measure each method's accuracy. To do this, I created a CatRom with 20 splines and performed 100 reparametrizations per spline. Then, using Newton-bisection as my source of truth, I charted the absolute error of some of the well-performing methods. Additionally, to extract a single error value, I also printed the sum of the absolute errors for each method (the L1 norm).
8 grid points per cheb, 8 samples per LUT
3. Cheb (regula falsi) 2.6796691331763727
5. LUT (Newton) 7.931484687275745
6. LUT (Cheb uniform) 7.973120910351782
7. LUT (Cheb non-uniform) 4.6878766571400385
16 grid points per cheb, 16 samples per LUT
3. Cheb (regula falsi) 0.2661398687654118
5. LUT (Newton) 2.377147364435644
6. LUT (Cheb uniform) 2.421018434349442
7. LUT (Cheb non-uniform) 1.2552987303827043
We see that evaluating a cheb is the most accurate, followed by the lookup table on the Chebyshev grid, followed by the uniformly spaced lookup tables. Interestingly, the two lookup tables with uniformly spaced samples have nearly identical accuracy.
From the benchmarks, we can axe options 2, 4, and 5, leaving 3, 6, and 7.
In terms of speed: 6 > 7 >> 3.
In terms of accuracy: 3 >> 7 > 6.
However, I strongly believe that the reparametrization time of option 7 can be made comparable to option 6 by replacing its linear search with a binary search. Therefore, I treat 6 and 7 as equal in terms of speed, which lets us rule out option 6. Finally, we have our conclusion:
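The binary-search idea for the non-uniform table could look like this (Python sketch under my own assumptions about the table layout: sorted arc length keys from the Chebyshev grid paired with their times; not CatRom's code):

```python
import bisect

def lut_lookup_nonuniform(arc_lengths, times, s):
    # arc_lengths: sorted, non-uniformly spaced keys (the Chebyshev grid);
    # times: the corresponding parameter values from the inverted cheb.
    # Binary search finds the bracketing interval in O(log n).
    i = bisect.bisect_right(arc_lengths, s) - 1
    i = max(0, min(i, len(arc_lengths) - 2))
    frac = (s - arc_lengths[i]) / (arc_lengths[i + 1] - arc_lengths[i])
    return times[i] + frac * (times[i + 1] - times[i])
```

For the small table sizes benchmarked here (8 to 16 samples), an O(log n) search should indeed be close in cost to the O(1) index arithmetic of the uniform table.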
Implemented in a6c9017ea0c3f13e43a5ae407db718d107e77b1f. I tried to err on the side of being verbose for the API, but I'm not convinced that this will be the final version.
Using this issue to document my findings on using Chebyshev interpolation to speed up arc length reparametrizations. My target audience is my future self.
For previous work on arc length parametrizations, see https://github.com/ecurtiss/CatRom/issues/3.