paperjs / paper.js

The Swiss Army Knife of Vector Graphics Scripting – Scriptographer ported to JavaScript and the browser, using HTML5 Canvas. Created by @lehni & @puckey
http://paperjs.org

Numerical integration accuracy issue #518

Closed · kirillkh closed this 10 years ago

kirillkh commented 10 years ago

Hi! I am using paper.js's source code as a reference to optimize my own code that deals with arc length. In particular, my inverse parameterization algorithm was missing a Newton-Raphson step, and the one you use proved to be just the thing I needed.

But one thing is not working well for me. In Numerical.integrate() you do something I haven't seen elsewhere: you integrate between two t parameters a and b (as opposed to always setting a = 0). I don't know enough math to determine analytically whether this is correct, but in practice it introduces accuracy issues, as demonstrated by the following code:

    // verbatim copy of a private function in Curve.js
    var getLengthIntegrand = function(v) {
        // Calculate the coefficients of a Bezier derivative.
        var p1x = v[0], p1y = v[1],
            c1x = v[2], c1y = v[3],
            c2x = v[4], c2y = v[5],
            p2x = v[6], p2y = v[7],
            ax = 9 * (c1x - c2x) + 3 * (p2x - p1x),
            bx = 6 * (p1x + c2x) - 12 * c1x,
            cx = 3 * (c1x - p1x),
            ay = 9 * (c1y - c2y) + 3 * (p2y - p1y),
            by = 6 * (p1y + c2y) - 12 * c1y,
            cy = 3 * (c1y - p1y);
        return function(t) {
            // Calculate quadratic equations of derivatives for x and y
            var dx = (ax * t + bx) * t + cx,
                dy = (ay * t + by) * t + cy;
            return Math.sqrt(dx * dx + dy * dy);
        };
    };

    var ds = getLengthIntegrand(
        [0.000000, 0.000000, 100.000000, 100.000000,
         200.000000, 200.000000, -17.444029, -41.252811]);

    // produces output: 0.38959626747556086; expected output: 0
    paper.Numerical.integrate(ds, 0.5, 0.75, 16)
        - (paper.Numerical.integrate(ds, 0, 0.75, 16)
            - paper.Numerical.integrate(ds, 0, 0.5, 16));

I would be happy to hear any input you have on this.
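
For context, here is a minimal sketch of the kind of Newton-Raphson arc-length inversion discussed here, assuming ds is a speed function produced by getLengthIntegrand above; the function name and tolerances are made up for illustration, and this is not paper.js's actual code:

    // Hypothetical sketch: find t such that the arc length from 0 to t equals
    // targetLength, using Newton-Raphson with the speed function ds as the
    // derivative of the arc-length function.
    function parameterAtLength(ds, targetLength, totalLength) {
        var t = targetLength / totalLength; // linear initial guess
        for (var i = 0; i < 16; i++) {
            var error = paper.Numerical.integrate(ds, 0, t, 16) - targetLength;
            if (Math.abs(error) < 1e-12)
                break;
            // Newton step: t -= f(t) / f'(t), guarding against zero speed.
            t -= error / Math.max(ds(t), 1e-12);
            t = Math.min(Math.max(t, 0), 1); // keep t inside the curve's domain
        }
        return t;
    }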

kirillkh commented 10 years ago

By the way, I use 32 terms in my implementation (as opposed to 16 in paper.js) and still see this issue, which is why I think it is a bug rather than merely an approximation artifact (approximation artifacts can be made arbitrarily small by increasing the number of terms).

kirillkh commented 10 years ago

Hmm, apparently it only happens when all the points are almost collinear, as in my example above, and the curve has a very sharp angle.

lehni commented 10 years ago

Interesting! It's been a long while since I wrote that code, and I'm not quite sure anymore where the assumption about the range [a, b] came from. The better fix would then be to split the curve at a, interpolate b to the range of the new curve (_b), and integrate over [0, _b]. Will investigate!
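
A rough sketch of that proposed fix, reusing the getLengthIntegrand function from the first comment and a plain de Casteljau split; the helper names here are made up, and this is not the code that eventually went into paper.js:

    // One de Casteljau step on curve values [p1x, p1y, c1x, c1y, c2x, c2y,
    // p2x, p2y], returning the values of the right subcurve, i.e. [t, 1].
    function subdivideRight(v, t) {
        var u = 1 - t;
        function lerp(a, b) { return u * a + t * b; }
        var x01 = lerp(v[0], v[2]), y01 = lerp(v[1], v[3]),
            x12 = lerp(v[2], v[4]), y12 = lerp(v[3], v[5]),
            x23 = lerp(v[4], v[6]), y23 = lerp(v[5], v[7]),
            x012 = lerp(x01, x12), y012 = lerp(y01, y12),
            x123 = lerp(x12, x23), y123 = lerp(y12, y23),
            x0123 = lerp(x012, x123), y0123 = lerp(y012, y123);
        return [x0123, y0123, x123, y123, x23, y23, v[6], v[7]];
    }

    // Length over [a, b]: split at a, then integrate the right part
    // over [0, _b], where _b is b mapped into the subcurve's [0, 1] range.
    function lengthBetween(v, a, b) {
        var right = subdivideRight(v, a),
            _b = (b - a) / (1 - a);
        return paper.Numerical.integrate(getLengthIntegrand(right), 0, _b, 16);
    }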

lehni commented 10 years ago

Sketch code for live testing:

http://sketch.paperjs.org/#S/jVRNb5wwEP0rI05sAgRYYL/US3Oq1PbSY7IHr212aXaBGIPWjfLfO2O+Vkqk1gIx+L15b5ixeHNKdpHO1vn1IjU/OZ7DK0HvDw/QSXVgurgAr2oDVQ4MalV0TEvI25LroiqhKOGxVZ0MfjfPZccUHKX+LsujPn0rtTwqVgr4MtPdbgFvzyXgQoNHdubtmfT0SaKLzPOCF7LUTe/2Vf4ppAIhrWuBLn0q+dTRFYW7p3DvYWxsHO29nsAHMEaQD+ByAuMeTAiMezAdwXoAM5IdwNUIMsI2cAcuGfiktIB7WNIOJfpU1WIgH4icWQjJ9z3ZhyjGLUwfiyHWcpa8EWBmdjPWzdy4GUs2k5uZ3Yx1MzduZnQzs9sosOshJXWrymlSrp4G9WFYry0TCufBQWJIbDuveUwN5JWCK9DwzSxCYxP0vS428g40lnnAnvQRHztCSxjLMiPLTCyzm1lDyT+YPgXNq9KuIFlBzRaUKqaPe8cn3f0RFQ2qfzyn7lMYhHZ5UxCFn4Xxp6EfrYIkScJ44/lJFMRpvI6i/cLaYv9qVYmWY2+qVtet3kIYLNebdJPF2SpZpWkWrrMA5LWWXEsxs/qalWwirLpmtVTBz/aCvebsHBR98Vq6ovEgxCtYpR5EGQ3/P8g9dzd5xP/2oJzJZcgURZ5jphXwba0IcDwX1VkG5+ro0pZncc+SKRH/NQcl2UtdoUPjbJ/2738B

lehni commented 10 years ago

Interesting... So I implemented the described splitting method in a sketch and got almost the same result as the original approach (a difference of 1.4210854715202004e-14). Then I created a loop that simply adds up the lengths of tiny line segments along the curve as a reference, and it appears that it is your approach ([0, b] - [0, a]) that leads to imprecise results, suggesting that you should probably integrate with a higher iteration count there. I will do more testing with higher counts; I need to create the lookup tables for that first.

http://sketch.paperjs.org/#S/hVVRb5swEP4rp7yMtIQCIUmbrC/rXiZt06RJe2mjyICTsFJDjcnqVf3vu7MNZGu7IiFsf5+/73x3Th5Hgt3x0XL0/ZarbD/yR1mV0/zsDA5cpkwVd5BVtYZqCwxqWRyY4rBtRaaKSkAh4KqVBx78bG7EgUnYcfWZi53afxKK7yQTOVwOdO8whscbAfigwRUrs7YkPbXn6MK32yIruFCNdfvAfxdcQs6Na4Eudiv51NEDCh+uw7WPY23G0dq3hMyBMYKZA6c9GFswITC24KwDawfOSdaBiw5khF3ACXhkMCGlMZzClFZo44SiGjtySuS5gZB8askTiGJcwu1dMMSaDpJHAkwPbtq46SM3bci6d9ODmzZu+shNd256cOsEVhaSXLVS9JXyVF+oZ8W6b1kusR4ZcBwS29RrKFMD20rCA1Dx9SBCZcvpvB4m8gQUhpliTuwo6zJCT64NS3cs3bP0amC5kL8wtQ+ae6m8nGRzSnZOW/P+cE/4pde2KEPxMJit7Cw1s8WshzPqZ1wU/JftbS/0Qz8K7Ru7dxItgiRJwvjCnyRREM/i8ygaO80DK1vMwqXVCvBO/DArXkfY9Ax7e5o2zYtDkXPPAj6wMbbzihKPX5DFbq/cVgrYS7F6bAxn4EVm1Eefk+jzS+h0e//XaJuBdyPQvJZV3mYYadWqulVLTNX0/GJ2MY/ni2Qxm83D83kA/KHmmeL5wLI2kjcR+tSs5jL42t5hh2SsDAprp7iX40lDH1Ifojn165tMZpgYXYZdV5U8KKud9458lnBNSmtUoQFbL9/5JoDuzDiO3wyGdcG8YBGjBeHrZdeBYC3iI4vpfy027hibV02maLJhxOhtrMm0M2kUr03PhvREbrVWlGnRluUQS0I0nNJt9GhN4QJb4ef9JaT0Pb00cv1dt0rxced+qzB8/DlwV6nY4k+MisbH1xCNUAj3Ef9j0SgmMm5YbpMNDgl0D5+fOVlCWQjO5L+JTV4sA1bYFNwl301fTKbFYkftpi9Rk7+pyRH1RuDfYio5u60pF81oeb1++gM=
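
The brute-force reference described above might look roughly like the following; the actual code is in the sketch linked here, and the helper name and step count below are arbitrary:

    // Hypothetical sketch of the brute-force reference: sum the lengths of
    // many small chords along the curve between parameters a and b.
    function polylineLength(v, a, b, steps) {
        // Evaluate the cubic Bezier at t in Bernstein form.
        function pointAt(t) {
            var u = 1 - t,
                w0 = u * u * u, w1 = 3 * u * u * t,
                w2 = 3 * u * t * t, w3 = t * t * t;
            return [w0 * v[0] + w1 * v[2] + w2 * v[4] + w3 * v[6],
                    w0 * v[1] + w1 * v[3] + w2 * v[5] + w3 * v[7]];
        }
        var length = 0,
            prev = pointAt(a);
        for (var i = 1; i <= steps; i++) {
            var cur = pointAt(a + (b - a) * i / steps),
                dx = cur[0] - prev[0],
                dy = cur[1] - prev[1];
            length += Math.sqrt(dx * dx + dy * dy);
            prev = cur;
        }
        return length;
    }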

lehni commented 10 years ago

Another update: I extended the lookup tables in my local version and did the integration with 18 instead of 16 terms; here are the results. They further suggest that my analysis above is correct. Perhaps you've set up your tables incorrectly?

    res1: [0, b] - [0, a]: 36.9097712760088
    res2: [a, b]:          36.725511647691256
    res3: [_a, _b]:        36.72551164769126
    res4: linear:          36.726130387873255
    res2 - res1: -0.18425962831754106
    res3 - res2: 7.105427357601002e-15
    res4 - res2: 0.0006187401819985894

kirillkh commented 10 years ago

I've been very busy the last couple of weeks, but I will get to it eventually.

lehni commented 10 years ago

Alright, feel free to reopen with new information.

kirillkh commented 10 years ago

Well, I've looked into the accuracy of the integration, and your code is definitely correct in both cases. At the same time, there is something to learn from the case I provided: Gaussian integration doesn't behave well in the presence of high-curvature points inside the curve. The explanation lies in the way Gaussian quadrature works: the integration grid is denser near 0 and 1 (the ends of the interval) than in the middle, which means there will always be curves that cause a significant loss of precision unless your algorithm is able to accommodate high-curvature points. One way is to use an adaptive integration scheme. Another is to analyze each curve to determine which regions are smooth and then integrate each region separately. The latter method is more difficult to implement, but for me the resulting code works faster. With randomized testing, I went from 8 bits of precision for the same non-adaptive method you are using to 21 bits of precision (with 32 slices in both cases).
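
For illustration, a minimal sketch of the adaptive option, reusing paper.Numerical.integrate as in the thread; this is generic interval-halving adaptivity, not the method kirillkh describes, and the function name is made up:

    // Hypothetical sketch of an adaptive scheme: compare one Gauss-Legendre
    // estimate over [a, b] against the sum of estimates over the two halves,
    // and recurse where they disagree, so sharp features get subdivided
    // automatically while smooth regions are integrated in one pass.
    function integrateAdaptive(f, a, b, tolerance, depth) {
        var whole = paper.Numerical.integrate(f, a, b, 16),
            mid = (a + b) / 2,
            halves = paper.Numerical.integrate(f, a, mid, 16)
                   + paper.Numerical.integrate(f, mid, b, 16);
        if (Math.abs(whole - halves) < tolerance || depth <= 0)
            return halves;
        return integrateAdaptive(f, a, mid, tolerance / 2, depth - 1)
             + integrateAdaptive(f, mid, b, tolerance / 2, depth - 1);
    }

    // e.g. integrateAdaptive(ds, 0, 1, 1e-9, 10) for the curve above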

lehni commented 9 years ago

@kirillkh very interesting! Would you be willing to share that code?

kirillkh commented 9 years ago

The code is proprietary at this time, but I'll let you know if I decide to release it as open source eventually.

kirillkh commented 9 years ago

I outlined the ideas for the algorithm here: http://pomax.github.io/bezierinfo/#bezierinfo-comment-172

Since that was almost a year ago, I can't remember what additional tweaks I made to the code after writing that comment, but it describes the gist of the method. The algorithm employs a geometric trick to find an approximation to the highest-curvature point (the peak), but in practice the approximation is so good that I never felt the need to improve upon it.

Now that I've looked at it with fresh eyes, I see some typos and errors in the explanation, but the algorithm itself is solid and works well in practice.
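
A loose sketch of that splitting idea: for near-cusp curves like the example in this thread, the highest-curvature point coincides with a minimum of the speed |B'(t)|, so a coarse scan of ds can stand in for the geometric trick, which is not reproduced here. The function name is made up for illustration:

    // Hypothetical sketch: split the integration at an approximate
    // high-curvature point and integrate the two smooth regions separately.
    // A coarse scan for the minimum of the speed ds stands in for the
    // geometric peak-finding trick from the linked comment.
    function lengthSplitAtPeak(ds, samples) {
        var tPeak = 0.5, dsMin = Infinity;
        for (var i = 1; i < samples; i++) {
            var t = i / samples, d = ds(t);
            if (d < dsMin) {
                dsMin = d;
                tPeak = t;
            }
        }
        return paper.Numerical.integrate(ds, 0, tPeak, 16)
             + paper.Numerical.integrate(ds, tPeak, 1, 16);
    }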