Assumptions for Self Intersections of Curves are not fully reliable

iconexperience commented 9 years ago

When checking for self intersection of a curve, the algorithm currently follows these steps:

Check if the handles or negative handles of the curve intersect. It is assumed that curves can only self-intersect if this condition is fulfilled.
Split the curve at t=0.5 and check if the two curve parts intersect. A curve that self intersects always forms a loop, and the point at t=0.5 is assumed to always lie within the loop.

This works for the vast majority (maybe more than 99.9%) of all curves. But there are actually cases, where both assumptions are not true.

Here is an example.

As you can see there is a small loop, but the handles do not intersect and the magenta colored point at t=0.5 is far outside of the loop.

Note that this issue is currently merely for documentation purpose. I do not think that it causes problems in real world applications, but finally to have a perfect algorithm, this issue should be worked on.

lehni commented 9 years ago

That's really useful, thanks! I'd like to get this fixed either way.

iconexperience commented 9 years ago

Here is an approach for a replacement of the t=0.5 assumption.

I noticed that for self intersecting curves, if you calculate the velocity extrema (thanks to @kuribas for clarification), one of these points always lies inside the loop. A cubic curve can have zero to three of these extrema and it seems that in cases when the t=0.5assumption fails, there is only one extremum, which will be inside the loop. We can use this to find the best point for splitting the curve.

Note: Unfortunately this approach will not work, there can be more than one extrema if t=0.5is outside the loop (see newer comment). But velocity extrema could still be useful for reliable finding the split location.

Calculate velocity extrema
If there is no extremum, there is no self intersection
If there is only one extremum, split curve at the extremum and check if curve parts intersect
If there are more than one extrema, split the curve at t=0.5 and check if curve parts intersect

Unfortunately this approach is slower than the current, not fully reliable approach. Also, I cannot prove that this will cover all cases.

Please use this Sketch to see the velocity extrema in action (drag handles to create self intersecting curves).

lehni commented 9 years ago

Great, thanks! Before checking for self-intersection, we have another check in place that looks at the handles to avoid curves that cannot have self-intersections based on their handle configuration. How many false positives does this check give us for curves that are checked for self-intersections and don't have any? If it's not many, then I don't think it's an issue that this approach is slower. Also, how can it be faster than what we currently have? That's pretty much impossible : )

lehni commented 9 years ago

BTW: Instead of filtering out the roots that are out of bounds, why don't you include the min, max arguments to Numerical.solveCubic() to not include them in the first place?

iconexperience commented 9 years ago

Because I think it is relevant to know how many roots there are, even for t < 0 and t > 1. But I am not entirely sure about this. The whole approach is a bit shaky at the moment, but it may turn out to be an easy solution.

iconexperience commented 9 years ago

The problem is, that the first check is not entirely reliable (see first post), so we cannot use it to cover all cases. Also for the current implementation the false positives are quite a few, including many normal curves like this one (handles intersect, curve is far from self intersecting): If we find a better (tighter) first check which covers all cases, the performance of the new approach would not be an issue anymore. And I am pretty sure such a check exists, we just have to find it :smile:

iconexperience commented 9 years ago

I read somewhere that one condition for a self intersection is that the curve must have a horizontal and a vertical tangent. I do not know if this approach helpful, but just want to add it to the collection of ideas for preconditions of a self intersection.

lehni commented 9 years ago

Well, you're not doing anything with the number of roots found before the filtering, so I guess we might as well filter them straight ahead : ) So what you used to call a 'peak' in the offestting code is actually a velocity extreme, as pointed out by @kuribas. This sounds like a useful method to have on Curve, so it can be shared by both modules later.

lehni commented 9 years ago

Speaking of your offsetting code, the getTangentT() method might be of use to find these horizontal and vertical tangents? The problem right now is that it does not handle vertical tangents, due to a division by vTan.x, which will be zero. In that case, we'd have to solve for y instead of x I guess...

lehni commented 9 years ago

This version should be fine, I think.

iconexperience commented 9 years ago

There is a very simple way to check if vertical or horizontal tangents exist. In the Curve.evaluate() function the tangent is calculated as

x = (3 * ax * t + 2 * bx) * t + cx;
y = (3 * ay * t + 2 * by) * t + cy;

We want to know if x and y can be zero. For x=0 we get

0 = 3*ax*t^2 + 2*bx*t + cx

We can either calculate the roots and see if there are any for 0<t<1 or even take a shortcut and simply check if

4*bx*bx >= 12*ax*cx

which is the condition for existance of real roots. Here is the function.

var checkCurveIntersectionPossible = function(curve) {
    var v = curve.getValues();
    var p1x = v[0], p1y = v[1],
        c1x = v[2], c1y = v[3],
        c2x = v[4], c2y = v[5],
        p2x = v[6], p2y = v[7];

    var cx = 3 * (c1x - p1x),
        bx = 3 * (c2x - c1x) - cx,
        ax = p2x - p1x - cx - bx,
        cy = 3 * (c1y - p1y),
        by = 3 * (c2y - c1y) - cy,
        ay = p2y - p1y - cy - by;

    return (4*bx*bx >= 12*ax*cx) && (4*by*by >= 12*ay*cy);    
}

And here it is in action (drag handles, green means self intersection possible). Unfortunately there are many false positives, but it could be combined with another simple condition.

iconexperience commented 9 years ago

Sorry, I was wrong. If t=0.5 is outside the loop, there can be several extrema:

Magenta is the location at t=0.5, the others are velocity extrema. Aaargh.

kuribas commented 9 years ago

You can test if the derivative goes through more than 180 degrees, which is required for a loop. It may give false positives though.

kuribas commented 9 years ago

For example test if the origin is inside the convex hull of the hodograph (derivative curve).

kuribas commented 9 years ago

There is a formula for determining when a cubic has a loop here, though I don't really understand all the math behind it: http://www.msr-waypoint.net/en-us/um/people/cloop/LoopBlinn05.pdf

kuribas commented 9 years ago

Then split where the tangent is halfway, which can be found by solving a quadratic equation.

lehni commented 9 years ago

@iconexperience If there are more than one extrema, take the middle one?

iconexperience commented 9 years ago

@lehni Unfortunately the middle one can be outside of the loop. We should really find a fast first condition (or two conditions) that will filter out as many negatives as possible. Then we should be able to find the correct one of the two or three extrema and split there. With @kuribas suggestions above we should be able to find a solution. But first I want to tackle #777

lehni commented 9 years ago

And here a version that can be integrated into the library.

iconexperience commented 9 years ago

There has been a lot of progress and a possible solution in the comments to issue #788

iconexperience commented 9 years ago

@lehni I think fixing this issue should also include a little bit of refactoring. Just like PathItem, the way to find a self intersection of a curve should be mycurve.getIntersections(null). All the code for finding the self intersection, including the condition if a self intersection can exist, should be moved to Curve.js. This would make things cleaner, easier to read and there would be an API for getting a self intersection of a curve at no extra cost. What do you think?

iconexperience commented 9 years ago

This is a summary of our findings and the proposed fix for this issue. I hope it makes the new apporach easiert to understand. If you think something is missing or should be changed, please write a comment below.

Our algorithm for finding self intersections has four steps:

Apply a simple and fast condition to check if a curve can possibly have a self intersection. This condition must not produce any false negatives and the number of false positives should be as low as possible.
Find a point that, if the curve has a self intersection, is guaranteed to lie on the loop.
Split the curve at the point found in 2.
Locate the intersection between the two part of the split curve. If there is no intersection between the two curve parts, the curve has no self intersection.

Steps 3 and 4 work well in the original implementation, but for step 1 and 2 the assumtions were not fully reliable.

Condition for self intersection

It turned out that the original condition for determining whether a curve can have a self intersections produced false negatives. Therefore another condition needed to be found.

In issue #788 we found by observation that the sign of a curve's edge sum changes its sign when or just before a self intersection appears and the handles are on the same side of the line that goes through the curve's end points. This behaviour can be used as a very simple and fast test for a possible self intersection, but it covers only the cases when both handles are on the same side (and even then there are false positives).

Additionally we can use the fact that a cubic bezier curve with a self intersection never has an inflection point (see http://pomax.github.io/bezierinfo/#canonical for more information). In other words, if we find an inflection point on a curve, it cannot have a self intersection. Using this property of cubic bézier curves still results in a certain number of false positives (curves that have handles on opposite sides, have no inflection points, but also have no self intersections), but these cases are rare in real world scenarios.

The two ideas are used in the following algorithm for deciding whether a curve can have a self intersection:

if (handles are on same side) {
  if ((handles are on left side && edgeSum < zero)) or (handles are on right side && edgeSum > zero)) {
    curve can have intersection
  } else {
    curve has no self intersection
  }
} 
if (curve can have self intersection) {
  if (curve has no inflection points) {
    curve can have self intersection
  } else {
    curve has no self intersection
  }
}

If the result of these tests is curve has no self intersection we can stop, otherwise we continue with our algorithm.

Calculating the edge sum

The calculation of the edge sum is rather simple. For the values vof a curve it can be done like this.

edgeSum = (v[0]-v[2]) * (v[3]+v[1]) 
          + (v[2]-v[4]) * (v[5]+v[3])
          + (v[4]-v[6]) * (v[7]+v[5]) 
          + (v[6]-v[0]) * (v[1]+v[7]);

This calculation can be simplified nicely to

edgeSum = (v[0] - v[4]) * (v[3] - v[7]) + (v[2] - v[6]) * (v[5] - v[1]);

Testing for inflection points

The inflection points of a curve are found where the cross product between the first and is zero, or in other words, where the acceleration and velocity have the same direction. The following is taken from Adrian Colomitchi's great website (http://www.caffeineowl.com/graphics/2d/vectorial/cubic-inflexion.html) on bezier curves.

The cubic bezier curve through points P1, P1 with control points C1, C2 can be written as

P = (1-t)^3*P1 + 3*(1-t)^2*t*C1 + 3*(1-t)*t^2*C2 + t^3*P2

The first and second derivatives can be written as

P' = 3*t^2*(P2-3*C2+3*C1-P1) + 6*t*(C2-2*C1+P1) + 3*(C1-P1)
P" = 6*t*(P2-3*C2+3*C1-P1) + 6*(C2-2*C1+P1)

This is simplified by introducing the following terms:

a = P2 - 3*C2 + 3*C1 - P1`
b = C2 - 2*C1 + P1
c = C1 - P1

P' = 3*(a*t^2 + 2*b*t + c)
P" = 6*(a*t + b)

Using this simplification the cross product between the first and second derivative looks like this:

P'×P" = 18*(ax*t^2 + 2*bx*t + cx)*(ay*t + by) - 18*(ay*t^2 + 2*by*t + cy)*(ax*t + bx)
      = 18*(ax*ay*t^3 + ax*by*t^2 + 2*ay*bx*t^2 + 2*bx*by*t + ay*cx*t + by*cx 
            - ax*ay*t^3 - ay*bx*t^2 - 2*ax*by*t^2 - 2*bx*by*t - ax*cy*t - bx*cy)
      = 18*((ay*bx-ax*by)t^2 + (ay*cx - ay*cy)*t + by*cx - bx*cy)

The inflection points are found by calculating the roots of the quadratic equation above. But we only need to know if inflection points exist, which makes things a bit easier.

A quadratic equation with A*x^2 + Bx + C has at least one real root if the discriminant B^2-4*A*C is non-negative. Here the terms are A = ay*bx-ax*by, B = ay*cx-ax*cy, C = by*cx-bx*cy;

This leads to the following condition for the existance of one or two inflection points:

(ay*cx-ax*cy)^2 - 4*(ay*bx-ax*by)*(by*cx-bx*cy) >= 0

Finding a split point that is guaranteed to lie within the loop

In the original algorithm the point at t=0.5 was always used to split the curve. This worked for the vast majority of curves, but we found cases where the point was outside the loop.

We found by observation that whenever a curve has a self intersection, one of the velocity extrema of the curve always lies within the loop. Further investigation showed that the extremum, where the curvature has the highest value, always lies within the loop. We have no mathematical proof yet, but extensive testing indicates that this is always the case, even for very small loops.

The extrema of the curvature occur where the first and second derivative of the curve are perpendicular. This can be calculated by finding the roots of the dot product between the first and second derivatives.

Again, we use the approach by Adrian Colomitchi (http://www.caffeineowl.com/graphics/2d/vectorial/cubic-inflexion.html) and our simplification from above, but this time we calculate the dot product.

The simplified first and second derivatives are:

P' = 3*(a*t^2 + 2*b*t + c)
P" = 6*(a*t + b)

The dot product between the first and second derivative looks like this:

P'•P" = 3*(ax*t^2 + 2*bx*t + cx) * 6*(ax*t+bx) + 3*(ay*t^2 + 2*by*t + cy) * 6*(ay*t+by)
      = 18*(ax^2*t^3 + ax*bx*t^2 + 2*ax*bx*t^2 + 2*bx*bx*t + ax*cx*t + bx*cx) 
         + 18*(ay^2*t^3 + ay*by*t^2 + ay*cy*t + 2*ay*by^2*t + by*cy)
      = 18*((ax*ax  + ay*ay)*t^3 + 3*(ax*bx + ay*by)*t^2 + (2*bx*bx+2*by*by+ax*cx+ay*cy)*t + (bx*cx+by*cy))

In order to find the velocity extrema, we need to find the roots of the cubic term

(ax*ax  + ay*ay)*t^3 + 3*(ax*bx + ay*by)*t^2 + (2*bx*bx+2*by*by+ax*cx+ay*cy)*t + (bx*cx+by*cy)

There can be up to three roots. If more than one root exist, we use Curve.getCurvature() to calculate the curvature of each extremum and select the one with the highest curvature.

Performance

Applying the new condition is probably a little bit slower than the original condition, but the new condition is tighter, which means it results in fewer false positives. This means that we can stop looking for a self intersection earlier in many cases, which should compensate for the slightly higher effort needed to apply the new condition.

Finding the split point is now certainly slower than in the original algorithm, but this should not be a problem because the new condition for a self intersection filters out the vast majority of curves with no self intersection and self intersections on a curve seem to be quite rare in a real world scenario. The calculation the split point will only be necessary in very few cases, hence the performance impact should be negligible.

All in all performance should not suffer by implementing the proposed fix.

Note

The condition that the edge sum must have the opposite sign of the handles's side was found by observation and is not backed up by mathematical proof. We should watch this carefully to make sure that no false negatives are reported by the condition.

lehni commented 9 years ago

@iconexperience thank you for the impressive work on this, and taking the time to do such a detailed write-down!

It is moments like these that make me really happy to be working on open-source software. :smile:

I shall implement this new approach in the coming days! :+1: :+1: :+1:

iconexperience commented 9 years ago

@lehni I had the feeling that the solution is very good, but difficult to understand with the provided information. So I decided it would be worth taking some time and write a summary that we can link to from the code.

For this issue I will either write a new implementation of the solution and post it here or directly create a pull request. Maybe you can first look at the other boolean-related issues and see if anything can be closed. I am a bit surprised that we still have so many issues open, while I can hardly find any test cases that fail. Maybe we have more open issues that failing test cases :smile:

lehni commented 9 years ago

Yes that's the case for sure! The reason is that I'm still busy traveling, in Mexico City right now, back in the office in a week : ) But I also don't have this fix here in place yet, so I am getting more failing tests than you probably.

iconexperience commented 9 years ago

Just take your time, I am in no hurry at all. If we have a working boolean implementation by the end of the year, it would be great. I just kept posting examples and ideas because I could feel that you were in the zone.

lehni commented 9 years ago

I'd like to get there earlier, if possible. Until when do you think you'll have the PR ready for this issue? (no pressure, of course)

iconexperience commented 9 years ago

@lehni Thanks for merging. The change suggested in #784 is now the only difference between my codebase and the current boolean-fix branch. There are one or two glitches remaining, but I have to work really hard to provoke them.

lehni commented 9 years ago

I have just committed that change in #784. Now cleaning up code a bit. Which are the remaining glitches again?

iconexperience commented 9 years ago

One remaining glitch is Example 10 in #784.

lehni commented 9 years ago

Check the commit message in 12311535533a700cfd620f0531f7002e183e39a0 for some explanations of the clean-ups.

lehni commented 9 years ago

One question:

You write if there is no inflection point, the curve may have a self-intersection, and then we search for velocity extrema as potential split parameters. Could rootCount there ever be 0, leading to unexpected results since we would split at an undefined tSplit? Should we return locations; in case rootCount is 0?

lehni commented 9 years ago

I am also wondering if we should add these things as methods to curve:

Curve.hasInflectionPoint()
Curve.getVelocityExtrema()

etc.

iconexperience commented 9 years ago

Regarding first question, maybe this could happen if we search the interval 0...1 and the extremum is outside that interval. Anyway, making sure an extremum exists would not hurt I guess.

Regarding the second question: More useful would be Curve.getInflectionPoints(), but that would require more calculation. And I am not sure if anyone but us is really looking for velocity extrema. Maybe as a non-public function?

lehni commented 9 years ago

I guess we can leave it this way for now and then see what's required in both places once we start adding curve offsetting / stroke expansion functionality, right?

iconexperience commented 9 years ago

Right!

deanm commented 8 years ago

A note after having just came across this discussion (I'm not a paper.js user). There is an established projective geometry method of cubic Bézier classification (serpentine, cusp, loop... quadratic, line, point), covered in detail in Jim Blinn's Notation, Notation, Notation:

http://www.amazon.com/gp/product/1558608605

It is part of the pre-processing in the Loop Blinn GPU rasterization algorithm of cubic Bézier curves.

http://http.developer.nvidia.com/GPUGems3/gpugems3_ch25.html http://research.microsoft.com/en-us/um/people/cloop/LoopBlinn05.pdf

Be careful there are some errors in the GPU Gems article:

http://research.microsoft.com/en-us/um/people/cloop/ErrataGPUGems3.pdf

As covered in the above you can directly solve for the two t values at the point of self intersection by finding the two roots of a quadratic equation. This approach is more direct and avoids having to solve a cubic equation as is done in the current code.

A good reference is Kenneth Russell's webkit implementation of Loop Blinn:

https://bugs.webkit.org/show_bug.cgi?id=44729 https://bugs.webkit.org/attachment.cgi?id=65655&action=diff

Another is Skia's GPU backend:

https://skia.googlesource.com/skia.git/+/a1f4fa72303492e3ce496d9eae529dce6d592026/src/gpu/GrPathUtils.cpp#717 https://skia.googlesource.com/skia.git/+/a1f4fa72303492e3ce496d9eae529dce6d592026/src/core/SkGeometry.cpp#571

lehni commented 8 years ago

@deanm thanks a lot for writing this! @iconexperience would you like to have a look at this, and see how we can benefit from it?

lehni commented 7 years ago

@iconexperience check out the new code based on Loop and Blinn here, which largely simplified the handling of self-intersection: e7b53c8a227418c44c1afea3c6f5f7e1493b4ff2

It doesn't break any of my tests, but perhaps you should test against your code as well.

iconexperience commented 7 years ago

@lehni I have been following the discussion but had no time to contribute. Also, I was wandering what the whole point was until I saw this eye opener. I will test as soon as I can. Anyway, amazing stuff!

paperjs / paper.js