@lehni I tried to understand what is going on in the third case and noticed that you seem to invoke the addOverlap()
function even if both curves are one and the same, e.g. if we are checking for self intersections on a curve. Is this correct?
Hmm, this is getting interesting.
Do you know that the way self-intersections on curves are found in the code is not reliable? Currently the curve is subdivided at t=0.5 and then the two parts are tested for intersections. It is assumed that for curves with loops, t=0.5 is inside the loop. But there are actually cases where this is not true. In those cases the self intersection lies entirely within one of the two parts and may not be found. These cases are very rare and I am not sure if this causes any problems, but this is something I thought I should share.
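For illustration, a rough sketch of that approach using the public Curve API (not the actual internal code; the use of getPart(from, to) and getIntersections(curve) as instance methods is an assumption here):

```js
// Sketch only: split the curve at t = 0.5 and intersect the two halves.
// The assumption is that for a curve with a loop, t = 0.5 lies inside the
// loop, so the halves cross somewhere besides their shared end point.
var left = curve.getPart(0, 0.5),
    right = curve.getPart(0.5, 1),
    locations = left.getIntersections(right);
// The halves always meet where `left` ends and `right` starts, so a real
// self intersection shows up as an additional location in `locations`.
```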
Also, I think you should consider testing for overlapping bounds at the beginning of the Curve.getIntersections()
function, because in a complex path the bounds of two curves will not touch in most cases and doing the addOverlap()
does not look cheap. Here is how it can be done:
if (!(Math.max(v1[0], v1[2], v1[4], v1[6]) >= Math.min(v2[0], v2[2], v2[4], v2[6]) &&
      Math.max(v1[1], v1[3], v1[5], v1[7]) >= Math.min(v2[1], v2[3], v2[5], v2[7]) &&
      Math.min(v1[0], v1[2], v1[4], v1[6]) <= Math.max(v2[0], v2[2], v2[4], v2[6]) &&
      Math.min(v1[1], v1[3], v1[5], v1[7]) <= Math.max(v2[1], v2[3], v2[5], v2[7]))) {
    return locations;
}
I think this should improve performance quite a bit.
I have boiled down the third case to a simple path that consists of only one curve:
var p1 = new Path({segments:[[422.30476273056735, 487.15231707514175, 0, 0, -3.606466065425707, 25.937938843432278], [412.7282016503929, 488.89813033672453, 12.1969877642685, 9.81953660801696, 0, 0], [375.10194308268757, 535.6342477000248, 0, 0, -21.89319225994592, -17.625745554380842], [362.8764678534364, 478.8892808295014, -0.1046615146468639, 0.7527324302860166, 0, 0]], closed:true});
Currently the code finds a self intersection in this path.
And here is the solution for the third case:
If a path contains a curve of which the handle vectors (or negative handle vectors) cross, the curve may have a self intersection. Our algorithm subdivides such curves at t = 0.5 (which is not entirely correct, see my comment above) and the two curve parts are checked for intersections with each other using curve.getIntersections(). This always returns one intersection, because the curve parts touch at the end point of the first part, which is identical to the start point of the second part. In cases where there really is a self intersection, a second intersection is found (which is correct). But the first intersection even gets modified (t2 is set to 0.5), which makes things even weirder.
To solve this we should add some functionality to PathItem._getIntersections() at line 111. There, locations with t1 = 1 and t2 = 0 should be rejected.
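A hedged sketch of what such a filter might look like (names and tolerance are illustrative, this is not the actual code in PathItem._getIntersections()):

```js
// Skip the location at the shared point of the two curve halves, where
// t1 == 1 on the first half coincides with t2 == 0 on the second half.
function isSharedEndPoint(t1, t2, tolerance) {
    return t1 > 1 - tolerance && t2 < tolerance;
}
// ...and wherever the locations are collected:
// if (!isSharedEndPoint(t1, t2, /* tolerance */ 1e-6))
//     addLocation(locations, include, c1, t1, p1, c2, t2, p2);
```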
@lehni: Do you know if this is a new bug, or has this always happened?
Update: The solution described above is only a workaround. The real solution can be found below.
@iconexperience this is indeed a bug. I don't have time right now to look into it all, but will hopefully find some in the coming days. My gut tells me it's caused by these checks that were added later, actually based on issues you submitted a while back :)
// Handle a special case where if both curves start or end at the
// same point, the same end-point case will be handled after we
// calculate other intersections within the curve.
if (c1p1.isClose(c2p1, tolerance))
    addLocation(locations, include, c1, 0, c1p1, c2, 0, c1p1);
if (c1p1.isClose(c2p2, tolerance))
    addLocation(locations, include, c1, 0, c1p1, c2, 1, c1p1);
Thanks for the thorough investigation!
In fact, this brings back some questions that @hkrish mentioned in the past, relating to the closed #565 and #570:
getIntersections()
will need to get smarter about what it returns. In boolean operations, we only really want crossings...
@lehni Or we could distinguish touching and crossing locations.
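One possible (simplified) way to tell the two apart for intersections that lie strictly inside both curves, sketched here with illustrative names and without handling end-point cases:

```js
// Sketch only: an interior intersection is a crossing when the two tangents
// are not parallel; if they are (nearly) parallel, the curves merely touch.
function isCrossing(curve1, t1, curve2, t2, tolerance) {
    var tan1 = curve1.getTangentAt(t1, true),  // true: t1 is a curve parameter
        tan2 = curve2.getTangentAt(t2, true);
    return Math.abs(tan1.cross(tan2)) > tolerance;
}
```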
I think the bug is really in addCurveIntersections()
, but certainly t1 and t2 get exchanged somewhere. Will keep investigating.
@lehni You were right, the problem is caused by Curve.getIntersections(). We are simply using the end points of parameters c1 and c2 to check whether the curves touch. But when searching for self intersections this will not work, because the values v1 and v2 describe only half of the curve each. Therefore, for checking if the curves touch we must use v1[0],v1[1], v2[0],v2[1], v1[6],v1[7] and v2[6],v2[7]. This will solve the issue.
It seems like the function Point.isClose()
was only introduced for this functionality. Not sure if it should be kept or be replaced by something that takes x1, y1, x2, y2
as parameters.
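A sketch of what the check could look like based on the value arrays, with a hypothetical coordinate-based helper instead of Point.isClose() (v[0], v[1] is the start point and v[6], v[7] the end point of a values array):

```js
// Hypothetical helper: are two points, given by their coordinates, within
// `tolerance` of each other?
function isCloseXY(x1, y1, x2, y2, tolerance) {
    var dx = x1 - x2, dy = y1 - y2;
    return dx * dx + dy * dy <= tolerance * tolerance;
}
// Example: do v1 and v2 start at the same point?
if (isCloseXY(v1[0], v1[1], v2[0], v2[1], tolerance)) {
    var p = new Point(v1[0], v1[1]);
    addLocation(locations, include, c1, 0, p, c2, 0, p);
}
// ...and analogously for the other three start/end point combinations.
```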
@iconexperience I have a rough idea what needs to be done here, but need to wait some more days to give it full focus. We'll solve this one soon, promised!
@iconexperience regarding splitting at t = 0.5 not being the right thing to do: @kuribas wrote this comment about it: https://github.com/paperjs/paper.js/issues/371#issuecomment-136643109 We should try that approach as well.
@lehni The solution is really simple, the main issue is deciding what to do with the API.
Regarding splitting at t=0.5 for finding self intersections: In my offset example you can see that splitting at the "peak" will always work for finding self intersections. The problem is that I do not know why :smile:
I calculate these "peaks" by finding the roots of the dot product of the first and second derivative, which means at these points the vectors of the first and second derivative are perpendicular. I never found anything in literature about these points, but I am sure I am not the first to use them. Maybe someone with deeper knowledge of bezier curves like @Pomax or @kuribas can shed some light on this mystery and explain why a "peak" is always within the loop of a self intersecting curve.
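For illustration, a rough numeric sketch of how such a "peak" could be found for a cubic Bezier given as a values array [p1x, p1y, c1x, c1y, c2x, c2y, p2x, p2y] (plain JavaScript; the constant factors of the derivatives are dropped since they do not change the roots, and simple sampling plus bisection is used instead of solving the cubic in closed form):

```js
function findPeaks(v) {
    // Dot product of first and second derivative at parameter t, with the
    // constant factors 3 and 6 dropped (they do not affect the roots).
    function dot(t) {
        var u = 1 - t,
            dx = (v[2] - v[0]) * u * u + 2 * (v[4] - v[2]) * u * t + (v[6] - v[4]) * t * t,
            dy = (v[3] - v[1]) * u * u + 2 * (v[5] - v[3]) * u * t + (v[7] - v[5]) * t * t,
            ddx = (v[4] - 2 * v[2] + v[0]) * u + (v[6] - 2 * v[4] + v[2]) * t,
            ddy = (v[5] - 2 * v[3] + v[1]) * u + (v[7] - 2 * v[5] + v[3]) * t;
        return dx * ddx + dy * ddy;
    }
    var peaks = [],
        steps = 100;
    for (var i = 0; i < steps; i++) {
        var a = i / steps,
            b = (i + 1) / steps;
        // Look for a sign change and narrow it down by bisection.
        if (dot(a) === 0 || dot(a) * dot(b) < 0) {
            for (var j = 0; j < 40; j++) {
                var m = (a + b) / 2;
                if (dot(a) * dot(m) <= 0) {
                    b = m;
                } else {
                    a = m;
                }
            }
            peaks.push((a + b) / 2);
        }
    }
    return peaks;
}
```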
But I do not think this is an important issue, because the cases where using t=0.5 fails are very rare. I intend to open a new issue for this with an example where t=0.5 fails, only to have it documented. We should not spend too much time on this at the moment.
I started working on this now. The speed increase thanks to the control bounds check is massive!
@lehni I think (not sure) that you could even use 0.8 (or even 0.78) times the handle's length for calculating the control bounds and the curve would still be guaranteed to lie completely within these bounds. This would give an extra performance improvement, but only a small amount. In any case we should keep this idea in mind, because it would be so easy to implement.
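Just to illustrate what scaling the handles would mean (whether a given factor is actually safe is exactly the open question here), a sketch of such a reduced rejection box:

```js
// Sketch only: a rejection box built from the end points and the handle
// points pulled in towards their anchors by a factor f < 1. The hope is that
// for a suitable f the curve still lies completely inside this smaller box.
function scaledControlBounds(v, f) {
    var xs = [v[0], v[6],
              v[0] + (v[2] - v[0]) * f,   // handle 1, scaled towards p1
              v[6] + (v[4] - v[6]) * f],  // handle 2, scaled towards p2
        ys = [v[1], v[7],
              v[1] + (v[3] - v[1]) * f,
              v[7] + (v[5] - v[7]) * f];
    return {
        left: Math.min.apply(Math, xs),
        right: Math.max.apply(Math, xs),
        top: Math.min.apply(Math, ys),
        bottom: Math.max.apply(Math, ys)
    };
}
```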
@iconexperience If you look at the hodograph (first derivative curve), you can see that it has minima and maxima where it is perpendicular to its derivative (the second derivative). So what you are doing is basically finding the maxima and minima of the first derivative's velocity.
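(Put differently: d/dt |B'(t)|² = 2 · B'(t)·B''(t), so the roots of B'·B'' are exactly the parameters where the speed |B'(t)| has an extremum.)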
@kuribas thanks for the insight! I really appreciate you helping us out with these things.
@iconexperience I've found a fix for this issue, and will shortly close it. Please create a new issue regarding the wrong t = 0.5 splitting, with the mentioned path data.
@iconexperience would love to hear how your tests are working now, as I think that, except for the t=0.5 splitting, all known issues are fixed now!
@lehni I have created a new issue for the self intersection assumptions now. I am planning to add more information and suggested solutions soon.
Regarding the other issues: I am still getting the "results in open path" message in some cases. I will isolate one case and open a new issue for that.
Sweet, thanks! Are there many of those left?
@iconexperience above you say that we can scale the handles used to create the bounds by 0.8 or even 0.78. I was curious about this and wrote a little script that runs random curves and determines the largest factor observed, until it is stopped, with a nice surprise: the factor appears to stabilize just below 0.75 (0.74952227304554 the last time I ran it). Let's implement this! I shall also use it for roughBounds...
function randomCurve() {
    // Generate a random cubic curve and measure how far its true bounds
    // extend beyond the box spanned by the end points, relative to how far
    // the handle (control) bounds extend beyond that same box.
    var size = 100000, // view.size,
        p1 = Point.random() * size,
        c1 = Point.random() * size,
        c2 = Point.random() * size,
        p2 = Point.random() * size;
    var curve = new Curve(p1, c1 - p1, c2 - p2, p2),
        bounds = curve.bounds,
        controlBounds = curve.handleBounds;
    var tl = Point.min(p1, p2),
        br = Point.max(p1, p2);
    var btl = bounds.topLeft - tl,
        bbr = bounds.bottomRight - br,
        ctl = controlBounds.topLeft - tl,
        cbr = controlBounds.bottomRight - br,
        ftl = btl / ctl,
        fbr = bbr / cbr;
    return Math.max(ftl.x || 0, ftl.y || 0, fbr.x || 0, fbr.y || 0);
}

var factor = 0;
var text = new PointText(view.center);
view.onFrame = function() {
    // Keep testing random curves and track the largest factor seen so far.
    for (var i = 0; i < 1000; i++)
        factor = Math.max(randomCurve(), factor);
    text.content = factor;
}
@lehni Now this is strange. The initial curve that I am using for my self intersection algo has a factor of exactly 0.75. AFAIR I selected the points by hand, just to have something that looks alright. And this curve just happens to be the curve with the highest factor of all?
var path1 = new Path({segments:[[200, 300, 0, 0, 300, -100], [400, 300, -300, -100, 0, 0]]});
Yeah that is strange indeed! Now the question is: can you find one where it's more? But I couldn't actually measure any speed increase in my tests so it might not matter much... What I mean is, we might just stay on the safe side and not scale at all?
Actually I did not expect much of an increase, because for many curves it makes no difference at all. I will investigate a little more and see if I can find something else.
If you want to see more performance improvements, you may want to handle cases where two curves only touch at an end point separately. If you have a complex path, many of the cases where curve handle bounds touch are actually two adjacent curves of a path. And if the curves are simple and only touch at the end point, it should be possible to handle these cases much faster.
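A rough illustration of this kind of shortcut (hypothetical, not part of the library; the overlap test is deliberately crude and conservative):

```js
// Sketch only: for two adjacent curves of the same path, detect the case
// where their handle bounds overlap only in a tiny region around the shared
// end point. In that case the single location at t1 = 1 / t2 = 0 could be
// reported directly, skipping the expensive recursive intersection search.
function isEndPointTouchOnly(c1, c2, tolerance) {
    if (c1.next !== c2)
        return false;
    var overlap = c1.handleBounds.intersect(c2.handleBounds);
    return overlap.width <= tolerance && overlap.height <= tolerance;
}
```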
But do we have a performance problem at all?
The addition of resolving self-intersections added quite a penalty to some examples... The biggest slow-down happened in the CompoundPaths 3 example in the examples/Scripts/BooleanOperations.html file. When you see the shape you understand why. I used this one to check the performance impact of different approaches. Look at the console and you'll see the times logged.
It would be great to shave some time off that again! Although we're probably still faster now than before we had any bounding box checks on the curves.
In the boolean-fix branch the function path.unite() fails in some cases. These two paths have tiny loops that are known to cause problems. In both cases an error message "Boolean operation results in open path..." is thrown.
In the third case a self intersection is found at a point where there is definitely no self intersection. When calling unite(), a self intersection is actually created. An error message is not thrown. The third case should be a problem with the getIntersections() function, not really with the boolean operations.