mathandy / svgpathtools

A collection of tools for manipulating and analyzing SVG Path objects and Bezier curves.
MIT License

Faster Approximate Intersections Code #192

Open tatarize opened 1 year ago

tatarize commented 1 year ago

I needed the intersections code from this project; beyond being a bit slow, it is also somewhat inaccurate and at times fails to deal with things like arcs. @Esser50K has a PR trying to speed this up with an acceleration structure, and I have another that avoids the initial call altogether if the segments' quick bounds don't overlap. The code below might be used instead of the current code. See #188 as an attempt to help with this, which I assume is now likely moot.

import numpy as np

def find_intersections(
    segment1,
    segment2,
    samples=50,
    ta=(0.0, 1.0, None),
    tb=(0.0, 1.0, None),
    depth=0,
    enhancements=2,
    enhance_samples=50,
):
    """
    Calculate intersections by linearized polyline intersections with enhancements.
    We calculate probable intersections by linearizing our segment into `sample` polylines
    we then find those intersecting segments and the range of t where those intersections
    could have occurred and then subdivide those segments in a series of enhancements to
    find their intersections with increased precision.

    This code is fast, but it could fail by both finding a rare phantom intersection (if there
    is a low or no enhancements) or by failing to find a real intersection. Because the polylines
    approximation did not intersect in the base case.

    At a resolution of about 1e-15 the intersection calculations become unstable and intersection
    candidates can duplicate or become lost. We terminate at that point and give the last best
    guess.

    :param segment1:
    :param segment2:
    :param samples:
    :param ta:
    :param tb:
    :param depth:
    :param enhancements:
    :param enhance_samples:
    :return:
    """
    if depth == 0:
        # Quick Fail. There are no intersections without overlapping quick bounds
        try:
            s1x = [e.real for e in segment1.bpoints()]
            s2x = [e.real for e in segment2.bpoints()]
            if min(s1x) > max(s2x) or max(s1x) < min(s2x):
                return
            s1y = [e.imag for e in segment1.bpoints()]
            s2y = [e.imag for e in segment2.bpoints()]
            if min(s1y) > max(s2y) or max(s1y) < min(s2y):
                return
        except AttributeError:
            pass
    assert samples >= 2
    a = np.linspace(ta[0], ta[1], num=samples)
    b = np.linspace(tb[0], tb[1], num=samples)
    step_a = a[1] - a[0]
    step_b = b[1] - b[0]
    j = segment1.points(a)
    k = segment2.points(b)

    ax1, bx1 = np.meshgrid(np.real(j[:-1]), np.real(k[:-1]))
    ax2, bx2 = np.meshgrid(np.real(j[1:]), np.real(k[1:]))
    ay1, by1 = np.meshgrid(np.imag(j[:-1]), np.imag(k[:-1]))
    ay2, by2 = np.meshgrid(np.imag(j[1:]), np.imag(k[1:]))

    denom = (by2 - by1) * (ax2 - ax1) - (bx2 - bx1) * (ay2 - ay1)
    qa = (bx2 - bx1) * (ay1 - by1) - (by2 - by1) * (ax1 - bx1)
    qb = (ax2 - ax1) * (ay1 - by1) - (ay2 - ay1) * (ax1 - bx1)
    hits = np.dstack(
        (
            denom != 0,  # Cannot be parallel.
            np.sign(denom) == np.sign(qa),  # D and Qa must have same sign.
            np.sign(denom) == np.sign(qb),  # D and Qb must have same sign.
            abs(denom) >= abs(qa),  # D >= Qa (else not between 0 - 1)
            abs(denom) >= abs(qb),  # D >= Qb (else not between 0 - 1)
        )
    ).all(axis=2)

    where_hit = np.argwhere(hits)
    if len(where_hit) != 1 and step_a < 1e-10:
        # Our hits are becoming unstable; give the last best value.
        if ta[2] is not None and tb[2] is not None:
            yield ta[2], tb[2]
        return

    # Calculate the t values for the intersections
    ta_hit = qa[hits] / denom[hits]
    tb_hit = qb[hits] / denom[hits]

    for i, hit in enumerate(where_hit):

        at = ta[0] + float(hit[1]) * step_a  # Zoomed min+segment intersected.
        bt = tb[0] + float(hit[0]) * step_b
        a_fractional = ta_hit[i] * step_a  # Fractional guess within intersected segment
        b_fractional = tb_hit[i] * step_b
        if depth == enhancements:
            # We've enhanced as best as we can, yield the current + segment t-value to our answer
            yield at + a_fractional, bt + b_fractional
        else:
            yield from find_intersections(
                segment1,
                segment2,
                ta=(at, at + step_a, at + a_fractional),
                tb=(bt, bt + step_b, bt + b_fractional),
                samples=enhance_samples,
                depth=depth+1,
                enhancements=enhancements,
                enhance_samples=enhance_samples,
            )

Example code to test against:

import timeit
from random import random
import svgpathtools as pt

total_new = 0
total_original = 0
total_new_error = 0
total_new_error_events = 0
total_original_error = 0
total_original_error_events = 0

for i in range(20):
    q1 = pt.QuadraticBezier(
        complex(random() * 5, random() * 5),
        complex(random() * 5, random() * 5),
        complex(random() * 5, random() * 5),
    )

    q2 = pt.QuadraticBezier(
        complex(random() * 5, random() * 5),
        complex(random() * 5, random() * 5),
        complex(random() * 5, random() * 5),
    )
    original = lambda s1, s2: list(s1.intersect(s2))
    new = lambda s1, s2: list(find_intersections(s1, s2))

    intersects_new = new(q1, q2)

    for inters in intersects_new:
        a1 = q1.points(inters[0])
        a2 = q2.points(inters[1])
        total_new_error += abs(a1 - a2)
        total_new_error_events += 1

    intersect_original = original(q1, q2)
    for ic in intersect_original:
        b1 = q1.points(ic[0])
        b2 = q2.points(ic[1])
        total_original_error += abs(b1 - b2)
        total_original_error_events += 1

    if len(intersect_original) != len(intersects_new):
        print(f"Curves: {q1} and {q2}")
        print(f"Intersection mismatch: {len(intersect_original)} vs. {len(intersects_new)}")
        print(f"original gave: {intersect_original}\nnew gave:{intersects_new}")

    time_new = timeit.timeit(lambda: new(q1, q2), number=10)
    time_original = timeit.timeit(lambda: original(q1, q2), number=10)
    print(f"{time_original:.5f}(original) vs {time_new:.5f}(new). Improvement: {time_original - time_new: .5f} ({100 * ((time_original / time_new) - 1):.2f}% speedup)\n")
    total_new += time_new
    total_original += time_original
print(f"{total_new}(new) vs {total_original}(original) improvement over best: {total_original - total_new} \n({100 * ((total_original / total_new) - 1):.2f}% speedup)\n")
print(f"{total_new_error / total_new_error_events}(new) error vs {total_original_error / total_original_error_events}(original) for {total_new_error_events}")

It's random so you'll get some variance but the output here:

1.68379(original) vs 0.03082(new). Improvement:  1.65297 (5364.08% speedup)

1.18223(original) vs 0.03304(new). Improvement:  1.14919 (3477.86% speedup)

1.66896(original) vs 0.02952(new). Improvement:  1.63944 (5554.32% speedup)

0.01047(original) vs 0.00840(new). Improvement:  0.00207 (24.60% speedup)

0.01330(original) vs 0.01086(new). Improvement:  0.00244 (22.44% speedup)

0.01234(original) vs 0.00014(new). Improvement:  0.01220 (8689.56% speedup)

1.67768(original) vs 0.02988(new). Improvement:  1.64780 (5513.92% speedup)

0.05884(original) vs 0.01006(new). Improvement:  0.04879 (485.18% speedup)

Curves: QuadraticBezier(start=(3.5484553804985293+2.5046182922009166j), control=(1.3325492938804107+4.344642999744619j), end=(0.35391834838510383+2.292041420542328j)) and QuadraticBezier(start=(1.9392172436798112+1.6860910957524489j), control=(3.2983471786893386+1.3762871504661716j), end=(2.2024290337711045+3.6737856807400755j))
Intersection mismatch: 2 vs. 1
original gave: [(0.2809981107711792, 0.897783637046814), (0.2809983491897583, 0.8977838754653931)]
new gave:[(0.2809981313489884, 0.8977837239721281)]
2.10376(original) vs 0.02854(new). Improvement:  2.07522 (7270.96% speedup)

0.21166(original) vs 0.00849(new). Improvement:  0.20318 (2394.23% speedup)

1.28261(original) vs 0.02484(new). Improvement:  1.25777 (5062.58% speedup)

0.49682(original) vs 0.00956(new). Improvement:  0.48727 (5098.21% speedup)

0.01001(original) vs 0.00900(new). Improvement:  0.00102 (11.32% speedup)

0.13463(original) vs 0.00774(new). Improvement:  0.12688 (1638.67% speedup)

0.01282(original) vs 0.00888(new). Improvement:  0.00394 (44.39% speedup)

0.01270(original) vs 0.01108(new). Improvement:  0.00162 (14.59% speedup)

1.94574(original) vs 0.03266(new). Improvement:  1.91308 (5857.29% speedup)

1.86026(original) vs 0.03188(new). Improvement:  1.82838 (5734.93% speedup)

0.05301(original) vs 0.00831(new). Improvement:  0.04470 (537.68% speedup)

4.83928(original) vs 0.04242(new). Improvement:  4.79686 (11307.13% speedup)

0.376131321000007(new) vs 19.270938381999994(original) improvement over best: 18.894807060999987 
(5023.46% speedup)

8.93393079577719e-11(new) error vs 4.5171079534892864e-07(original) for 10

So, on average, the error for the new routine is ~1e-11 (with default settings; you can get down to 1e-15), which is much less than the ~1e-7 of the current routine (for quad-quad intersections in my test).

The algorithm itself is fairly simple. I cut both segments into 50-piece polylines. I make a 50x50 mesh of all the pieces from both segments and calculate whether any pair intersects. Then I take that 1/50th of each segment and recursively do this again within it. We perform a set number of these enhancements on the found point; the final routine call adds the tiny fractional amount within the last field of view. We don't create new shapes, we just create lines and make them smaller and smaller.
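For reference, the vectorized denom/qa/qb test in the code above is the standard parametric line-line intersection. Here is a scalar sketch of the same test for a single pair of polyline pieces (`segments_intersect` is a hypothetical helper name; points are complex numbers, as in svgpathtools):

```python
import numpy as np

def segments_intersect(a1, a2, b1, b2):
    """Return (ta, tb) if segment a1->a2 crosses segment b1->b2, else None.
    Scalar version of the sign/magnitude test used in the meshgrid code."""
    ax1, ay1, ax2, ay2 = a1.real, a1.imag, a2.real, a2.imag
    bx1, by1, bx2, by2 = b1.real, b1.imag, b2.real, b2.imag
    denom = (by2 - by1) * (ax2 - ax1) - (bx2 - bx1) * (ay2 - ay1)
    qa = (bx2 - bx1) * (ay1 - by1) - (by2 - by1) * (ax1 - bx1)
    qb = (ax2 - ax1) * (ay1 - by1) - (ay2 - ay1) * (ax1 - bx1)
    if denom == 0:  # parallel (or degenerate) pieces cannot cross
        return None
    if np.sign(denom) != np.sign(qa) or np.sign(denom) != np.sign(qb):
        return None  # a t-value would be negative
    if abs(denom) < abs(qa) or abs(denom) < abs(qb):
        return None  # a t-value would exceed 1
    return qa / denom, qb / denom
```

The meshgrid code evaluates exactly these quantities for all 49x49 piece pairs at once instead of looping.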

The current routine still finds some extra phantom points; this one shouldn't find any phantom points unless enhancements is low and the linearization produces a false positive. It can also produce false negatives, if the polyline approximation doesn't have an intersection.

In the standard case this can find all the intersections between, say, an arc and an nth-order Bezier curve. We're just linearizing and zooming in repeatedly, which should work for any segment types.

The test had it about 50x faster overall.

tatarize commented 1 year ago

Not only is it faster and more accurate (in most cases), but it also handles arc intersections. If you merely remove the block against arc-arc checks in test_intersect it will fail, but not because it's wrong: the rotated arc actually does intersect in two positions.

[Attached image: svgviewer-output, showing the rotated arcs intersecting at two points]

tatarize commented 8 months ago

Basically, it divides the initial curve into about 50 segments, and if any of those segments intersect it zooms in on that segment and does 50 sub-segments within that 1/50th of a segment; with numpy acceleration this goes lightning fast. The typical error in a typical curve is about 1e-5 or so, and we deal with t values near 1e-3. But if the intersection occurs very near one of the 50 segment breaks while actually occurring in a different segment, it can false-negative a clear intersection.

So it hits on segment 30 at t=0.999995, but the error is such that it's actually on segment 31 at t=0.000004. The slop in the error can make it search segment 30, fail to find the intersection, and default out declaring the first hit a false positive, which actually results in a false negative.

The scope recursion:

        for i, hit in enumerate(where_hit):
            # Zoomed min+segment intersected.
            # Fractional guess within intersected segment
            at_guess = ta[0] + (hit[1] + ta_hit[i]) * step_a
            bt_guess = tb[0] + (hit[0] + tb_hit[i]) * step_b

            if depth == enhancements:
                # We've enhanced as best as we can, yield the current + segment t-value to our answer
                yield at_guess, bt_guess
            else:
                yield from self._find_intersections_intercept(
                    segment1,
                    segment2,
                    fun1,
                    fun2,
                    ta=(at_guess - step_a/2, at_guess + step_a/2, at_guess),
                    tb=(bt_guess - step_b/2, bt_guess + step_b/2, bt_guess),
                    samples=enhance_samples,
                    depth=depth + 1,
                    enhancements=enhancements,
                    enhance_samples=enhance_samples,
                )

This fixes the issue, generally, by centering the refinement window on the full fractional hint, ±step/2.
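One caveat with the centered window (my note, not part of the patch above): near the ends of a curve, guess ±step/2 can extend outside [0, 1], so it may be worth shifting the window back in range before recursing. A minimal sketch, with a hypothetical `clamped_window` helper:

```python
def clamped_window(guess, half_width, lo=0.0, hi=1.0):
    """Build a (t_min, t_max, guess) refinement window centered on `guess`,
    shifted so it stays inside [lo, hi] while keeping its width.
    Hypothetical helper, not part of the original patch."""
    a = guess - half_width
    b = guess + half_width
    if a < lo:  # shift window right to stay in range
        b += lo - a
        a = lo
    if b > hi:  # shift window left to stay in range
        a -= b - hi
        b = hi
    return max(a, lo), min(b, hi), guess
```

The returned tuple has the same (min, max, last-best-guess) shape as the `ta`/`tb` parameters of the recursive call.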

andreacar commented 7 months ago

@tatarize, fantastic contribution to enhancing intersection speed! The current implementation is effective for the QuadraticBezier and Line classes. However, a limitation arises as the .points() method is absent for the Path class. Appreciate your efforts!

def find_intersections(
    segment1,
    segment2,
    samples=50,
    ta=(0.0, 1.0, None),
    tb=(0.0, 1.0, None),
    depth=0,
    enhancements=2,
    enhance_samples=50,
):
    if depth == 0:
        # Quick Fail. There are no intersections without overlapping quick bounds
        try:
            s1x = [e.real for e in segment1.bpoints()]
            s2x = [e.real for e in segment2.bpoints()]
            if min(s1x) > max(s2x) or max(s1x) < min(s2x):
                return
            s1y = [e.imag for e in segment1.bpoints()]
            s2y = [e.imag for e in segment2.bpoints()]
            if min(s1y) > max(s2y) or max(s1y) < min(s2y):
                return
        except AttributeError:
            pass
    assert samples >= 2
    a = np.linspace(ta[0], ta[1], num=samples)
    b = np.linspace(tb[0], tb[1], num=samples)
    step_a = a[1] - a[0]
    step_b = b[1] - b[0]
    # changes from the original code
    j = [segment1.point(t) for t in a]
    k = [segment2.point(t) for t in b]
    ax1, bx1 = np.meshgrid(np.real(j[:-1]), np.real(k[:-1]))
    ax2, bx2 = np.meshgrid(np.real(j[1:]), np.real(k[1:]))
    ay1, by1 = np.meshgrid(np.imag(j[:-1]), np.imag(k[:-1]))
    ay2, by2 = np.meshgrid(np.imag(j[1:]), np.imag(k[1:]))
    denom = (by2 - by1) * (ax2 - ax1) - (bx2 - bx1) * (ay2 - ay1)
    qa = (bx2 - bx1) * (ay1 - by1) - (by2 - by1) * (ax1 - bx1)
    qb = (ax2 - ax1) * (ay1 - by1) - (ay2 - ay1) * (ax1 - bx1)
    hits = np.dstack(
        (
            denom != 0,  # Cannot be parallel.
            np.sign(denom) == np.sign(qa),  # D and Qa must have same sign.
            np.sign(denom) == np.sign(qb),  # D and Qb must have same sign.
            abs(denom) >= abs(qa),  # D >= Qa (else not between 0 - 1)
            abs(denom) >= abs(qb),  # D >= Qb (else not between 0 - 1)
        )
    ).all(axis=2)

    where_hit = np.argwhere(hits)
    if len(where_hit) != 1 and step_a < 1e-10:
        # Our hits are becoming unstable; give the last best value.
        if ta[2] is not None and tb[2] is not None:
            yield ta[2], tb[2]
        return

    # Calculate the t values for the intersections
    ta_hit = qa[hits] / denom[hits]
    tb_hit = qb[hits] / denom[hits]

    for i, hit in enumerate(where_hit):

        at = ta[0] + float(hit[1]) * step_a  # Zoomed min+segment intersected.
        bt = tb[0] + float(hit[0]) * step_b
        a_fractional = ta_hit[i] * step_a  # Fractional guess within intersected segment
        b_fractional = tb_hit[i] * step_b
        if depth == enhancements:
            # We've enhanced as best as we can, yield the current + segment t-value to our answer
            yield at + a_fractional, bt + b_fractional
        else:
            yield from find_intersections(
                segment1,
                segment2,
                ta=(at, at + step_a, at + a_fractional),
                tb=(bt, bt + step_b, bt + b_fractional),
                samples=enhance_samples,
                depth=depth+1,
                enhancements=enhancements,
                enhance_samples=enhance_samples,
            )
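A compatibility shim could make the same routine work with either API: use a vectorized `points()` where a fork provides one, and fall back to per-t `point()` calls (standard svgpathtools segments and Path) otherwise. A sketch, with a hypothetical `sample_points` helper:

```python
import numpy as np

def sample_points(seg, ts):
    """Evaluate `seg` at each parameter in `ts` as a complex numpy array.
    Prefers a vectorized `points()` method when the object has one,
    otherwise calls `point()` once per t.  Hypothetical shim."""
    if hasattr(seg, "points"):
        return np.asarray(seg.points(ts))
    return np.array([seg.point(t) for t in ts])
```

Then `j = sample_points(segment1, a)` and `k = sample_points(segment2, b)` would replace the list comprehensions above without losing the fast path.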
tatarize commented 7 months ago

The code really only handles individual intersections between segments, so a direct generalization to a whole Path might not work: rather than at most order 2, the entire path could be order n, and it can absolutely self-intersect any number of times. You'd still be best off either brute-forcing all the segment-pair intersections or using a more elaborate data structure, like Esser50K was attempting, to bring it below O(N²). You'd probably just use this to check each segment of one path against each segment of the other; the fast fail and fast operation would cover most cases.
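That brute-force pairwise strategy can be sketched as follows (`path_intersections` is a hypothetical helper; `seg_intersect` would be a generator like `find_intersections` above):

```python
def path_intersections(path1, path2, seg_intersect):
    """Run the segment-level routine on every segment pair of two paths.
    O(n*m) pairs, but a quick bounding-box fail inside `seg_intersect`
    rejects most pairs cheaply.  Returns (i, t1, j, t2) tuples, where
    i and j are segment indices within their paths."""
    results = []
    for i, seg1 in enumerate(path1):
        for j, seg2 in enumerate(path2):
            for t1, t2 in seg_intersect(seg1, seg2):
                results.append((i, t1, j, t2))
    return results
```

Self-intersection checks would be the same loop over a single path against itself, skipping the pair (i, i) and adjacent segments that merely share endpoints.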

I actually use this code and similar bits of code elsewhere, and decisions about which segments to check, rather than checking the path as a whole, are a lot easier to make per use case. Sometimes you care about self-intersections, sometimes you don't. And sometimes you're dealing only with lines, for which there are much easier methods.