ideasman42 / isect_segments-bentley_ottmann

Bentley-Ottmann sweep-line implementation (for finding all intersections in a set of line segments)
MIT License

performance is worse than naive n**2? #17

Closed: smichr closed this issue 6 years ago

smichr commented 6 years ago

I'm using the following settings and getting the following results when passing segments whose points are tuples of Python ints in the range -100 to 100; the segments are all unique, with none that are zero-length, vertical, or horizontal:

USE_IGNORE_SEGMENT_ENDINGS = True

USE_DEBUG = False

USE_VERBOSE = False

# checks we should NOT need,
# but do them in case we find a test-case that fails.
USE_PARANOID = False

# Support vertical segments,
# (the bentley-ottmann method doesn't support this).
# We use the term 'START_VERTICAL' for a vertical segment,
# to differentiate it from START/END/INTERSECTION
USE_VERTICAL = False
# end defines!
# ------------

# ---------
# Constants
X, Y = 0, 1

# -----------------------------------------------------------------------------
# Switchable Number Implementation

NUMBER_TYPE = 'native'

When doubling the number of segments, the times should at worst grow as n**2 (i.e. by about a factor of 4 per doubling):

>>> from poly_point_isect import *
>>> n=16
>>> n*=2;s=segs(n);t=time();ok=isect_segments(s);n,time()-t
(32, 0.12201213836669922)
>>> n*=2;s=segs(n);t=time();ok=isect_segments(s);n,time()-t
(64, 0.5240523815155029)
>>> n*=2;s=segs(n);t=time();ok=isect_segments(s);n,time()-t
(128, 3.2543253898620605)
>>> n*=2;s=segs(n);t=time();ok=isect_segments(s);n,time()-t
(256, 12.6932692527771)
>>> n*=2;s=segs(n);t=time();ok=isect_segments(s);n,time()-t
(512, 60.36503577232361)

Here, by contrast, are the naive times:

>>> n=16
>>> n*=2;s=segs(n);t=time();ok=isect_segments__naive(s);n,time()-t
(32, 0.017001867294311523)
>>> n*=2;s=segs(n);t=time();ok=isect_segments__naive(s);n,time()-t
(64, 0.051004886627197266)
>>> n*=2;s=segs(n);t=time();ok=isect_segments__naive(s);n,time()-t
(128, 0.1380140781402588)
>>> n*=2;s=segs(n);t=time();ok=isect_segments__naive(s);n,time()-t
(256, 0.524052619934082)
>>> n*=2;s=segs(n);t=time();ok=isect_segments__naive(s);n,time()-t
(512, 2.076207399368286)
>>> n*=2;s=segs(n);t=time();ok=isect_segments__naive(s);n,time()-t
(1024, 8.344834566116333)
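
For reference, here are the per-doubling growth factors implied by the (rounded) numbers above; an n**2 algorithm should stay around 4x per doubling:

>>> sweep = [0.122, 0.524, 3.254, 12.693, 60.365]
>>> naive = [0.017, 0.051, 0.138, 0.524, 2.076, 8.345]
>>> [round(b / a, 1) for a, b in zip(sweep, sweep[1:])]
[4.3, 6.2, 3.9, 4.8]
>>> [round(b / a, 1) for a, b in zip(naive, naive[1:])]
[3.0, 2.7, 3.8, 4.0, 4.0]

So both are scaling at or somewhat above n**2 here, and the sweep-line also carries a much larger constant factor.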

I confirmed that I am basically using the unmodified script; here is my diff relative to the original:

diff --git a/poly_point_isect.py b/poly_point_isect.py
index e35216f..b614ecc 100644
--- a/poly_point_isect.py
+++ b/poly_point_isect.py
@@ -24,7 +24,7 @@
 # their end points form the intersection point.
 USE_IGNORE_SEGMENT_ENDINGS = True

-USE_DEBUG = False
+USE_DEBUG = True^M

 USE_VERBOSE = False

@@ -36,7 +36,7 @@
 # (the bentley-ottmann method doesn't support this).
 # We use the term 'START_VERTICAL' for a vertical segment,
 # to differentiate it from START/END/INTERSECTION
-USE_VERTICAL = False
+USE_VERTICAL = True^M
 # end defines!
 # ------------

@@ -321,7 +321,7 @@ def _sweep_to(self, p):

     def insert(self, event):
         assert(event not in self._events_current_sweep)
-        assert(not USE_VERTICAL or event.type != Event.Type.START_VERTICAL)
+        assert(event.type != Event.Type.START_VERTICAL)^M
         if USE_DEBUG:
             assert(event.in_sweep == False)
             assert(event.other.in_sweep == False)
ideasman42 commented 6 years ago

Yes, the performance isn't great, although the purpose of this is to be a reference implementation. I would assume that at some point, for large enough inputs, the sweep-line would be faster than the naive method.
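
Roughly speaking, the sweep-line does on the order of (n + k) * log2(n) ordered-tree operations, where k is the number of intersections found, against about n**2 / 2 pairwise checks for the naive method, so it only pays off when k stays well below n**2. A quick back-of-the-envelope with made-up crossing counts, just to illustrate the scaling (pure arithmetic, not a benchmark):

>>> from math import log2
>>> n = 512
>>> n * (n - 1) / 2                # naive pairwise checks
130816.0
>>> (n + 1000) * log2(n)           # sweep-line ops if crossings are sparse
13608.0
>>> (n + 50000) * log2(n)          # sweep-line ops if most pairs cross
454608.0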

smichr commented 6 years ago

I got the same performance characteristic with the implementation at https://github.com/splichte/lsi:

>>> n*=2;s=segs(n);t=time();ok=intersection(s);n,time()-t
(32, 0.1099998950958252)
>>> n*=2;s=segs(n);t=time();ok=intersection(s);n,time()-t
(64, 0.30299997329711914)
>>> n*=2;s=segs(n);t=time();ok=intersection(s);n,time()-t
(128, 1.754000186920166)
>>> n*=2;s=segs(n);t=time();ok=intersection(s);n,time()-t
(256, 6.108999729156494)
>>> n*=2;s=segs(n);t=time();ok=intersection(s);n,time()-t
(512, 34.04099988937378)
>>> n*=2;s=segs(n);t=time();ok=intersection(s);n,time()-t
(1024, 156.44429993629456)
ideasman42 commented 6 years ago

@smichr interesting to know this isn't limited to this implementation. Could you check the other things I mentioned?

Also, the segs function you're calling isn't included. It might be easiest if you could fork the repo and commit your test to a branch.

smichr commented 6 years ago

I don't think my system supports the SSE2 instructions needed for PyPy.

def segs(n, hv=False, mag=100):
    """Return n unique segments ((x1, y1), (x2, y2)). If hv is False
    then neither horizontal nor vertical segments will be returned.
    """
    from random import randint
    r = lambda: (randint(-mag, mag), randint(-mag, mag))
    rv = set()
    while len(rv) != n:
        s = (r(), r())
        if not hv:
            # Reject vertical and horizontal (and zero-length) segments.
            if s[1][0] == s[0][0]:
                continue
            if s[1][1] == s[0][1]:
                continue
        # Order the endpoints so reversed duplicates are caught by the set.
        if s[0] > s[1]:
            s = (s[1], s[0])
        rv.add(s)
    return list(rv)

I haven't been able to get anywhere near 10**6 in less than an hour. I'm not usually dealing with such large data sets, either. Someone else with a faster system will have to test that.

ideasman42 commented 6 years ago

Using this script in tests/, here is an example where performance is better than the naive n**2 method.

USE_NAIVE=1 SEG=32000 pypy3 tests_perf.py

Reports: 32000 38.48890995979309 for function isect_segments__naive

USE_NAIVE=0 SEG=32000 pypy3 tests_perf.py

Reports: 32000 18.43715000152588 for function isect_segments


An issue with the naive n**2 comparison is that you're testing many overlapping lines. This causes a lot of book-keeping for the sweep-line (adding/removing events, updating the tree); with this implementation the Python-side book-keeping slows things down to the point where the naive n**2 method might be faster, but this isn't always the case.

In the following test I changed the line length to be 1/20th of the magnitude. This might be more or less similar to your use-case. Both tests are valid - often lines will be derived from maps/graphics/pointer input - so random large lines aren't necessarily such a useful test-case.
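
One way to see this is to count how many intersections each style of input actually produces - with full-range random segments a large fraction of all pairs cross, while short segments mostly miss each other. A rough sketch (using a simplified generator rather than the ones in this thread, and assuming isect_segments__naive returns the list of intersection points, as it's used above):

from random import Random
from poly_point_isect import isect_segments__naive

def random_segs(n, length, mag=1.0, seed=42):
    # Simplified generator: first point uniform in the square,
    # second point offset by at most 'length' in each axis.
    rng = Random(seed)
    result = []
    for _ in range(n):
        x0, y0 = rng.uniform(-mag, mag), rng.uniform(-mag, mag)
        x1 = x0 + rng.uniform(-length, length)
        y1 = y0 + rng.uniform(-length, length)
        result.append(((x0, y0), (x1, y1)))
    return result

for length in (1.0, 1.0 / 20):
    s = random_segs(512, length)
    # Fewer crossings means less event/tree book-keeping for the sweep-line.
    print("length", length, "->", len(isect_segments__naive(s)), "intersections")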

Closing because I don't think there is an error in the code causing the slow-down, and there are common cases where performance is better.

"""
Example use:
   USE_NAIVE=0 SEG=32000 python3 tests_perf.py
"""
import os
import sys

def segs(
    n,
    allow_axis_aligned=False,
    allow_overlap=False,
    mag=500, seed=42,
    length=(1 / 20)
):
    """
    Return n unique segments ((x1, y1), (x2, y2)).
    """
    import random
    rng = random.Random(seed)
    randint = rng.randint

    def r():
        return (
            (randint(-mag, mag)) / mag,
            (randint(-mag, mag)) / mag,
        )
    rv = set()
    if not allow_overlap:
        sv = set()
    while len(rv) != n:
        v0 = r()
        v1 = r()
        v1 = (
            v0[0] + (v1[0] * length),
            v0[1] + (v1[1] * length),
        )
        s = (v0, v1)
        if s[0] > s[1]:
            s = (s[1], s[0])
        if not allow_axis_aligned:
            if s[1][0] == s[0][0]:
                continue
            if s[1][1] == s[0][1]:
                continue
        if not allow_overlap:
            # Cheap way to avoid exactly overlapping segments: skip repeated
            # slopes (the ratio dx/dy; horizontal segments are rejected above
            # when allow_axis_aligned is False, so this division is safe there).
            slope = (s[0][0] - s[1][0]) / (s[0][1] - s[1][1])
            if slope in sv:
                continue
            sv.add(slope)

        rv.add(s)
    return list(sorted(rv))

POLY_ISECT_MODULE_PATH = os.path.join(os.path.dirname(__file__), "..")
sys.path.append(POLY_ISECT_MODULE_PATH)

n = int(os.environ.get("SEG"))
if os.environ.get("USE_NAIVE") == "1":
    from poly_point_isect import isect_segments__naive as isect_segments
else:
    from poly_point_isect import isect_segments

def time_segs(n):
    from time import time
    s = segs(n)
    t = time()
    ok = isect_segments(s)
    return time() - t

print(n, time_segs(n), "for function", isect_segments.__name__)