JuliaGeometry / GeometryBasics.jl

Basic Geometry Types
MIT License
165 stars 54 forks source link

Fix #168 #185

Closed cmcaine closed 6 months ago

cmcaine commented 1 year ago

Replaces the intersection algorithm with one that seems more reliable.

See here for the algorithm and explanation: http://paulbourke.net/geometry/pointlineplane/

This algorithm is also used in Turf.js, Turf.jl, etc.

I did a bunch of testing with other algorithms and even tested some algorithms from ChatGPT. You can see some of the testing here: https://gist.github.com/cmcaine/80600c1fd32a6ac3822f2fac4ab93367

cmcaine commented 6 months ago

Thanks for the review! I probably benchmarked it at the time but I've forgotten now.

It's a much simpler algorithm and I would be very surprised if it was slower than the existing code.

For me, the more important thing is that this is a well-known and peer-reviewed algorithm and the current implementation is not (perhaps it's a published algorithm, but I'm not familiar with it and it's provenance isn't noted in the code). The Paul Bourke algorithm I've used also seems to give better quality results, as evidenced by the tests I added, etc.

cmcaine commented 6 months ago

Oh, there are benchmarks and a bunch of code evaluating the accuracy of various line intersection algorithms in my gist that I linked in my first post.

https://gist.github.com/cmcaine/80600c1fd32a6ac3822f2fac4ab93367

rafaqz commented 6 months ago

I mean, posting the @btime output here is the path to getting this merged, no-one will read your link.

I dont have commit access here either.

cmcaine commented 6 months ago

I'm not gonna rerun this stuff for a package I don't use any more and a PR that I don't believe is ever going to be merged. 🤷

The short of it is that the algorithm in this PR is better known, better tested, seems to produce better output and (least importantly, imo) is almost certainly faster.

Simon can take it or leave it. I appreciate that they're spread thin.

rafaqz commented 6 months ago

No worries! I use GeometryOps.jl for this ;)

It would just be good to not have correctness issues around and proving inline in comments that a PR is all upside no downside is a pretty reliable way to get it merged by people with zero spare time.

knuesel commented 6 months ago

I've made an updated test script based on @cmcaine's gist. Here are the results with some comments.

Precision of results

(In cases where both methods find an intersection)

The paul_bourke method proposed here is either close to intersects or better:

For 20000 cases with values in [-1,1]:
    paul_bourke:    MSE 2.4022857873839985e-32, median abs. err. 6.451233687564108e-17
    intersects: MSE 1.9473991566481637e-32, median abs. err. 5.542123670858672e-17

For 20000 cases with values on unit circle:
    paul_bourke:    MSE 2.666382940793299e-32,  median abs. err. 7.412702994822002e-17
    intersects: MSE 5.532644758280307e-31,  median abs. err. 8.723457354663373e-17

For 20000 cases with values in [-1e6,1e6]:
    paul_bourke:    MSE 4.862985741725941e-21,  median abs. err. 3.753758592593313e-11
    intersects: MSE 5.853725916879825e-21,  median abs. err. 3.3163115886529904e-11

For 20000 cases with values in [-1e12,1e12]:
    paul_bourke:    MSE 1.5563052435083193e-8,  median abs. err. 5.495253849314317e-5
    intersects: MSE 1.5072897775558596e-8,  median abs. err. 5.050624002544535e-5

For 20000 cases with values on circle of radius 1e12:
    paul_bourke:    MSE 3.671959108333531e-8,   median abs. err. 7.441170276538788e-5
    intersects: MSE 9.289543385192281e-8,   median abs. err. 9.308651052489074e-5

Finding an intersection when there is one indeed

The paul_bourke method is ~4.5 times more likely to find an intersection where intersects fails, than the other way around:

For 100000 cases with values in whole Float64 space:
paul_bourke wins 1743, intersects wins 392

Benchmarks

Using random lines in the whole Float64 space, where 4.3% of cases did contain an intersection, paul_bourke was about 3 times faster than intersects:

Benchmarking `paul_bourke` using values in whole Float64 space with 4.3% intersections:
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  4.400 ns … 71.699 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     5.690 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.875 ns ±  2.997 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▁▄██▅ ▄▄▃▁     ▅▃▄▄▂     ▁▁                               ▂
  ███████████▇▄▅▆▇█████▄▄▆▇███▇▅▆██▅▆▁▁▅▅▃▄▃▁▁▃▃▅▆▇▇▅▅▇▇█▇██ █
  4.4 ns       Histogram: log(frequency) by time     20.9 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

Benchmarking `intersects` using values in whole Float64 space with 4.3% intersections:
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):   6.030 ns … 88.128 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     22.840 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   23.460 ns ±  9.845 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                            ▂▇█▆▂                              
  ▁▃▅█▆▄█▂▄▃▄▃▂▂▂▂▂▂▄▇▅█▇▃▃▆█████▇▄▃▂▁▁▁▁▁▂▂▃▃▃▃▂▃▃▃▃▄▄▆▇▇▇▄▃ ▃
  6.03 ns         Histogram: frequency by time        41.5 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

Using random coordinates restricted to [-1,1], with 24% of cases having an intersection, paul_bourke was about 50% faster than intersects:

Benchmarking `paul_bourke` using values in [-1,1] with 23.799999999999997% intersections:
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  4.390 ns … 18.599 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     5.499 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.422 ns ±  1.712 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

         ▅  █                                                 
  ▂▂▂▂▂▂▃█▅██▇▅▆▃▁▂▃▄▃▃▃▂▂▂▂▂▁▂▁▁▂▁▂▂▁▂▂▁▂▂▂▂▂▁▂▄▇▃▃▃▃▃▃▂▃▃▃ ▃
  4.39 ns        Histogram: frequency by time        10.1 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

Benchmarking `intersects` using values in [-1,1] with 23.799999999999997% intersections:
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):   6.530 ns … 91.319 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):      9.650 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   13.078 ns ±  5.775 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

     ▂▅    █▁                                                  
  ▂▂▂██▆▃▂▅██▃▄▃▂▂▂▂▁▂▂▂▃▃▃▃▃▃▃▃▃▃▃▃▃▂▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▃▃▃▃▃▃▂ ▃
  6.53 ns         Histogram: frequency by time        24.6 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

Using only lines that do intersect, paul_bourke was about 3 times faster:

Benchmarking `paul_bourke` using values in whole Float64 space with 100.0% intersections:
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
 Range (min … max):   8.297 ns … 82.421 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):      9.168 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   10.841 ns ±  3.868 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

    ▃█▅▅▅▃▃▂                                     ▁▂▁▂▂▂▂▂▂▁▂▂ ▂
  █▅████████▁▁▁▄▄▃▁▃▁▃▅▃▄▁▁▃▁▁▁▁▁▄▁▁▁▁▁▁▁▁▁▄▇▆▇▇█████████████ █
  8.3 ns       Histogram: log(frequency) by time      21.3 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

Benchmarking `intersects` using values in whole Float64 space with 100.0% intersections:
BenchmarkTools.Trial: 10000 samples with 997 evaluations.
 Range (min … max):  17.150 ns … 66.699 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     37.321 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   31.479 ns ±  8.701 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

           ▂▅█▇▆▃▁                                  ▂▅▇█▆▅▂    
  ▂▁▁▂▂▃▃▄▇███████▇▅▄▃▃▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂▂▂▂▃▄▅▆████████▇▄ ▄
  17.2 ns         Histogram: frequency by time        42.2 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
SimonDanisch commented 6 months ago

Thank you @cmcaine and sorry for taking so long. I'm not familiar with line intersection algorithm and have little time for review. Thanks everyone for sanity checking this, having a few more eyes on this definitely helps a lot to actually merge it.

cmcaine commented 6 months ago

Thanks knuesel!

Thanks very much Simon for your work on this package (which I maybe use indirectly?) and on many other packages that I and others benefit from 💜.

I lost a few hours to this bug and in writing the PR, but you've saved me many more than that over the years :D