georust / geo

Geospatial primitives and algorithms for Rust
https://crates.io/crates/geo
Other
1.49k stars 194 forks source link

Implement `SpadeBoolops` trait #1089

Open RobWalt opened 9 months ago

RobWalt commented 9 months ago

Relevant commits start from 409b815c90fcb037dfddd0a38062ed2b89082f86 This PR depends on #1083 and #1087

Feel free to test the latest version of the algo with this app

... and report errors if you find some 🤞🏼

Description

This is a draft for a new algorithm for boolops.

It was created because the current implementation in the repo has some issues with panics. This spawns a lot of issues in the repository.

The new algorithm might not be the most performant candidate. Still, it poses a good alternative for scenarios in which not every nano second is counted. Overall, I couldn't notice any performance issues in my application yet. That being said, I'm still open for a discussion about performance if that's a critical thing for you.

Algorithm on a high level

The algorithm itself is dead simple:

  1. It triangulates the combined geometry of all it's arguments

image

  1. Based on the chosen boolop, it runs a predicate on each triangle (let's choose intersection here, so the predicate is that the triangles mid point (mid point for stability reasons) is contained in both original shapes). All triangles that don't pass the predicate are discarded

image

  1. We use the new Stitch trait to reassemble the triangles into MultiPolygons

rmanoka commented 9 months ago

@RobWalt this is a cool implementation! I don't fully understand the correctness yet, but are you getting this to pass all the JTS test fixtures that we run the current bool. ops. on? If so, I'm happy to merge this as the default boolean op.

RobWalt commented 9 months ago

@rmanoka How would you define "passing the current tests"? My algorithm won't find the outputs specified in these tests with the points in the linestrings in the exact same order. Is that allowed?

RobWalt commented 9 months ago

I ported the tests under the name "legacy_tests". They all run through. There is a small caveat though: The 894 test runs awfully long. (~18s for 3 intersections on my machine). This is probably due to the fact that the new algo isn't very efficient for bigger geometries.

urschrei commented 9 months ago

I ported the tests under the name "legacy_tests". They all run through. There is a small caveat though: The 894 test runs awfully long. (~18s for 3 intersections on my machine). This is probably due to the fact that the new algo isn't very efficient for bigger geometries.

How long does the test take using the existing bool ops? How long in release mode? I don't think worse perf than the existing impl is a problem, but I'm interested in the performance characteristics more generally.

RobWalt commented 9 months ago

I ported the tests under the name "legacy_tests". They all run through. There is a small caveat though: The 894 test runs awfully long. (~18s for 3 intersections on my machine). This is probably due to the fact that the new algo isn't very efficient for bigger geometries.

How long does the test take using the existing bool ops? How long in release mode? I don't think worse perf than the existing impl is a problem, but I'm interested in the performance characteristics more generally.

It's

So you really should consider if you want to use that and if the input is ok for it. I might add a warning. I can also look into subtle optimizations (via bounding boxes) to squeeze out a bit of performance.

Please note that the Multipolygons in this example are absolutely giant (23 KB, 150 polygons per Multipolygon, 2600 vertices total per Multipolygon)

michaelkirk commented 9 months ago

Feel free to test the latest version of the algo with this app

Lovely demo! The source is this right? https://github.com/RobWalt/is-geo-boolops-still-broken/blob/main/src/bin/is-geo-boolops-still-broken.rs

RobWalt commented 9 months ago

I implemented some minor bounding box optimizations and now the big test runs through in

4.5s instead of 13s

There were two test results I needed to change for it still to work:

dabreegster commented 6 months ago

I'm now hitting crashes with the current implementation in a web-only rewrite of one of my main projects (https://github.com/dabreegster/ltn) and will switch over to this branch soon. Anything I can do to help the Stitch PR and this one get through? I don't see pending tasks in either one; would another review be useful? (Not that I'm any good at georust internals/math as a reviewer)

michaelkirk commented 6 months ago

Sorry that the review train kind of stalled out on this! It's hard to review big algorithms, especially when they are novel like this one.

Anything I can do to help the Stitch PR and this one get through?

If you're brave enough to integrate and use this branch in one of your applications with real world data for a while, that would definitely build some confidence for me.

I'll also take a pass at reviewing the stitch PR now.

dabreegster commented 6 months ago

If you're brave enough to integrate and use this branch in one of your applications with real world data for a while, that would definitely build some confidence for me.

Oops, for the project I was thinking of, actually I'm using clip, which doesn't have an equivalent yet. Intersecting a LineString with a Polygon yields a SpadeBoolopsResult = Result<MultiPolygon<T>, SpadeBoolopsError>, and recovering the clipped linestring doesn't seem possible.

I'll instead cutover the A/B Street repo (with the LTN tool particularly) to use this, but it'll take more time to get to it. :\

rmanoka commented 2 hours ago

I'm for merging this PR too.

Reg. the failing tests, could we adjust the code that checks the outputs to accept equivalent polygons? I think we have such test glue-code that checks equiv. upto an adjustible $\epsilon$ factor.

Reg. the increase in time, I dont have a great opinion, but we could at least expose this impl. as an additional impl. the users can use.

Reg. the impl., I only scanned through this, but IIUC, the key pre-processing involves rounding the inputs to some $\epsilon$ threshold (so for eg. all coords are "fmod(c, $\epsilon$) = 0"). In this case, may be we could just do this pre-processing, and run the current implementation itself? I don't know if that would pass all the cases this version does, but that may be a faster impl.