georust / geo

Geospatial primitives and algorithms for Rust
https://crates.io/crates/geo
Other
1.48k stars 191 forks source link

Avoid `atan2` in `minimum_rotated_rect` #1094

Open lnicola opened 8 months ago

lnicola commented 8 months ago
lnicola commented 8 months ago

No benchmarks yet, and the output might be less precise, not sure if we want to merge this, but I wanted to put the code out.

frewsxcv commented 7 months ago

@lnicola Can you add more context to the description of the PR (or in a comment) about why we're avoiding atan2?

urschrei commented 7 months ago

(This was all my idea). Atan2 is a comparatively very expensive operation (around 100 cpu cycles on x86, vs around 20 for something like sqrt). Using this approach (inspired by the approach from https://gis.stackexchange.com/a/22934) avoids that. Unfortunately, it's currently suffering from floating point precision issues.

frewsxcv commented 5 months ago

Do we see any path forward with the precision issues here? If not, @lnicola what do you think about converting this to a draft PR or closing it?

lnicola commented 5 months ago

I finally got to run some benchmarks (on the plain version, not the one using FMA):

minimum rotated rect f64/old/10
                        time:   [1.2758 µs 1.2777 µs 1.2798 µs]
minimum rotated rect f64/new/10
                        time:   [648.70 ns 649.96 ns 651.92 ns]
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) low mild
  1 (1.00%) high severe
minimum rotated rect f64/old/100
                        time:   [32.838 µs 32.873 µs 32.908 µs]
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe
minimum rotated rect f64/new/100
                        time:   [18.851 µs 18.854 µs 18.857 µs]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
Benchmarking minimum rotated rect f64/old/1000: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.7s, enable flat sampling, or reduce sample count to 50.
minimum rotated rect f64/old/1000
                        time:   [1.9078 ms 1.9094 ms 1.9111 ms]
Found 15 outliers among 100 measurements (15.00%)
  11 (11.00%) low mild
  2 (2.00%) high mild
  2 (2.00%) high severe
Benchmarking minimum rotated rect f64/new/1000: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.5s, enable flat sampling, or reduce sample count to 60.
minimum rotated rect f64/new/1000
                        time:   [1.2859 ms 1.2866 ms 1.2873 ms]
Found 12 outliers among 100 measurements (12.00%)
  8 (8.00%) low severe
  2 (2.00%) low mild
  2 (2.00%) high severe

image

So it's about 33% faster, which is noticeable, but not earthshaking.

As for the precision issue, I can't prove it, but I suspect the new method is more precise in some cases and less in others. Looking at the numbers, they're something like:

1.7000000000000006 24.600000000000000
1.7000000000000004 24.600000000000005

14.649854163628408 25.153412571095235
14.649854163628410 25.153412571095235

14.400000000000000 31.000000000000000
14.400000000000002 31.000000000000004 # <- here

1.4501458363715918 30.446587428904767
1.4501458363715916 30.446587428904774

The only reason I suspect the new method is less accurate is that one vertex which is part of the input.


Anyway, I don't know how to improve on this, but I don't insist on merging it either, so we can close this if nobody wants it.

urschrei commented 5 months ago

From a practical perspective MRR calculation tends to be something done at scale (e.g. building footprints in a city etc), so a 33 % improvement on something that might take 10 seconds is pretty respectable. I'm gonna try some more comparisons with JTS and GEOS tomorrow and see whether it's a consistent accuracy issue, although it's hard to see what else we could do without re-introducing complexity.

urschrei commented 5 months ago

input:

MULTIPOINT ((3.3 30.4), (1.7 24.6), (13.4 25.1), (14.4 31), (3.3 30.4))

geo:

POLYGON ((1.7000000000000004 24.600000000000005, 14.64985416362841 25.153412571095235, 14.400000000000002 31.000000000000004, 1.4501458363715916 30.446587428904774, 1.7000000000000004 24.600000000000005))

JTS (HEAD):

POLYGON ((1.3920611798980265 30.296868171886377, 1.7071376547705703 24.467953386744355, 14.715076474872582 25.171085214857957, 14.40000000000005 30.999999999999982, 1.3920611798980265 30.296868171886377))

GEOS (3.12.0)

POLYGON((14.649854163628415 25.153412571095224,1.7 24.6,1.450145836371595 30.44658742890477,14.4 31,14.649854163628415 25.153412571095224))

geo vs JTS (geo is blue) Screenshot 2024-01-31 at 12 35 23

geo vs GEOS (geo is blue) Screenshot 2024-01-31 at 12 37 07

urschrei commented 5 months ago

Oh but wait, GEOS had a bug until 3.12.1: https://github.com/libgeos/geos/commit/2efd5ed41009149db31b896cea98aacf765b004c

I can't easily access 3.12.1, anyone else got it?

lnicola commented 5 months ago

I have 3.12.1, can you send me some code?

urschrei commented 5 months ago

I have 3.12.1, can you send me some code?

SELECT ST_AsText(ST_OrientedEnvelope('MULTIPOINT ((3.3 30.4), (1.7 24.6), (13.4 25.1), (14.4 31.0), (3.3 30.4))'));
lnicola commented 5 months ago

Same as before:

POLYGON((14.649854163628415 25.153412571095224,1.7 24.6,1.450145836371595 30.44658742890477,14.4 31,14.649854163628415 25.153412571095224))

POSTGIS="3.4.1 ca035b9" [EXTENSION] PGSQL="160" GEOS="3.12.1-CAPI-1.18.1" PROJ="9.3.1 NETWORK_ENABLED=OFF URL_ENDPOINT=https://cdn.proj.org USER_WRITABLE_DIRECTORY=/var/lib/postgresql/.local/share/proj DATABASE_PATH=/usr/local/share/proj/proj.db" LIBXML="2.9.14" LIBJSON="0.16" LIBPROTOBUF="1.4.1" WAGYU="0.5.0 (Internal)" TOPOLOGY
urschrei commented 5 months ago

Even though there's no visual difference with 3.12.0 at this resolution:

geo vs GEOS (3.12.1):

Screenshot 2024-01-31 at 13 08 53

lnicola commented 5 months ago

It's exactly the same output as with 3.12.0.

urschrei commented 5 months ago

Yeah I just saw 🤦‍♂️. Did you run it via PostGIS or using the cpp function?

lnicola commented 5 months ago

Using docker.io/imresamu/postgis:16-recent