georust / geographiclib-rs

A port of geographiclib in Rust.
MIT License
41 stars 9 forks source link

Standardized to usize #57

Closed valarauca closed 9 months ago

valarauca commented 9 months ago

Remove instances of i64 & isize when indexing, standardized on usize


Per https://github.com/georust/geographiclib-rs/pull/56#issuecomment-1908740620

Attempting to reducing the scope of the PR.

This only addresses moving indexing variables to usize.

michaelkirk commented 9 months ago

Could you run cargo fmt and fix the new clippy errors?

valarauca commented 9 months ago

Squashing locally broke because of the upstream commits, idk how to fix that.

Clippy demanded I do a lot of things outside of the scope of the original change

michaelkirk commented 9 months ago

Hmmm, I'm actually seeing a perf regression with this PR

$ cargo bench --bench="*" --  --baseline=main-2024-01-24
direct (c wrapper)/default
                        time:   [23.918 µs 23.956 µs 24.004 µs]
                        change: [+0.2632% +0.4314% +0.5961%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

direct (rust impl)/default
                        time:   [30.104 µs 30.139 µs 30.180 µs]
                        change: [+0.2469% +0.4175% +0.5814%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  2 (2.00%) high severe

inverse (c wrapper)/default
                        time:   [44.610 µs 44.664 µs 44.723 µs]
                        change: [-2.4267% -0.9378% +0.0711%] (p = 0.16 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  5 (5.00%) high mild
  1 (1.00%) high severe

inverse (rust impl)/default
                        time:   [78.354 µs 78.452 µs 78.564 µs]
                        change: [+5.8279% +6.0227% +6.2028%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  3 (3.00%) low mild
  5 (5.00%) high mild
  3 (3.00%) high severe

If I revert the clippy/format commit (877afcc) I see both an improvement and a regression:

$ cargo bench --bench="*" --  --baseline=main-2024-01-24

direct (c wrapper)/default
                        time:   [23.948 µs 24.010 µs 24.088 µs]
                        change: [+0.1347% +0.3540% +0.5856%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) high mild
  2 (2.00%) high severe

direct (rust impl)/default
                        time:   [29.044 µs 29.085 µs 29.126 µs]
                        change: [-3.2355% -3.0412% -2.8588%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe

inverse (c wrapper)/default
                        time:   [44.673 µs 44.745 µs 44.820 µs]
                        change: [-2.3927% -0.9049% +0.1063%] (p = 0.19 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) low mild
  3 (3.00%) high mild
  3 (3.00%) high severe

inverse (rust impl)/default
                        time:   [77.100 µs 77.238 µs 77.404 µs]
                        change: [+4.1364% +4.3271% +4.5071%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  6 (6.00%) high mild

Any idea why?

valarauca commented 9 months ago

If I revert the clippy/format commit (https://github.com/georust/geographiclib-rs/commit/877afcc73acb73f0bdc1929787beb95750850789) I see both an improvement and a regression: Any idea why?

My best guess is the change is within the noise threshold. I commonly see +/-3µs changes without a code change locally (even with n=1000).

I've taken steps to try to reduce this noise (closing applications, changing cpu throttling/power options) but it remains frustratingly common result.

michaelkirk commented 9 months ago

Noise is a real thing, but the consistency with which I can reproduce the relative behavior leads me to believe this is not only noise in the benchmark.

Are you able to reproduce this?

michaelkirk commented 9 months ago

The regressions from the clippy/fmt commit seem to be from changing the loop iteration.

e.g. this branch https://github.com/georust/geographiclib-rs/tree/mkirk/usize (as of https://github.com/georust/geographiclib-rs/commit/20a2b333570905a54296f23928644722fb29a93f) has the same perf characteristics as before your clippy changes.

That is:

michaelkirk commented 9 months ago

Thanks! I included this in #58 (but reverted part of the clippy changes).

valarauca commented 9 months ago

Are you able to reproduce this?

Sort of...

Looking at the ASM output link

The main different I see is the normal index based loop does a pretty simple structure

BELOW_18 -> POLY_SUM (inner loop) -> BELOW_18 OR  RET

While the clippy version does

 IDK -> BELOW_18 -> POLY_SUM (inner loop) -> IDK | RETURN_OR_PANIC

This makes the clippy version extremely branch heavy, as it seems to frequently branch off check a bunch of stuff, then return to the inner loop.

michaelkirk commented 9 months ago

Interesting! That might account for it.