Closed Protowalker closed 3 years ago
Hi, thanks for attempting this :)
3% may not seem like a lot but it still is a fairly tangible difference if it's consistent (3% here, 3% there add up to quite a lot in the end). I would suggest testing it mostly on canada.txt and uniform and, as extreme case, on mesh.txt.
I totally agree that a consistent 3% advantage in such a low-level libraray justifies the usage of unsafe if unavoidable. But I think pow10_fast_path
itself should be unsafe with an additional comment what the invariants are (exponent < 10) that the caller needs to uphold and a comment on unsafe block at the callside that this unsafe block is fine because you checked it beforehand.
Nice work, btw :)
Can I come clean about something? I had only tested my other theory about a speed increase and not this one -- then I wrote this one, and then I made the PR, and then before I hit submit I realized I hadn't run benchmarks, so I ran some and when I realized it was slower I just changed the title from "improving performance" to "with low performance cost" :grimacing:
I'm gonna do another implementation with the idea that I have tested but requires a larger rewrite than a couple lines, then I'll open this up again. Cheers!
OLD:
min/median/max
ns/float: 22.29/23.64/28.55 Mfloat/s: 44.86/42.31/35.03 MB/s: 781.57/737.14/610.28
NEW: ns/float: 22.9/24.94/30.69 Mfloat/s: 43.67/40.10/32.59 MB/s: 760/87/698.71/567.77
I'm going to continue working on this to try and improve it as much as possible.