faster `HashCommon.nextPowerOf2` - Githubissues

vigna / fastutil

fastutil extends the Java™ Collections Framework by providing type-specific maps, sets, lists and queues.

Apache License 2.0

1.81k stars 199 forks source link

faster `HashCommon.nextPowerOf2` #281

Closed richardstartin closed 2 years ago

richardstartin commented 2 years ago

The integer version of this method showed up in a profile, and I was curious enough to look at its implementation. This provides a slightly faster implementation of HashCommon.nextPowerOf2 which takes advantage of numberOfLeadingZeros intrinsics.

I wrote a very simple benchmark to justify the change. While it shows that the existing code is hardly worth optimising, the new code is indeed faster:

Benchmark                     Mode  Cnt  Score   Error  Units
NextPowerOf2Benchmark.after   avgt    5  0.553 ± 0.007  ns/op
NextPowerOf2Benchmark.before  avgt    5  0.916 ± 0.002  ns/op

richardstartin commented 2 years ago

@vigna Since there doesn't appear to be CI running on this change, I feel it's worth mentioning I've run the test suite and everything passed.

vigna commented 2 years ago

Very interesting how it handles the case 0. OK, but, just out of curiosity, did you try to benchmark with GraalVM?

richardstartin commented 2 years ago

Very interesting how it handles the case 0. OK, but, just out of curiosity, did you try to benchmark with GraalVM?

I did not, but I can try that and let you know.

richardstartin commented 2 years ago

I got roughly the same scores (JDK 17.0.5, OpenJDK 64-Bit Server VM, 17.0.5+8-jvmci-22.3-b08, on an M1):

Benchmark                     Mode  Cnt  Score   Error  Units
NextPowerOf2Benchmark.after   avgt    5  0.647 ± 0.109  ns/op
NextPowerOf2Benchmark.before  avgt    5  0.965 ± 0.002  ns/op

vigna commented 2 years ago

OK thanks!

richardstartin commented 2 years ago

Oh no! I thought this was an elegant solution (looking for a problem, of course) :(

vigna commented 2 years ago

Why no? I just merged...

vigna commented 2 years ago

Ah OK for some weird reason I did not merge properly.