This should reduce the memory footprint. I ran the benchmarks with cargo +nightly bench and didn't notice any clear slowdown, at least on my target (x86_64-pc-windows-msvc).
Before:
test discard_100 ... bench: 12,092 ns/iter (+/- 612)
test discard_10000 ... bench: 1,197,697 ns/iter (+/- 26,895)
test keep_100 ... bench: 10,594 ns/iter (+/- 284)
test keep_10000 ... bench: 975,823 ns/iter (+/- 42,083)
After:
test discard_100 ... bench: 11,111 ns/iter (+/- 217)
test discard_10000 ... bench: 1,111,062 ns/iter (+/- 35,206)
test keep_100 ... bench: 10,945 ns/iter (+/- 773)
test keep_10000 ... bench: 1,052,570 ns/iter (+/- 53,772)
Incrementing and decrementing roots requires a few more bit operations, but hopefully those are pretty cheap. Also, instead of checked_add() I use a branch. I don't fully know what the performance implications of that are, but I imagine they're small because it's probably a very predictable branch.
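To make the trade-off concrete, here is a rough sketch of the kind of packing I mean. It is not the actual diff: the Header type, the constant names, and the choice of the top bit for the flag are all made up for illustration. It only shows why inc/dec need a mask and why an explicit bounds branch stands in for checked_add().

```rust
use std::cell::Cell;

// Hypothetical layout (an assumption for this sketch, not the real code):
// the top bit of one word is a boolean flag, the remaining bits hold the
// root count.
const FLAG_MASK: usize = 1 << (usize::BITS - 1);
const ROOTS_MASK: usize = !FLAG_MASK;
const ROOTS_MAX: usize = ROOTS_MASK;

struct Header {
    packed: Cell<usize>,
}

impl Header {
    fn new() -> Self {
        Header { packed: Cell::new(0) }
    }

    // Reading the count needs one extra AND to drop the flag bit.
    fn roots(&self) -> usize {
        self.packed.get() & ROOTS_MASK
    }

    fn flag(&self) -> bool {
        (self.packed.get() & FLAG_MASK) != 0
    }

    fn set_flag(&self, on: bool) {
        let roots = self.packed.get() & ROOTS_MASK;
        self.packed.set(if on { roots | FLAG_MASK } else { roots });
    }

    fn inc_roots(&self) {
        let packed = self.packed.get();
        // checked_add() would only catch overflow of the whole word, which
        // is too late: the count would already have spilled into the flag
        // bit. So compare the masked count against its own maximum instead.
        if (packed & ROOTS_MASK) < ROOTS_MAX {
            // Safe: the count bits are not all ones, so +1 cannot carry
            // into the flag bit.
            self.packed.set(packed + 1);
        } else {
            panic!("root count overflow");
        }
    }

    fn dec_roots(&self) {
        let packed = self.packed.get();
        assert!((packed & ROOTS_MASK) > 0, "root count underflow");
        // Safe: the count bits are nonzero, so -1 cannot borrow from the
        // flag bit.
        self.packed.set(packed - 1);
    }
}
```

With this layout the header is a single usize, and the overflow branch is only taken when the count is already at its maximum, which is why I'd expect it to predict well.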