Improve performance of filling long value with a single byte

skrzypo987 commented 3 years ago

Simple benchmarks show 600/s -> 1300/s.

sopel39 commented 3 years ago

cc @martint @electrum

martint commented 3 years ago

Have you seen any cases where this matters? This method is called once per invocation to Slice.fill(), and the bulk of the time spent in that method is replicating the long across the whole buffer. I suspect it will only have an impact when making lots of calls to Slice.fill() on very small slice objects.

skrzypo987 commented 3 years ago

> Have you seen any cases where this matters? This method is called once per invocation to Slice.fill(), and the bulk of the time spent in that method is replicating the long across the whole buffer. I suspect it will only have an impact when making lots of calls to Slice.fill() on very small slice objects.

I observed an actual gain with slices of ~100 bytes. The difference is small but measurable. I only used slices whose sizes are a multiple of 8, so I guess either the JIT or CPU branch prediction got rid of the second loop in the fill method, which makes this matter slightly more.
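
For readers following along, here is a rough sketch of the shape being discussed. It is not the actual code from this PR or from Slice: the class `FillSketch`, its method names, and the use of `ByteBuffer` are all assumptions made for illustration. It shows two common ways to replicate a byte into a long (shift-and-OR versus a single multiply), followed by a fill that writes whole longs first and then handles leftover bytes in a second loop.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; not the Slice implementation.
final class FillSketch {
    // Shift-and-OR broadcast: places the byte into each of the 8 byte lanes.
    static long broadcastWithShifts(byte value) {
        long v = value & 0xFFL;
        return v | (v << 8) | (v << 16) | (v << 24)
                | (v << 32) | (v << 40) | (v << 48) | (v << 56);
    }

    // Multiply broadcast: 0x0101010101010101L has a 1 in every byte lane,
    // so multiplying by it copies the (unsigned) byte into every lane.
    static long broadcastWithMultiply(byte value) {
        return (value & 0xFFL) * 0x0101010101010101L;
    }

    // Fills the array 8 bytes at a time, then finishes any remaining bytes.
    static void fill(byte[] target, byte value) {
        // Computed once per call, as noted in the discussion above.
        long pattern = broadcastWithMultiply(value);
        // Byte order does not matter here because all 8 bytes are identical.
        ByteBuffer buffer = ByteBuffer.wrap(target).order(ByteOrder.LITTLE_ENDIAN);

        int offset = 0;
        // First loop: whole longs.
        for (; offset + Long.BYTES <= target.length; offset += Long.BYTES) {
            buffer.putLong(offset, pattern);
        }
        // Second loop: trailing bytes; runs zero times when the length
        // is a multiple of 8.
        for (; offset < target.length; offset++) {
            target[offset] = value;
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[100]; // 12 longs plus 4 trailing bytes
        fill(data, (byte) 0xAB);
        System.out.println(Long.toHexString(broadcastWithMultiply((byte) 0xAB)));
    }
}
```

With a buffer this small, the one-time broadcast is a larger share of the total work than it would be for a large buffer, which fits the observation that the gain only shows up on small slices.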

airlift / slice

Improve performance of filling long value with a single byte #143