Modulo is faster when the divisor is known to be non-zero, so we can use the NonZeroUsize type to tell the compiler that it can omit the check for a capacity of 0. However, that only works in a non-const context, so ConstGenericRingBuffer and AllocRingBuffer need separate functions. This led me to the following implementation:
use core::num::NonZeroUsize;

// Not const: used by AllocRingBuffer<T, NonPowerOfTwo>
#[inline]
fn mask_modulo(cap: usize, index: usize) -> usize {
    // SAFETY: the caller must guarantee that cap is non-zero.
    index % unsafe { NonZeroUsize::new_unchecked(cap) }
}

// Const: used by ConstGenericRingBuffer
#[inline]
const fn const_mask_modulo(cap: usize, index: usize) -> usize {
    index % cap
}
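For comparison, here is a rough sketch of how the same hint could be expressed without `unsafe`, assuming the capacity is validated once (for example at construction) and carried around as a NonZeroUsize. The names `checked_capacity` and `mask_modulo_nonzero` are hypothetical and not part of this PR or the crate:

```rust
use core::num::NonZeroUsize;

/// Hypothetical helper: validates the capacity once, returning None for 0.
#[inline]
fn checked_capacity(cap: usize) -> Option<NonZeroUsize> {
    NonZeroUsize::new(cap)
}

/// Same operation as `mask_modulo`, but the non-zero guarantee lives in the type.
#[inline]
fn mask_modulo_nonzero(cap: NonZeroUsize, index: usize) -> usize {
    // `usize % NonZeroUsize` cannot panic, because the divisor is never zero.
    index % cap
}
```

The trade-off is that the zero check moves to wherever `checked_capacity` is called, so this only helps if that happens once rather than on every index operation.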
In theory, this should only affect AllocRingBuffer with a non-power-of-two capacity. const_mask_modulo should compile to the same assembly as before, but who knows what the compiler might do. The benchmarks are a bit too noisy to tell.
The improvement for AllocRingBuffer with a non-power-of-two capacity is significant, but smaller than in https://github.com/NULLx76/ringbuffer/pull/114: the measured change ranges from -10% to -30%. @jonay2000's more consistent benchmarks might help to determine the actual difference.
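Whatever the compiler does with the generated code, the two variants should at least return identical results. A minimal sketch of such a check, assuming both functions are reproduced as above (`variants_agree` is a hypothetical helper, not something in the PR):

```rust
use core::num::NonZeroUsize;

fn mask_modulo(cap: usize, index: usize) -> usize {
    // SAFETY: callers in this sketch only pass cap >= 1.
    index % unsafe { NonZeroUsize::new_unchecked(cap) }
}

const fn const_mask_modulo(cap: usize, index: usize) -> usize {
    index % cap
}

/// Hypothetical check: true if both variants agree for every capacity in
/// 1..=max_cap and every index in 0..max_index.
fn variants_agree(max_cap: usize, max_index: usize) -> bool {
    (1..=max_cap).all(|cap| {
        (0..max_index).all(|index| mask_modulo(cap, index) == const_mask_modulo(cap, index))
    })
}
```

This says nothing about performance or the emitted assembly; it only guards against a behavioral difference between the two code paths.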
I'll leave it up to you which direction to take between this and #114 (if any).
This is an alternative to https://github.com/NULLx76/ringbuffer/pull/114. They are not compatible, sadly.