This change moves all the per-slot shared state (generation, ref count,
and removal state) into a single AtomicUsize. This has several
advantages:

- It reduces the overall complexity of the Slot type, as it no longer
  depends on the complex interactions of multiple atomics. The loom
  tests are now much faster (which is also a nice sign of relative
  complexity, IMO), and the code is easier to reason about.
- All interactions with the generation will now involve an RMW. Even
  when the generation is not being modified, we will always perform a
  read-modify-write with that generation to update some part of the
  state (such as the ref count or removal state). If this RMW fails
  because our view of the generation is stale, we'll re-acquire the
  state and see that the generation has changed. This ensures that
  the generation counter actually guards against reads with a stale
  generation.
- Generation ops no longer need to be sequentially consistent.
- Slots are a word smaller :)

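To make the second point concrete, here is a minimal sketch of the idea, not the actual code in this change. The bit layout, the constant names, and the methods `try_acquire`, `release`, `mark`, and `try_advance_generation` are all hypothetical, and the memory orderings are an assumption chosen only to show that nothing here needs `SeqCst`. Every operation is an RMW on the one word, so a stale generation makes the compare-exchange fail or the check reject.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical layout, for illustration only:
// [ generation (remaining bits) | ref count (14 bits) | state (2 bits) ]
pub const STATE_BITS: u32 = 2;
pub const REFS_BITS: u32 = 14;
pub const STATE_MASK: usize = (1 << STATE_BITS) - 1;
pub const REFS_SHIFT: u32 = STATE_BITS;
pub const REFS_MAX: usize = (1 << REFS_BITS) - 1;
pub const GEN_SHIFT: u32 = STATE_BITS + REFS_BITS;
pub const GEN_MAX: usize = usize::MAX >> GEN_SHIFT;

pub const PRESENT: usize = 0b00; // slot holds a live value
pub const MARKED: usize = 0b01; // slot is marked for removal

pub struct Slot {
    // Generation, ref count, and removal state all live in this one word.
    lifecycle: AtomicUsize,
}

pub fn pack(generation: usize, refs: usize, state: usize) -> usize {
    (generation << GEN_SHIFT) | (refs << REFS_SHIFT) | state
}
pub fn generation_of(word: usize) -> usize { word >> GEN_SHIFT }
pub fn refs_of(word: usize) -> usize { (word >> REFS_SHIFT) & REFS_MAX }
pub fn state_of(word: usize) -> usize { word & STATE_MASK }

impl Slot {
    pub fn new() -> Self {
        Slot { lifecycle: AtomicUsize::new(pack(0, 0, PRESENT)) }
    }

    /// Take a reference only if the slot is present at `expected_gen`.
    pub fn try_acquire(&self, expected_gen: usize) -> bool {
        let mut current = self.lifecycle.load(Ordering::Acquire);
        loop {
            let refs = refs_of(current);
            if generation_of(current) != expected_gen
                || state_of(current) != PRESENT
                || refs == REFS_MAX
            {
                return false;
            }
            let new = pack(expected_gen, refs + 1, PRESENT);
            // The CAS covers the generation too: if it changed under us,
            // the exchange fails and we re-check against the fresh word.
            match self.lifecycle.compare_exchange(
                current, new, Ordering::AcqRel, Ordering::Acquire,
            ) {
                Ok(_) => return true,
                Err(actual) => current = actual,
            }
        }
    }

    /// Drop a reference previously taken with `try_acquire`.
    pub fn release(&self) {
        let prev = self.lifecycle.fetch_sub(1 << REFS_SHIFT, Ordering::AcqRel);
        debug_assert!(refs_of(prev) > 0, "release without a matching acquire");
    }

    /// Mark the slot for removal, keeping the ref count intact.
    pub fn mark(&self, expected_gen: usize) -> bool {
        let mut current = self.lifecycle.load(Ordering::Acquire);
        loop {
            if generation_of(current) != expected_gen {
                return false;
            }
            let new = pack(expected_gen, refs_of(current), MARKED);
            match self.lifecycle.compare_exchange(
                current, new, Ordering::AcqRel, Ordering::Acquire,
            ) {
                Ok(_) => return true,
                Err(actual) => current = actual,
            }
        }
    }

    /// Advance the generation once no references remain.
    pub fn try_advance_generation(&self, expected_gen: usize) -> bool {
        let current = self.lifecycle.load(Ordering::Acquire);
        if generation_of(current) != expected_gen || refs_of(current) != 0 {
            return false;
        }
        let next_gen = expected_gen.wrapping_add(1) & GEN_MAX;
        self.lifecycle
            .compare_exchange(
                current, pack(next_gen, 0, PRESENT),
                Ordering::AcqRel, Ordering::Acquire,
            )
            .is_ok()
    }
}
```

In this sketch, a caller holding a key for generation 0 can no longer acquire the slot once `try_advance_generation(0)` has succeeded, because the generation check and the ref count update happen on the same word.
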
There isn't really any noticeable performance impact before/after. The
"after" benchmarks are generally about 2-5% faster across the board,
but I'm not sure if this is really significant (even though Criterion
claims it is).