Currently, BlockPool::head_global_freed_blocks is an RwLock<Option<BlockQueue<B>>>, even though BlockQueue already supports a lock-free atomic pop() operation, which is implemented using fetch_update.
We may rewrite BlockPool using the double-checked locking pattern. On the fast path, we simply do an atomic saturating subtraction to get the index of the next available element. If there is no available element, we fall back to the slow path. On the slow path, we lock global_freed_blocks and check whether head_global_freed_blocks is still empty (this is the "double check", needed because another thread may have entered the slow path, too, and refilled it in the meantime). If it is still empty, the current thread moves one BlockQueue from global_freed_blocks to head_global_freed_blocks. It populates the BlockQueue::data array before atomically setting BlockQueue::cursor to the proper length, so that other threads only see head_global_freed_blocks as re-populated once its contents are in place.
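The idea above can be sketched roughly as follows. This is a minimal, simplified model, not the actual BlockPool code: HeadQueue, CAP, the use of plain usize values as blocks, and a Mutex<Vec<Vec<usize>>> standing in for global_freed_blocks are all assumptions made for illustration. The sketch also makes one addition that the proposal does not mention: it packs a refill generation into the upper bits of the cursor and reads the slot inside the fetch_update closure, so that a pop racing with a refill is retried instead of returning a stale or duplicated block. SeqCst is used everywhere to keep the reasoning simple; weaker orderings may be possible but would need careful review.

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering::SeqCst};
use std::sync::Mutex;

const CAP: usize = 4; // hypothetical capacity of one queue
const LEN_MASK: u64 = 0xFFFF_FFFF; // low 32 bits of cursor: available blocks
const GEN_SHIFT: u32 = 32; // high 32 bits of cursor: refill generation

/// Hypothetical stand-in for the head BlockQueue (data array plus cursor).
struct HeadQueue {
    data: [AtomicUsize; CAP],
    cursor: AtomicU64,
}

impl HeadQueue {
    /// Fast path: lock-free pop. Fails (without writing) when the queue is empty.
    fn try_pop(&self) -> Option<usize> {
        let mut block = 0;
        let result = self.cursor.fetch_update(SeqCst, SeqCst, |c| {
            let len = (c & LEN_MASK) as usize;
            if len == 0 {
                return None; // "saturated": nothing left to take
            }
            // Read the slot before committing. If a refill intervenes, the
            // generation changes and the compare-exchange retries.
            block = self.data[len - 1].load(SeqCst);
            Some(c - 1)
        });
        result.ok().map(|_| block)
    }
}

/// Hypothetical, simplified pool: one head queue plus a locked list of queues.
struct BlockPool {
    head: HeadQueue,
    global: Mutex<Vec<Vec<usize>>>, // stand-in for global_freed_blocks
}

impl BlockPool {
    fn new(queues: Vec<Vec<usize>>) -> Self {
        assert!(queues.iter().all(|q| q.len() <= CAP));
        BlockPool {
            head: HeadQueue {
                data: std::array::from_fn(|_| AtomicUsize::new(0)),
                cursor: AtomicU64::new(0),
            },
            global: Mutex::new(queues),
        }
    }

    fn pop(&self) -> Option<usize> {
        loop {
            // Fast path: no lock taken.
            if let Some(block) = self.head.try_pop() {
                return Some(block);
            }
            // Slow path: take the lock, then check again.
            let mut global = self.global.lock().unwrap();
            let state = self.head.cursor.load(SeqCst);
            if state & LEN_MASK != 0 {
                continue; // another thread refilled the head; retry the fast path
            }
            // Still empty. While we hold the lock, nobody else writes the cursor.
            let queue = global.pop()?; // no queues left: the pool is exhausted
            // Populate the data array first...
            for (slot, block) in self.head.data.iter().zip(&queue) {
                slot.store(*block, SeqCst);
            }
            // ...then publish the new length together with a new generation.
            let generation = (state >> GEN_SHIFT).wrapping_add(1);
            self.head
                .cursor
                .store((generation << GEN_SHIFT) | queue.len() as u64, SeqCst);
        }
    }
}
```

In this model, threads share the pool through an Arc and call pop() until it returns None; every block handed to BlockPool::new should come out exactly once. Note that the head queue no longer needs to be wrapped in an RwLock<Option<...>>: an empty head is simply one whose cursor length is zero.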
I am not sure how much performance benefit this would bring, because it only affects the slow path. But for allocation-intensive workloads, it may speed up the allocation slow path, which still takes up a significant amount of time.