Rewrite BlockPool using double-checked locking pattern

Currently, BlockPool::head_global_freed_blocks is an RwLock<Option<BlockQueue<B>>>, even though BlockQueue already supports a lock-free atomic pop() operation, implemented using fetch_update.
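For reference, a lock-free cursor decrement built on fetch_update can look like the sketch below. It is only an illustration of the technique, not the actual BlockQueue code; the function name claim_index is hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical helper: atomically claim the index of the next available
/// element, or return None if the cursor is already zero.
fn claim_index(cursor: &AtomicUsize) -> Option<usize> {
    cursor
        // The closure returns None when the cursor is 0, which makes
        // fetch_update leave the value untouched and return Err.
        .fetch_update(Ordering::AcqRel, Ordering::Acquire, |c| c.checked_sub(1))
        .ok()
        // fetch_update yields the previous value; the claimed index is one below it.
        .map(|previous| previous - 1)
}
```

The cursor never goes below zero, so a failed claim is the signal to fall back to a slower path.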

We may rewrite BlockPool using the double-checked locking pattern. On the fast path, we simply do an atomic saturating subtraction on the cursor to get the index of the next available element. If there is no available element, we fall back to the slow path. On the slow path, we lock global_freed_blocks and check whether head_global_freed_blocks is still empty (this is the "double check", needed because another thread may have entered the slow path, too, and refilled it already). If it is still empty, the current thread moves one BlockQueue from global_freed_blocks to head_global_freed_blocks. It must populate the BlockQueue::data array before atomically setting BlockQueue::cursor to the proper length, so that other threads only observe a non-zero cursor once head_global_freed_blocks has been fully re-populated.
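A minimal sketch of this scheme follows, under several assumptions: SketchPool, CAPACITY, pack and unpack are hypothetical names, blocks are modelled as plain usize values, and global_freed_blocks is modelled as a Mutex<Vec<Vec<usize>>>. One deliberate difference from the description above: the cursor word also carries a refill epoch, and the fast path reads the slot before a compare-exchange. With a plain decrement, a thread could claim an index, be delayed, and then read the slot after another thread has refilled the array; the epoch is this sketch's way of rejecting that case, not something the proposal specifies.

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};
use std::sync::Mutex;

/// Hypothetical capacity of one queue.
const CAPACITY: usize = 4;

/// Pack a refill epoch (high 32 bits) and the number of available
/// entries (low 32 bits) into a single atomic word.
fn pack(epoch: u32, len: u32) -> u64 {
    ((epoch as u64) << 32) | len as u64
}

fn unpack(word: u64) -> (u32, u32) {
    ((word >> 32) as u32, word as u32)
}

pub struct SketchPool {
    cursor: AtomicU64,               // stands in for BlockQueue::cursor
    data: [AtomicUsize; CAPACITY],   // stands in for BlockQueue::data
    global: Mutex<Vec<Vec<usize>>>,  // stands in for global_freed_blocks
}

impl SketchPool {
    pub fn new(queues: Vec<Vec<usize>>) -> Self {
        SketchPool {
            cursor: AtomicU64::new(pack(0, 0)),
            data: std::array::from_fn(|_| AtomicUsize::new(0)),
            global: Mutex::new(queues),
        }
    }

    /// Fast path: no lock. Returns None when the head queue is empty.
    fn pop_fast(&self) -> Option<usize> {
        let mut word = self.cursor.load(Ordering::Acquire);
        loop {
            let (epoch, len) = unpack(word);
            if len == 0 {
                return None;
            }
            // Read the slot first; the value is only used if the
            // compare-exchange proves no refill happened in between.
            let value = self.data[(len - 1) as usize].load(Ordering::Relaxed);
            match self.cursor.compare_exchange_weak(
                word,
                pack(epoch, len - 1),
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return Some(value),
                Err(current) => word = current,
            }
        }
    }

    /// Pop one block, refilling the head queue under the lock if needed.
    pub fn pop(&self) -> Option<usize> {
        loop {
            if let Some(value) = self.pop_fast() {
                return Some(value);
            }
            // Slow path: take the lock, then check again.
            let mut global = self.global.lock().unwrap();
            let (epoch, len) = unpack(self.cursor.load(Ordering::Acquire));
            if len != 0 {
                continue; // another thread refilled; retry the fast path
            }
            let queue = global.pop()?; // nothing left anywhere
            assert!(queue.len() <= CAPACITY);
            // Populate the data array before publishing the new length.
            for (slot, block) in self.data.iter().zip(&queue) {
                slot.store(*block, Ordering::Relaxed);
            }
            // The Release store makes the slot writes visible to any thread
            // that observes the new cursor. Epoch wrap-around is ignored here.
            self.cursor
                .store(pack(epoch.wrapping_add(1), queue.len() as u32), Ordering::Release);
        }
    }
}
```

While the lock is held and the length is zero, no other thread can change the cursor, since the fast path only modifies a non-zero length and refills only happen under the lock. That is why a plain store is enough to publish the refill.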

It is not clear how much performance benefit this would bring, because BlockPool is only used on the allocation slow path. But for allocation-intensive workloads, the allocation slow path still takes up a significant amount of time, so this may speed it up noticeably.
