Open wks opened 1 year ago
Yeah. I thought about this today as well. We had an example in the Rust immix paper. The global space gives the ownership of Block
to each allocator, and when an allocator exhausts a Block
or when an allocator is requested to stop for a GC, they return the Block
back to the space. In a GC, the space sweeps the Block
lists. However, this is a simplified model, and what happens in MMTk is more complicated than that.
Note: I am not proposing any changes to https://github.com/mmtk/mmtk-core/pull/643. The algorithm works, and its style is consistent with other parts of mmtk-core. I am talking about a more general issue in mmtk-core.
TL;DR: We have
Chunk(Address)
,Block(Address)
,Line(Address)
andPage(Address)
. They implementCopy
, and has pointer semantics, but does not own the region they point to. This approach is not idiomatic in Rust. It does not prevent two pointers to point to the same region mutably, and the programmers need to add hard-to-reason-about locks even when they have&mut
access. We may need to introduce a set of owning types that own those regions.Background
Rust ownership and references
Rust has
&T
,&mut T
andT
.&mut T
is an exclusive (unique) borrowing reference, and it grants read-write access.By enforcing the borrowing rules, Rust programmers are confident that if a variable holds a
&mut T
, it is the only variable that refers to theT
, and it has read-write access to it. No other&mut T
, not even&T
, could coexist. For this reason, Rust often advertise as having "fearless concurrency". If you respect the borrowing rules, there can be no data race.Region types
The
Region
trait represents an aligned region of memory of a certain size. Instances includeChunk(Address)
,Block(Address)
,Line(Address)
andPage(Address)
. The PR https://github.com/mmtk/mmtk-core/pull/643 also introduces a new typeBlock(NonZeroUsize)
. All of them are wrappers of addresses.Region
instances implementCopy
. They can be copied just like C pointers. And they can be constructed from addresses. In this way, given an address range, we can iterate through all aligned regions in that range. Each region is constructed from the address.As we can see, in each iteration, we create a
Block
instance from an address, and access the block mutably (sweep
).This means any thread can gain read-write access to a
Block
without borrowing or getting ownership, and dangers await.The problem
The mysterious lock
When reviewing the PR https://github.com/mmtk/mmtk-core/pull/643, I noticed an interesting piece of code.
This code snippet tries to remove a
Block
from aBlockList
(a doubly-linked list), but another thread may have moved theBlock
to anotherBlockList
concurrently. So it has to try to lock theBlockList
in a loop until it successfully locked theBlockList
and theBlock
is still in thatBlockList
.This piece of code puzzled me in two ways.
*mut T
, why there is still a race? Why don't we have unique access to it? (OK. It is pointer, not reference. But it is still counter-intuitive to Rust programmers.)To answer question (2), I found the two places that tries to acquire the same lock. One (the shown snippet) is sweeping chunks, and the other is releasing mutators.
During the
Release
stage, the MarkSweep collector does two things concurrently.a. For all chunks, find all blocks in it that are unmarked, and release them from their BlockList (the shown snippet). b. For all mutators, transfer all blocks from the
available_blocks
BlockList and theconsumed_blocks
BlockList into theunswept_blocks
BlockList.When doing both concurrently, each block may be reached from two different places, namely through chunks and through mutators. If not synchronised properly, it may leave the doubly-linked list in an inconsistent state. Therefore, a spinlock is added to the
BlockList
struct, and used to prevent this race. The lock is obtained using thelock()
method, as shown in the snippet above.The lock solved the racing problem here. But I am curious why this problem still occurs in Rust. Shouldn't Rust protect us from this kind of race?
Why does this happen?
The root cause is that
Block
implementsCopy
and can be created from addresses. It has pointer semantics. It points to aBlock
, but does not represent ownership. So all the ownership-transfer and borrow-checking mechanisms stopped working in this use case. Any thread can gain mutable access by simply creating aBlock
instance.The SweepChunk work packet uses
RegionIterator<Block>
over theChunk
to gain access. It bypassed the borrow checker, allowing two threads to hold&mut BlockList
simultaneously.But why we don't need locks in other places?
If the
Chunk
andBlock
types are inherently unsafe, I would expect the PR https://github.com/mmtk/mmtk-core/pull/643 has lock/unlock operations everywhere. But it's not the case.BlockList
is usually accessed without locking.A
BlockList
represents a list ofBlock
of certain size. It has several mutable methods:remove()
,push()
,pop()
,append()
andreset()
.alloc()
, the mutator tries topop()
a block from theBlockList
in theavailable_blocks
array owned by the allocator which is owned by the current mutator. Since the mutator is callingpop()
on aBlockList
it already owns, it shall not cause data race.pop()
aBlock
from its localunswept_blocks
list, sweep it, and retry allocating from that block. It is still accessing blocks and block lists it owns, so there is still no data race.MarkSweepSpace::abandoned
which is aMutex<AbandonedBlockLists>
. This forces all mutators to acquire a lock before accessing the lists within, and the lock prevents data race.Observed from this analysis, we see our MiMalloc implementation is actually a good example of using ownership to prevent data race. The fast path accesses owned data so there can't be data race; the slow path accesses global data protected by Mutex so it prevents data race; once the blocks are transferred from the global pool to the local BlockList, subsequent allocations from that block will be local, and can't have data race, either.
Unfortunately, the ownership model is only in the developers' head, but not reflected in the code.
Block
implementsCopy
, so once assigned, the original owner may still keep aBlock
instance that represents the same block. In Rust, this shouldn't happen. Once the ownership is transferred, the original owner no longer has access to it. This ensures the global pool cannot give the sameBlock
to two different mutators.Throughout the code,
Block
is treated as a pointer. Therefore, whether to use&self
or&mut self
is only a matter of whether it mutates theblock.0
field of theBlock(NonZeroUsize)
instance. We can see thatBlock::sweep(&self)
actually takes a&self
as parameter, even though it writes into the block.I would not disapprove the MiMalloc PR for this, because this is common in MMTk core. The
immix::Block
is the same. All of its methods take&self
parameter, includingimmix::Block::sweep(&self)
.So what kind of
Block
type should we have?We need something that represents the ownership of a block. For this, the
Block
type should not implementClone
orCopy
.MarkSweepSpace::acquire_block
should be the only place whereBlock
instances can be created, and it createsBlock
from the address returned from PageResource when allocating "fresh" blocks.Block
instances are destroyed inBlock::attempt_release(self)
(:thumbup: to the owningself
parameter), when its underlying pages are released back to the PageResource. When transferring aBlock
from oneBlockList
to another, or from the global pool to the mutator, Rust's ownership model will ensure no two mutators share the sameBlock
, and even no twoBlockList
can hold the sameBlock
at the same time.What about sweeping all blocks in a
Chunk
? From Rust's point of view, the mutator owns the block, because theMarkSweepSpace
createsBlock
from the address returned from PageResource, and transfers it to the mutator. Mutators, however, can abandon blocks so they are transferred toMarkSweepSpace
. If eachBlock
is owned by either a mutator orMarkSweepSpace
, I would argue that it is completely unnecessary to sweep blocks by iterating chunks. Instead, we can simply iterate through each block in each mutator and theMarkSweepSpace
, see if it is marked, and remove it from itsBlockList
. That's all the blocks out there. This can be parallelised by creating one work packet for each mutator, and one more for the globalMarkSweepSpace
. And we don't need any lock on theBlockList
, because eachBlockList
is also owned by exactly one mutator (or theMarkSweepSpace
) and the race is impossible.Known issues
Is this approach applicable to other GC algorithms, like Immix?