mmtk / mmtk-core

Memory Management ToolKit
https://www.mmtk.io

Introduce WorkerLocal #1233

Open wks opened 1 week ago

wks commented 1 week ago

Current use pattern

One use pattern seen in MMTk-core is a vector where each element holds data accessible by exactly one worker.

One example is the worker_local_freed_blocks: Vec<BlockQueue<B>> field of BlockPool. Each BlockQueue in the vector is accessed by exactly one GCWorker thread, which selects its element using crate::scheduler::current_worker_ordinal() as the index. In this particular case, BlockQueue has operations that are not thread-safe and are therefore marked as unsafe. One example is push_relaxed.

impl<B: Region> BlockPool<B> {
    pub fn push(&self, block: B) {
        self.count.fetch_add(1, Ordering::SeqCst);
        let id = crate::scheduler::current_worker_ordinal(); // Get the index
        let failed = unsafe {
            self.worker_local_freed_blocks[id]
                .push_relaxed(block) // Perform unsafe operation in the `unsafe` block.
                .is_err()
        };
        // handle failure here...
    }
}

p.s. Interestingly, this is the sole call site of the current_worker_ordinal() function in mmtk-core.

However, if we always obey the use pattern that each element of the self.worker_local_freed_blocks vector is only accessed by the corresponding GC worker, no two threads ever touch the same element, and nothing at the call site needs to be unsafe.

Proposal

I propose a data structure WorkerLocal<T> that wraps Vec<T>.

Every worker can get its own element safely.

let my_elem: &T = self.some_worker_local.my();
my_elem.query_something();

Internally the my() method will call crate::scheduler::current_worker_ordinal() and use unsafe to return a reference.
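A minimal sketch of what the shared-access half could look like. Everything here is illustrative rather than the proposed implementation: WORKER_ORDINAL and the local current_worker_ordinal() are hypothetical stand-ins for crate::scheduler::current_worker_ordinal(), and the constructor signature is an assumption. The sketch uses a bounds-checked index and leaves out whatever unsafe the real my() would need.

```rust
use std::cell::Cell;

thread_local! {
    // Hypothetical stand-in for the ordinal mmtk-core keeps per GC worker.
    static WORKER_ORDINAL: Cell<usize> = Cell::new(0);
}

// Hypothetical stand-in for crate::scheduler::current_worker_ordinal().
fn current_worker_ordinal() -> usize {
    WORKER_ORDINAL.with(|o| o.get())
}

/// One slot per GC worker; slot `i` is meant to be used only by worker `i`.
pub struct WorkerLocal<T> {
    slots: Vec<T>,
}

impl<T> WorkerLocal<T> {
    /// Builds one slot per worker, calling `init` with each ordinal.
    /// (Assumed constructor shape; not part of the proposal text.)
    pub fn new(num_workers: usize, init: impl FnMut(usize) -> T) -> Self {
        Self {
            slots: (0..num_workers).map(init).collect(),
        }
    }

    /// Returns the calling worker's own slot.
    /// Panics if the ordinal is out of range (bounds-checked index).
    pub fn my(&self) -> &T {
        &self.slots[current_worker_ordinal()]
    }
}
```

With this shape, a call site only writes self.some_worker_local.my() and never handles the index itself.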

The element may also be accessed mutably.

let my_elem_mut: &mut T = self.some_worker_local.my_mut();
my_elem_mut.change_something(42);

However, my_mut may have to be unsafe as well, because it cannot check whether a mutable reference for the same worker has already been obtained. For example:

let my_elem_mut1: &mut T = self.some_worker_local.my_mut();
let my_elem_mut2: &mut T = self.some_worker_local.my_mut(); // not safe
my_elem_mut1.change_something(42);
my_elem_mut2.change_something(43);

Having two live mutable references to the same object violates Rust's aliasing rules and is undefined behavior.
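One way to express this, sketched under assumptions, is to store each slot in an UnsafeCell and make my_mut an unsafe fn whose contract puts the aliasing obligation on the caller. As before, WORKER_ORDINAL and the local current_worker_ordinal() are hypothetical stand-ins for mmtk-core's function, and the constructor and Sync impl are assumptions, not something the proposal specifies.

```rust
use std::cell::{Cell, UnsafeCell};

thread_local! {
    // Hypothetical stand-in for the ordinal mmtk-core keeps per GC worker.
    static WORKER_ORDINAL: Cell<usize> = Cell::new(0);
}

// Hypothetical stand-in for crate::scheduler::current_worker_ordinal().
fn current_worker_ordinal() -> usize {
    WORKER_ORDINAL.with(|o| o.get())
}

/// One slot per GC worker, each wrapped in UnsafeCell so it can be
/// mutated through a shared `&WorkerLocal<T>`.
pub struct WorkerLocal<T> {
    slots: Vec<UnsafeCell<T>>,
}

// SAFETY (assumed invariant): slot `i` is only ever touched by the
// thread whose ordinal is `i`, so slots are never shared across threads.
unsafe impl<T: Send> Sync for WorkerLocal<T> {}

impl<T> WorkerLocal<T> {
    /// Assumed constructor shape; not part of the proposal text.
    pub fn new(num_workers: usize, init: impl FnMut(usize) -> T) -> Self {
        Self {
            slots: (0..num_workers).map(|i| i).map(init).map(UnsafeCell::new).collect(),
        }
    }

    /// Returns a mutable reference to the calling worker's slot.
    ///
    /// # Safety
    /// The caller must ensure that ordinals are unique per thread and that
    /// no other reference to this worker's slot is live while the returned
    /// reference is in use.
    #[allow(clippy::mut_from_ref)]
    pub unsafe fn my_mut(&self) -> &mut T {
        unsafe { &mut *self.slots[current_worker_ordinal()].get() }
    }
}
```

This does not remove the hazard shown above; it only moves it into a documented contract, so that each call site has to state in an unsafe block that it holds no second reference.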

Related work

std::thread_local! is more general. But since it can be used by all threads, it has to be hashed by the TID (or more expensive handles), and will not be as efficient as an array indexed by GCWorker::ordinal. Only shared references &T can be obtained from std::thread_local!, so typical use cases of thread_local! usually involve Cell or RefCell, which involve one run-time check for each access.
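For comparison, a small sketch of the thread_local! plus RefCell pattern described above. FREED, push_freed and nested_borrow_fails are hypothetical names used only for illustration; the standard-library calls (LocalKey::with, RefCell::borrow_mut, RefCell::try_borrow_mut) are real.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical per-thread list, standing in for per-worker state.
    static FREED: RefCell<Vec<u32>> = RefCell::new(Vec::new());
}

/// Pushes onto the calling thread's list and returns its new length.
fn push_freed(block: u32) -> usize {
    // `with` only hands out `&RefCell<..>`, so mutation goes through
    // `borrow_mut`, which checks the borrow flag at run time.
    FREED.with(|cell| {
        let mut list = cell.borrow_mut();
        list.push(block);
        list.len()
    })
}

/// Shows the run-time check: a second mutable borrow is rejected
/// while the first guard is still alive.
fn nested_borrow_fails() -> bool {
    FREED.with(|cell| {
        let _guard = cell.borrow_mut();
        cell.try_borrow_mut().is_err()
    })
}
```

Unlike the my_mut case, the double mutable borrow here is caught at run time, at the cost of a borrow-flag check on every access.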