**qinsoon** opened this issue 4 years ago
TL;DR: The VM controls the lifetimes of `MMTK` and `Mutator`, not mmtk-core. For safety, mmtk-core needs to do dynamic checking like `RefCell` does. However, we need to find a container type `C<T>` that satisfies the properties discussed below.
I have been thinking about the proper way to express the lifetimes of MMTK, Plan and Mutator instances. My current conclusion is that their lifetimes are dynamic and cannot be statically checked by Rust. More precisely, their lifetimes are controlled by the VM, not by mmtk-core itself.
Currently mmtk-core requires the MMTK instance to be `'static`. This causes no correctness issues now, but when we introduce multiple MMTK instances, the lifetimes of MMTK, Plan and Mutator become important.
In theory, `lifetime(Plan) == lifetime(MMTK)`, and `∀ Mutator. lifetime(Mutator) < lifetime(Plan)` (`a < b` means lifetime `a` is contained within lifetime `b`). Violating this (destroying the MMTK instance while a mutator is still running) results in an error. When `lifetime(MMTK) == 'static`, the above predicates always hold.
But MMTK is, by nature, controlled by the VM via the API. So the lifetime of MMTK and Plan is delimited by:

- `create_mmtk` (a hypothetical API)
- `destroy_mmtk` (a hypothetical API)

and the lifetime of Mutator is delimited by:

- `bind_mutator`
- `destroy_mutator`
So if a buggy VM insists on calling the API in the following sequence:

```c
MMTk* mmtk = create_mmtk(&config);
MMTk_Mutator* mutator = bind_mutator(tls);
destroy_mmtk(mmtk);       // ERROR: mutator is still alive
destroy_mutator(mutator); // too late
```
then the third line, `destroy_mmtk(mmtk)`, will result in an error, but the Rust mmtk-core has no static way to prevent C code from making such a mistake. An unsafe API can just let the third line have undefined behaviour, while a "safe" API may return an error code, or just panic (at least it fails fast).
This means Rust's static lifetime checker cannot help. We could just use unsafe pointers. However, it is possible to dynamically check the ownership, and panic when a lifetime error is detected.
## A `C<T>` that dynamically checks for lifetime

`RefCell` is an example of dynamic borrow checking. It internally maintains a borrow status which includes a borrow counter. The user may use the `borrow` or `borrow_mut` methods to borrow the content immutably or mutably. But if any invocation of `borrow` or `borrow_mut` violates the rule that "there may be either many immutable borrows or exactly one mutable borrow, but not both", it panics.
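As a concrete illustration, here is a minimal sketch of `RefCell`'s dynamic checking (using `try_borrow_mut` so the violation is observed as an `Err` instead of a panic):

```rust
use std::cell::RefCell;

fn main() {
    let cell = RefCell::new(42);
    let a = cell.borrow(); // first immutable borrow: OK
    let b = cell.borrow(); // many immutable borrows: OK
    // A mutable borrow while immutable borrows are live violates the rule.
    // borrow_mut() would panic here; try_borrow_mut() reports the same
    // violation as an Err instead.
    assert!(cell.try_borrow_mut().is_err());
    drop(a);
    drop(b);
    // Once all immutable borrows are gone, a mutable borrow succeeds.
    *cell.borrow_mut() += 1;
    assert_eq!(*cell.borrow(), 43);
}
```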
Similarly, we could use a kind of container (let's call it `C<T>`, like `Box<T>`, `Arc<T>` or `RefCell<T>`) that ensures:

1. `C<T>` has a unique owner. The owner determines when the object is freed.
2. `C<T>` has a `borrow()` method. The `borrow()` method returns a borrow handle (let's call it `CRef<T>`, just like `RefCell<T>::borrow()` returns a `Ref<T>`) that can be used to access the content.
3. `C<T>` can be sent to other threads (`impl Send`), and the content of `C<T>` can be accessed by multiple threads (`impl Sync`).
4. `C<T>` maintains a counter that knows how many `CRef<T>` exist. If `C<T>` itself is dropped while any `CRef<T>` still exists, it reports an error.

In this project, the VM creates an MMTK instance which is contained in a `C<MMTK>`. A `Mutator` can hold a `CRef<MMTK>` to access MMTK safely. If the client attempts to drop the `C<MMTK>` when any mutator is still alive, it reports an error.
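To make the requirements concrete, here is a rough sketch of what such a `C<T>` could look like. All names (`C`, `CRef`) are hypothetical, and this toy version piggybacks on `Arc` internally, so it only detects and reports the violation in property (4) rather than freeing the content at exactly the right moment:

```rust
use std::ops::Deref;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

struct CInner<T> {
    value: T,
    borrows: AtomicUsize, // how many CRef<T> handles currently exist
}

/// The unique owner (property 1). Dropping it is the moment of destruction.
struct C<T>(Arc<CInner<T>>);

/// A counted borrow handle (property 2), analogous to Ref<T>.
struct CRef<T>(Arc<CInner<T>>);

impl<T> C<T> {
    fn new(value: T) -> Self {
        C(Arc::new(CInner { value, borrows: AtomicUsize::new(0) }))
    }

    /// Hand out a handle and bump the counter (the bookkeeping for property 4).
    fn borrow(&self) -> CRef<T> {
        self.0.borrows.fetch_add(1, Ordering::SeqCst);
        CRef(Arc::clone(&self.0))
    }
}

impl<T> Drop for C<T> {
    fn drop(&mut self) {
        // Property 4: fail fast if any handle is still alive.
        let n = self.0.borrows.load(Ordering::SeqCst);
        if n > 0 {
            panic!("C<T> is dropped while {} CRef<T> handle(s) still exist", n);
        }
    }
}

impl<T> Drop for CRef<T> {
    fn drop(&mut self) {
        self.0.borrows.fetch_sub(1, Ordering::SeqCst);
    }
}

impl<T> Deref for CRef<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0.value
    }
}

fn main() {
    let mmtk = C::new(String::from("an MMTK instance"));
    let handle = mmtk.borrow(); // e.g. held by a Mutator
    assert_eq!(&*handle, "an MMTK instance");
    drop(handle); // the mutator is destroyed first...
    drop(mmtk);   // ...so dropping the owner succeeds without panicking
    // Reversing the two drops above would panic in C::drop.
}
```

Property (3) falls out of `Arc`'s `Send`/`Sync` behaviour when `T: Send + Sync`; the real design question is what the non-toy version should do about actually freeing the content.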
However, I haven't found a suitable `C`.

`Box<T>` satisfies (1) in that it is uniquely owned, but it only satisfies (1). It can be borrowed, but only statically, in a scoped fashion.
`Arc<T>` does not satisfy (1) because there is no unique owner: all `Arc<T>` instances are equal. If any `Arc<T>` is still alive, the content is alive. But we want to kill the `MMTK` instance immediately when the VM calls `destroy_mmtk`, not when the last mutator that holds an `Arc<MMTK>` exits. Using `Weak<T>` with `Arc<T>` is overkill.
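A small example of the problem with `Arc` alone:

```rust
use std::sync::Arc;

fn main() {
    let mmtk = Arc::new(String::from("MMTK instance"));
    let held_by_mutator = Arc::clone(&mmtk); // e.g. a mutator keeps a handle
    drop(mmtk); // the VM thinks it destroyed MMTK here...
    // ...but the content is still alive because another Arc exists:
    assert_eq!(held_by_mutator.as_str(), "MMTK instance");
    assert_eq!(Arc::strong_count(&held_by_mutator), 1);
}
```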
The standard `std::cell::RefCell` is not thread-safe. There is an atomic_refcell crate that provides the thread-safe `AtomicRefCell`. It satisfies (1) and (2), but not (3) or (4). Both `RefCell` and `AtomicRefCell` are intended for interior mutability, not lifetime management. They need to be contained in another container that can be shared between threads (such as `Arc<AtomicRefCell<T>>`). Like `RefCell`, the `AtomicRef<T>` returned by `AtomicRefCell<T>::borrow()` can only be used in its scope, i.e. we cannot create an `AtomicRef<T>` and send it to another thread. Like `RefCell`, `AtomicRefCell<T>` does not check whether any `AtomicRef<T>` still exists when the `AtomicRefCell<T>` is dropped.
It remains an open question what `C` we should use.
For your example code, Rust can statically check it: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b0b44fab68cb343b8978f5eb2508b92f
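For readers who don't follow the link, the idea (sketched here with hypothetical names, not the actual playground code) is that if each `Mutator` borrows the `MMTK` instance, the borrow checker rejects the bad ordering at compile time:

```rust
struct MMTK;

struct Mutator<'a> {
    mmtk: &'a MMTK, // the mutator borrows the MMTK instance
}

fn create_mmtk() -> MMTK {
    MMTK
}

fn bind_mutator(mmtk: &MMTK) -> Mutator<'_> {
    Mutator { mmtk }
}

fn main() {
    let mmtk = create_mmtk();
    let mutator = bind_mutator(&mmtk);
    let _plan: &MMTK = mutator.mmtk; // the mutator can reach MMTK via the borrow
    // drop(mmtk); // would not compile: `mmtk` is still borrowed by `mutator`
    drop(mutator); // destroy the mutator first...
    drop(mmtk);    // ...then destroying MMTK is fine
}
```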
@qinsoon Yes. This is helpful if the VM itself is written in Rust. But what if the `main` function in your snippet is written in C++? Of course we can use `unsafe` in the VM binding, and cast the `MMTK*` pointer to a reference with an assumed lifetime. See this modified code:
The code above shows a VM calling into MMTK via a C-style binding. This time, if the VM calls the API functions in the wrong order, it still compiles, but hits a runtime error: the code prints a garbage value, and Valgrind can catch the use-after-free bug.
Using `unsafe` in VM bindings is not wrong. If the VM calls the functions in the wrong order, it is the VM's fault after all. But I still think some dynamic checking can be helpful. Instead of invoking undefined behaviour (hopefully a crash, but it may stay quiet for a long time before it suddenly starts behaving mysteriously) when the `c_destroy_mmtk` function in the above example is called, it should give an error message like:
```
thread 'main' panicked at 'MMTK is destroyed when there are still 5 Mutators pointing to it', path/to/source_code.rs:30:5
stack backtrace:
...
```
Our current `VMBinding` methods are all static methods, such as `ActivePlan::number_of_mutators()`. I think we need to turn them into instance methods to support multiple instances, such as `ActivePlan::number_of_mutators(&self)`, and add a field like `vm: VM` to `MMTK<VM>`.
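A rough sketch of that change (the trait and types here are simplified and hypothetical, not the real mmtk-core signatures):

```rust
// Before: a static trait method, which cannot distinguish instances.
// trait ActivePlan { fn number_of_mutators() -> usize; }

// After: an instance method, so each MMTK<VM> can answer for itself.
trait ActivePlan {
    fn number_of_mutators(&self) -> usize;
}

struct MyVM {
    mutators: usize,
}

impl ActivePlan for MyVM {
    fn number_of_mutators(&self) -> usize {
        self.mutators
    }
}

struct MMTK<VM: ActivePlan> {
    vm: VM, // the proposed new field
}

fn main() {
    // Two MMTK instances, each with its own VM state:
    let a = MMTK { vm: MyVM { mutators: 2 } };
    let b = MMTK { vm: MyVM { mutators: 5 } };
    assert_eq!(a.vm.number_of_mutators(), 2);
    assert_eq!(b.vm.number_of_mutators(), 5);
}
```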
Set the priority to low. We should raise the priority if we work on any VM that needs this.
Though we have done some refactoring to allow multiple instances, we currently have no use case for multiple MMTk instances, and it is unclear what we are missing to properly support them. Porting to V8 could be a good chance to figure this out.