What is a custom tracing unit?
A custom tracing unit is a unit of work that processes one or more object graph edges. Examples include:
Process one slot, tracing the reference in it and updating the slot.
Process a contiguous range of slots (i.e. MemorySlice), tracing and updating each slot.
Process a native object (a struct held in a malloc cell) that is associated with an in-heap object and contains object references. For example, some Ruby objects have off-heap buffers.
Process a native object (a struct held anywhere outside the GC heap) that is not part of any heap object. Examples include the indirect reference table used by JNI, the weak reference table, finalization table, string table, and other strong or weak tables.
Process a stack, tracing and updating each root slot, in cases where the stack slots cannot be represented using trait Edge (or trait Slot after the rename) for various reasons.
The common part is that all of them need to call trace_object to trace the edges, and create ScanObjects work packets for newly visited children.
What differs is the size of the "unit". It can be a single slot, several objects to be scanned, or a whole stack.
In theory, Edge (Slot) and MemorySlice are custom tracing units
Yes. But we don't want to replace them yet. They work pretty well in MMTk for now.
Representation of a custom tracing unit
As a closure
The simplest way to represent such a thing is a FnOnce(impl ObjectTracer), or an equivalent trait with a single method that receives the tracer.
That is, it is a runnable thing that can contain arbitrary data as context, and it uses an ObjectTracer (which provides trace_object) when running.
But the key point is that it is not given a reference to ObjectTracer until it is executed. This is important because we can only call trace_object at certain times, such as TPinningClosure, PinningRootsTrace, Closure, and *RefClosure. Importantly, we cannot call trace_object in Prepare.
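The closure representation described above can be sketched as follows. This is only an illustration under simplifying assumptions: `ObjectReference` and `ObjectTracer` are toy stand-ins for the mmtk-core types, and `CustomTracingUnit`, `ToyTracer`, `boxed` and `demo` are hypothetical names that do not exist in mmtk-core. The point it shows is that the unit carries its own context, but only receives a tracer at the moment it is run.

```rust
/// Toy stand-in for MMTk's `ObjectReference`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ObjectReference(pub usize);

/// Simplified stand-in for the tracer: traces one edge and returns the
/// (possibly moved) target.
pub trait ObjectTracer {
    fn trace_object(&mut self, object: ObjectReference) -> ObjectReference;
}

/// Hypothetical trait form of a custom tracing unit.
pub trait CustomTracingUnit: Send {
    /// The tracer is only lent to the unit for the duration of this call.
    fn run(self: Box<Self>, tracer: &mut dyn ObjectTracer);
}

/// Any suitable closure is a custom tracing unit.
impl<F: FnOnce(&mut dyn ObjectTracer) + Send> CustomTracingUnit for F {
    fn run(self: Box<Self>, tracer: &mut dyn ObjectTracer) {
        (*self)(tracer)
    }
}

/// Helper that boxes a closure as a unit (hypothetical).
pub fn boxed<'a>(
    f: impl FnOnce(&mut dyn ObjectTracer) + Send + 'a,
) -> Box<dyn CustomTracingUnit + 'a> {
    Box::new(f)
}

/// Toy tracer that pretends to be a copying GC: every object "moves" by 0x1000.
pub struct ToyTracer {
    pub visited: Vec<ObjectReference>,
}

impl ObjectTracer for ToyTracer {
    fn trace_object(&mut self, object: ObjectReference) -> ObjectReference {
        self.visited.push(object);
        ObjectReference(object.0 + 0x1000)
    }
}

pub fn demo() -> (Vec<ObjectReference>, Vec<ObjectReference>) {
    // Context captured by the unit: a native table holding object references.
    let mut table = vec![ObjectReference(0x10), ObjectReference(0x20)];
    // Creating the unit does not need a tracer.
    let unit = boxed(|tracer| {
        for slot in table.iter_mut() {
            *slot = tracer.trace_object(*slot);
        }
    });
    // Later, at a point where tracing is allowed, a tracer is created and lent.
    let mut tracer = ToyTracer { visited: Vec::new() };
    unit.run(&mut tracer);
    (tracer.visited, table)
}
```

Because the closure only sees the tracer as a borrowed argument, it cannot keep it around after `run` returns, which is the scoping property the text relies on.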
As a work packet
Because it is a unit of work, we can wrap it in a work packet.
But because a "custom tracing unit" can be small, we can pack multiple such units into one packet.
In fact, custom tracing units can be nested. One big unit can contain multiple small units. For example, we can aggregate 4096 Edge (Slot) instances into one work packet and process them in one go. (That's what our ProcessEdgesWork currently does.) We can also put one whole stack into one custom tracing unit, and it can be further split into the scanning of each stack frame.
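A minimal sketch of packing and splitting, under stated assumptions: `Unit`, `TracingPacket`, `make_unit`, `split_stack`, `pack`, `ToyTracer` and `demo` are hypothetical names, object references are plain `usize` values, and stack frames are modelled as fixed-length chunks of a slice (real frames vary in size). It is not how ProcessEdgesWork is implemented; it only shows one large unit (a stack) being split into smaller units that are then grouped into packets.

```rust
/// Simplified tracer interface; references are plain `usize` here.
pub trait ObjectTracer {
    fn trace_object(&mut self, object: usize) -> usize;
}

/// A small custom tracing unit that may borrow VM data for `'a`.
pub type Unit<'a> = Box<dyn FnOnce(&mut dyn ObjectTracer) + Send + 'a>;

fn make_unit<'a>(f: impl FnOnce(&mut dyn ObjectTracer) + Send + 'a) -> Unit<'a> {
    Box::new(f)
}

/// Hypothetical work packet that processes several units in one go.
pub struct TracingPacket<'a> {
    units: Vec<Unit<'a>>,
}

impl TracingPacket<'_> {
    pub fn execute(self, tracer: &mut dyn ObjectTracer) {
        for unit in self.units {
            unit(&mut *tracer);
        }
    }
}

/// Split one big unit (a whole stack) into one unit per frame.
/// Panics if `frame_len` is zero.
pub fn split_stack<'a>(stack: &'a mut [usize], frame_len: usize) -> Vec<Unit<'a>> {
    stack
        .chunks_mut(frame_len)
        .map(|frame| {
            make_unit(move |tracer| {
                for slot in frame.iter_mut() {
                    *slot = tracer.trace_object(*slot);
                }
            })
        })
        .collect()
}

/// Group small units into packets holding at most `capacity` units each.
pub fn pack<'a>(units: Vec<Unit<'a>>, capacity: usize) -> Vec<TracingPacket<'a>> {
    assert!(capacity > 0, "a packet must hold at least one unit");
    let mut packets = Vec::new();
    let mut rest = units.into_iter().peekable();
    while rest.peek().is_some() {
        let units: Vec<Unit<'a>> = rest.by_ref().take(capacity).collect();
        packets.push(TracingPacket { units });
    }
    packets
}

/// Toy tracer that "moves" each object by one and counts calls.
pub struct ToyTracer {
    pub traced: usize,
}

impl ObjectTracer for ToyTracer {
    fn trace_object(&mut self, object: usize) -> usize {
        self.traced += 1;
        object + 1
    }
}

pub fn demo() -> (usize, usize, Vec<usize>) {
    let mut stack = vec![10, 20, 30, 40, 50];
    // Three frame units (2 + 2 + 1 slots) packed two per packet.
    let packets = pack(split_stack(&mut stack, 2), 2);
    let num_packets = packets.len();
    let mut tracer = ToyTracer { traced: 0 };
    for packet in packets {
        packet.execute(&mut tracer);
    }
    (num_packets, tracer.traced, stack)
}
```

In a real collector each packet would be handed to a different GC worker; here they run sequentially to keep the sketch short.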
Why is it useful?
It complements our current root-scanning mechanisms.
Currently, VM bindings deliver a list of root edges to mmtk-core as either a Vec<Edge> (Vec<Slot>) or a Vec<ObjectReference> (a list of target objects which need to be pinned). Only Edge (Slot) can be updated. That's not general enough. Some VMs, such as Ruby and Android, cannot represent some root edges as Edge (Slot). Those VMs need to access trace_object directly.
Instead, we can let the VM deliver a custom tracing unit for a subset of global roots, such as one stack. We introduce an extra method to RootsWorkFactory:
trait RootsWorkFactory {
    /// Create a work packet which will be executed in the `Closure` bucket.
    /// When executed by a worker, the worker will instantiate `OT` and call `callback` with a reference to it.
    /// Newly visited objects from `OT` will be added to a `ScanObjects` work packet in the `Closure` bucket.
    fn custom_tracing_unit<OT: ObjectTracer>(&mut self, callback: impl FnOnce(&mut OT));
}
Calling custom_tracing_unit does not create an ObjectTracer immediately. Instead, it creates a work packet which will be executed in Closure. When that work packet is executed, it creates an OT instance using the ProcessEdgesWork implementation selected by the current GC, calls the callback with a reference to the OT, and then flushes it.
For example, the Ruby VM binding can call custom_tracing_unit with a callback that calls gc_update_references. gc_update_references will call trace_object to update the roots and assign the updated object references back to the root fields.
As another example, the Android ART binding can call custom_tracing_unit with a callback that scans the stack. It uses whatever ART provides to identify reference slots on the stack and calls trace_object to update them. Note that this happens in Closure. Although we usually scan stacks in Prepare, it is OK to do it in Closure because it is still enough to keep the objects pointed to by root edges alive.
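The deferred behaviour described above (enqueue first, create the tracer only when the packet runs, then flush) can be illustrated with a toy scheduler. Everything here is an assumption made for the sketch: `ToyScheduler`, `ToyTracer`, `run_closure_bucket` and `demo` are hypothetical, the tracer is a simple non-moving marker, and a `Vec<ObjectReference>` stands in for a `ScanObjects` work packet. It is not mmtk-core's scheduler.

```rust
use std::collections::HashSet;

/// Toy stand-in for MMTk's `ObjectReference`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ObjectReference(pub usize);

/// Toy non-moving tracer: marks objects and buffers first-time visits.
pub struct ToyTracer {
    marked: HashSet<ObjectReference>,
    newly_visited: Vec<ObjectReference>,
}

impl ToyTracer {
    pub fn trace_object(&mut self, object: ObjectReference) -> ObjectReference {
        if self.marked.insert(object) {
            self.newly_visited.push(object);
        }
        object // a moving plan would return the new address instead
    }
}

type Callback = Box<dyn FnOnce(&mut ToyTracer) + Send>;

/// Hypothetical scheduler with a single `Closure` bucket.
#[derive(Default)]
pub struct ToyScheduler {
    closure_bucket: Vec<Callback>,
    marked: HashSet<ObjectReference>,
    /// Each inner `Vec` stands in for one `ScanObjects` work packet.
    pub scan_objects_packets: Vec<Vec<ObjectReference>>,
}

impl ToyScheduler {
    /// Mirrors the proposed method: it only enqueues; no tracer exists yet.
    pub fn custom_tracing_unit(&mut self, callback: impl FnOnce(&mut ToyTracer) + Send + 'static) {
        self.closure_bucket.push(Box::new(callback));
    }

    /// Runs the bucket: create a tracer, lend it to the callback, then flush.
    pub fn run_closure_bucket(&mut self) {
        for callback in std::mem::take(&mut self.closure_bucket) {
            // The mark set is moved in and out to keep the sketch lifetime-free;
            // a real tracer would reference shared plan state instead.
            let mut tracer = ToyTracer {
                marked: std::mem::take(&mut self.marked),
                newly_visited: Vec::new(),
            };
            callback(&mut tracer);
            self.marked = tracer.marked;
            // "Flush": newly visited objects become a ScanObjects packet.
            if !tracer.newly_visited.is_empty() {
                self.scan_objects_packets.push(tracer.newly_visited);
            }
        }
    }
}

pub fn demo() -> (usize, Vec<Vec<ObjectReference>>) {
    let mut scheduler = ToyScheduler::default();
    // Roots owned by the "VM"; one object is referenced twice.
    let mut roots = vec![ObjectReference(1), ObjectReference(2), ObjectReference(1)];
    scheduler.custom_tracing_unit(move |tracer| {
        for root in roots.iter_mut() {
            *root = tracer.trace_object(*root);
        }
    });
    let packets_before = scheduler.scan_objects_packets.len();
    scheduler.run_closure_bucket();
    (packets_before, scheduler.scan_objects_packets)
}
```

Nothing is traced at the time `custom_tracing_unit` is called; the roots are only visited once the bucket runs, and the duplicate root is enqueued for scanning only once.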
It helps scanning complicated objects.
Many objects in Ruby are implemented as off-heap C objects, and are scanned using functions with statements like obj->field = trace_object(obj_field). (See https://github.com/mmtk/mmtk-core/issues/710 for more details). Currently, we use Scanning::scan_object_and_trace_edges to trace all edges logically starting from one object (that includes all fields of the in-heap part of the object, and the fields in off-heap structs, too). It's problematic if an object involves many off-heap objects. That usually makes the ScanObjects work packet too large to parallelize properly.
With custom tracing units, we can offload each native struct to one separate custom tracing unit, and they can be split into multiple work packets. What we need is something similar to RootsWorkFactory::custom_tracing_unit, but callable during tracing.
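One possible shape for a unit factory that is callable during tracing is sketched below. It is purely illustrative: `TracingContext`, `spawn_unit`, `run_pending`, `NativeStruct`, `scan_object` and `demo` are hypothetical names, references are plain `usize` values, and the pending units are drained sequentially, whereas a real implementation would distribute them to GC workers as separate work packets.

```rust
use std::collections::VecDeque;

/// A unit spawned while tracing; it gets the context again when it runs.
pub type Unit = Box<dyn FnOnce(&mut TracingContext) + Send>;

/// Hypothetical context that both traces edges and accepts new units.
pub struct TracingContext {
    pub traced: Vec<usize>,
    pending: VecDeque<Unit>,
}

impl TracingContext {
    pub fn new() -> Self {
        TracingContext { traced: Vec::new(), pending: VecDeque::new() }
    }

    /// Toy non-moving trace: records the object and returns it unchanged.
    pub fn trace_object(&mut self, object: usize) -> usize {
        self.traced.push(object);
        object
    }

    /// Hypothetical counterpart of `custom_tracing_unit`, callable during tracing.
    pub fn spawn_unit(&mut self, unit: impl FnOnce(&mut TracingContext) + Send + 'static) {
        self.pending.push_back(Box::new(unit));
    }

    /// Run spawned units until none remain; returns how many were executed.
    pub fn run_pending(&mut self) -> usize {
        let mut executed = 0;
        while let Some(unit) = self.pending.pop_front() {
            unit(&mut *self);
            executed += 1;
        }
        executed
    }
}

/// Stand-in for an off-heap struct that holds object references.
pub struct NativeStruct {
    pub refs: Vec<usize>,
}

/// Scan the in-heap fields directly, but offload each native struct to its own unit.
pub fn scan_object(
    ctx: &mut TracingContext,
    in_heap_fields: &mut [usize],
    native_parts: Vec<NativeStruct>,
) {
    for field in in_heap_fields.iter_mut() {
        *field = ctx.trace_object(*field);
    }
    for mut part in native_parts {
        ctx.spawn_unit(move |ctx| {
            for r in part.refs.iter_mut() {
                *r = ctx.trace_object(*r);
            }
        });
    }
}

pub fn demo() -> (Vec<usize>, usize, Vec<usize>) {
    let mut ctx = TracingContext::new();
    let mut fields = [1usize, 2];
    let parts = vec![NativeStruct { refs: vec![3, 4] }, NativeStruct { refs: vec![5] }];
    scan_object(&mut ctx, &mut fields, parts);
    let traced_by_scan = ctx.traced.clone();
    let units_run = ctx.run_pending();
    (traced_by_scan, units_run, ctx.traced)
}
```

The object scan itself stays small (two edges here), and the work for the off-heap parts is deferred into units that could each run on a different worker.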
Related issues
https://github.com/mmtk/mmtk-core/issues/710 raised the need to let the VM call trace_object directly. This issue drafts one possible implementation of it. One challenge discussed in https://github.com/mmtk/mmtk-core/issues/710 is limiting the scope of trace_object so that it can only be called at the right time (from TPinningClosure to VMRefClosure). The solution in this issue does not give an ObjectTracer to the VM binding directly, but only lends it to the binding while the work packet is executing.
This is one way to implement https://github.com/mmtk/mmtk-core/issues/710