mmtk / mmtk-core

Memory Management ToolKit
https://www.mmtk.io

Proposal: `Edge::update`, a single method to read-modify-write a slot (`Edge`) #1033

Open wks opened 1 year ago

wks commented 1 year ago

Note: Ruby currently doesn't require this change in order to work with slot (Edge) enqueuing. But V8 will need this change to efficiently forward references while preserving tags.

Proposal

I propose a new method for the Edge trait: Edge::update.

pub trait Edge: Copy + Send + Debug + PartialEq + Eq + Hash {
    fn update<F>(&self, updater: F)
    where F: FnOnce(ObjectReference) -> Option<ObjectReference>;
}

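The semantics of Edge::update, as the examples in this proposal illustrate, are: if the slot does not hold an object reference (for example, it holds null or a small integer), update does nothing and the updater is not called. Otherwise, update calls the updater with the object reference held in the slot. If the updater returns Some(new_object), the slot is updated to refer to new_object; if it returns None, the slot is left unchanged.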

The updater is usually implemented by ProcessEdgesWork::process_edge to call trace_object and forward the slot.

    fn process_edge(&mut self, slot: EdgeOf<Self>) {
        slot.update(|object| {
            debug_assert!(!object.is_null()); // If the updater is called, it is guaranteed not to be null.
            let new_object = self.trace_object(object);
            if Self::OVERWRITE_REFERENCE {
                Some(new_object) // Let the VM binding update the slot
            } else {
                None // Do not update the slot
            }
        });
    }

Rationale

Supporting tagged union of references and values

https://github.com/mmtk/mmtk-core/issues/626 described the need to support slots that hold non-ref values in addition to null (such as tagged values including small integers). By letting Edge::update decide whether to call the updater, the VM binding will have a chance to decode the tagged pointer, and choose not to call the updater (which calls trace_object) if the slot holds a small integer or special non-reference values such as true, false, nil, etc.

impl Edge for RubyEdge {
    fn update<F>(&self, updater: F)
    where F: FnOnce(ObjectReference) -> Option<ObjectReference> {
        let value = self.addr.load::<VALUE>();
        if !value.is_special_const() { // If it is not special values like small integers, true, false, nil, etc.
            if let Some(new_object) = updater(value.to_ref()) { // Call updater
                self.addr.store(new_object); // Update the slot
            }
        }
    }
}
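The behaviour described above can be tried outside mmtk-core with a small stand-alone sketch. Everything in it is hypothetical: ObjRef stands in for ObjectReference, TaggedSlot for a VM-specific Edge, and the encoding (zero is null, lowest bit set means small integer) is assumed for illustration only and is not Ruby's actual VALUE layout.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for mmtk-core's ObjectReference.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ObjRef(pub usize);

/// Hypothetical slot that holds either a reference or an immediate value.
/// Assumed encoding: 0 is null, lowest bit set means small integer.
pub struct TaggedSlot(pub Cell<usize>);

impl TaggedSlot {
    pub fn update<F>(&self, updater: F)
    where
        F: FnOnce(ObjRef) -> Option<ObjRef>,
    {
        let word = self.0.get();
        if word == 0 || word & 1 == 1 {
            return; // Null or small integer: the updater is never called.
        }
        if let Some(new_object) = updater(ObjRef(word)) {
            self.0.set(new_object.0); // The updater asked for the slot to be rewritten.
        }
    }
}

/// Runs `update` with an updater that pretends every object moved by 0x1000 bytes.
/// Returns the final slot contents and whether the updater was called.
pub fn forward_slot(initial: usize) -> (usize, bool) {
    let slot = TaggedSlot(Cell::new(initial));
    let mut called = false;
    slot.update(|object| {
        called = true;
        Some(ObjRef(object.0 + 0x1000))
    });
    (slot.0.get(), called)
}
```

With this sketch, a slot holding a reference is rewritten, while slots holding a small integer or null are left alone and the updater is skipped.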

Supporting object references with tags.

Ruby never stores object references together with tags. If a slot holds an object reference, its last three bits are zero, making the whole word a valid object reference.

If a VM stores an object reference together with a tag, then the VM needs to preserve the tag while updating the reference. Edge::update allows the VM binding to preserve the tag during the call.

impl Edge for SomeVMEdge {
    fn update<F>(&self, updater: F)
    where F: FnOnce(ObjectReference) -> Option<ObjectReference> {
        let value = self.addr.load::<VALUE>();
        let (tag, objref) = decode_tagged_value(value); // Decode the tagged pointer

        if let Some(new_object) = updater(objref) { // Pass the decoded reference, not the raw tagged word.
            let new_value = tag | new_object.as_usize(); // Re-apply the tag.
            self.addr.store(new_value);
        }
    }
}
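A self-contained sketch of the same idea, with no dependency on mmtk-core, may make the tag handling easier to check. TaggedRefSlot, TAG_MASK and the three-bit tag width are assumptions made for this example; references are modelled as plain usize addresses.

```rust
use std::cell::Cell;

/// Assumed for illustration: the low three bits of the word carry the tag.
const TAG_MASK: usize = 0b111;

/// Hypothetical slot whose word is `reference | tag`.
pub struct TaggedRefSlot(pub Cell<usize>);

impl TaggedRefSlot {
    pub fn update<F>(&self, updater: F)
    where
        F: FnOnce(usize) -> Option<usize>,
    {
        let word = self.0.get(); // The only load from the slot.
        let tag = word & TAG_MASK;
        let objref = word & !TAG_MASK;
        if objref == 0 {
            return; // No reference in the slot.
        }
        if let Some(new_object) = updater(objref) {
            debug_assert_eq!(new_object & TAG_MASK, 0);
            self.0.set(new_object | tag); // Re-apply the tag decoded above.
        }
    }
}

/// Applies an updater that ignores the old reference and returns `result`.
pub fn update_with(initial: usize, result: Option<usize>) -> usize {
    let slot = TaggedRefSlot(Cell::new(initial));
    slot.update(|_old| result);
    slot.0.get()
}
```

The updater only ever sees untagged references, and the tag survives whether or not the reference is forwarded.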

If the Edge trait only has the load() and store() methods, handling tagged references will be sub-optimal. The load() method can remove the tag and give mmtk-core only the object reference, but the store() method will have to load from the slot again to retrieve the tag.

impl Edge for SomeVMEdge {
    fn load(&self) -> ObjectReference {
        let value = self.addr.load::<VALUE>();
        let (_tag, objref) = decode_tagged_value(value); // Decode, but discard the tag.
        objref // Only return the object reference
    }
    fn store(&self, new_object: ObjectReference) {
        let old_value = self.addr.load::<VALUE>(); // Have to load again.
        let (old_tag, _old_objref) = decode_tagged_value(old_value); // Have to decode again.
        let new_value = old_tag | new_object.as_usize(); // Re-apply the tag.
        self.addr.store(new_value);
    }
}
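To make the extra load concrete, the following sketch counts how many times the slot is read under each approach. CountingSlot, its two-bit tag and the load counter are all hypothetical instrumentation for this example and do not exist in mmtk-core.

```rust
use std::cell::Cell;

const TAG_MASK: usize = 0b11; // Assumed two-bit tag.

/// Hypothetical slot that counts how often its word is read.
pub struct CountingSlot {
    word: Cell<usize>,
    loads: Cell<u32>,
}

impl CountingSlot {
    pub fn new(word: usize) -> Self {
        CountingSlot { word: Cell::new(word), loads: Cell::new(0) }
    }
    fn read(&self) -> usize {
        self.loads.set(self.loads.get() + 1);
        self.word.get()
    }
    /// load()/store() style: the tag is discarded, so store() must re-read it.
    pub fn load(&self) -> usize {
        self.read() & !TAG_MASK
    }
    pub fn store(&self, new_object: usize) {
        let old_tag = self.read() & TAG_MASK; // Second load.
        self.word.set(new_object | old_tag);
    }
    /// update() style: one load serves both decoding and re-tagging.
    pub fn update<F: FnOnce(usize) -> Option<usize>>(&self, updater: F) {
        let word = self.read();
        if let Some(new_object) = updater(word & !TAG_MASK) {
            self.word.set(new_object | (word & TAG_MASK));
        }
    }
}

/// Forwards a tagged reference with load() + store(); returns (final word, loads).
pub fn via_load_store() -> (usize, u32) {
    let slot = CountingSlot::new(0x1000 | 0b10);
    let old = slot.load();
    slot.store(old + 0x1000);
    (slot.word.get(), slot.loads.get())
}

/// Forwards the same tagged reference with update(); returns (final word, loads).
pub fn via_update() -> (usize, u32) {
    let slot = CountingSlot::new(0x1000 | 0b10);
    slot.update(|old| Some(old + 0x1000));
    (slot.word.get(), slot.loads.get())
}
```

Both paths leave the same tagged word in the slot, but the load()/store() path reads the slot twice and the update() path reads it once.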

Supporting slots with an offset

Similar to slots with a tag, the update method can re-apply the offset when storing. However, unlike references with tags, because the offset is usually known when scanning the object, it does not need to load from the slot again even if we are using load() and store() directly.

impl Edge for SomeVMEdge {
    fn update<F>(&self, updater: F)
    where F: FnOnce(ObjectReference) -> Option<ObjectReference> {
        let offsetted = self.addr.load::<usize>();
        let objref = offsetted - self.offset; // Compute the actual ObjectReference
        if let Some(new_object) = updater(objref) {
            let new_offsetted = new_object + self.offset; // Re-apply the offset.
            self.addr.store(new_offsetted);
        }
    }
}
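The offset case can also be sketched without mmtk-core. OffsetSlot is a hypothetical type, references are modelled as plain usize addresses, and a zero word is assumed to mean an empty slot.

```rust
use std::cell::Cell;

/// Hypothetical slot that stores `object address + offset`.
/// The offset is known to the slot itself, so it never needs to be re-read.
pub struct OffsetSlot {
    word: Cell<usize>,
    offset: usize,
}

impl OffsetSlot {
    pub fn update<F>(&self, updater: F)
    where
        F: FnOnce(usize) -> Option<usize>,
    {
        let offsetted = self.word.get();
        if offsetted == 0 {
            return; // Assumed: zero means the slot is empty.
        }
        let objref = offsetted - self.offset; // Recover the object's address.
        if let Some(new_object) = updater(objref) {
            self.word.set(new_object + self.offset); // Re-apply the offset.
        }
    }
}

/// Forwards the slot to `new_object`. Returns the reference the updater saw
/// (if it was called) and the final slot contents.
pub fn forward_offset_slot(stored: usize, offset: usize, new_object: usize) -> (Option<usize>, usize) {
    let slot = OffsetSlot { word: Cell::new(stored), offset };
    let mut seen = None;
    slot.update(|old| {
        seen = Some(old);
        Some(new_object)
    });
    (seen, slot.word.get())
}
```

The updater sees the address with the offset removed, and the stored word keeps pointing at the same interior position of the moved object.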

When do we need it?

The load() and store() methods are currently enough to support Ruby.

V8 may have a problem because according to https://v8.dev/blog/pointer-compression, if a slot holds a reference, the lowest bit will be 1, and the second lowest bit will indicate whether the reference is strong or weak. Currently the v8-support branch of mmtk/mmtk-core is hacked so that ProcessEdgesWork::process_edge removes the tag before calling trace_object. This makes mmtk-core specific to V8. See: https://github.com/mmtk/mmtk-core/blob/dc62d10625e5fc2e65f61d1469fc9a659af7d0d7/src/scheduler/gc_work.rs#L459-L469

    #[inline]
    fn process_edge(&mut self, slot: Address) {
        let object = unsafe { slot.load::<ObjectReference>() };
        let tag = object.to_address().as_usize() & 0b11usize;
        let object_untagged = unsafe {
            Address::from_usize(object.to_address().as_usize() & !0b11usize).to_object_reference()
        };
        let new_object = self.trace_object(object_untagged);
        if Self::OVERWRITE_REFERENCE {
            unsafe { slot.store((new_object.to_address().as_usize() & !0b11) | tag) };
        }
    }

wks commented 1 year ago

The LXR branch added several call sites of Edge::load() and Edge::store(). They also update slots, so they can be adapted to the update method as well.

There is one use case in the LXR branch that used compare_exchange to update the slot.

    fn process_remset_edge(&mut self, slot: EdgeOf<Self>, i: usize) {
        let object = slot.load();
        if object != self.refs[i] {
            return;
        }
        let new_object = self.trace_object(object);
        if Self::OVERWRITE_REFERENCE && new_object != object && !new_object.is_null() {
            if slot.to_address().is_mapped() {
                debug_assert!(self.remset_recorded_edges);
                // Don't do the store if the original is already overwritten
                let _ =
                    slot.compare_exchange(object, new_object, Ordering::SeqCst, Ordering::SeqCst);
            } else {
                slot.store(new_object);
            }
        }
        super::record_edge_for_validation(slot, new_object);
    }

The use of compare_exchange indicates that another thread may be mutating the edge at the same time. This suggests that we may need an atomic variant of the update method, with the option to skip the update if another thread updated the same slot concurrently.
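One possible shape for such an atomic variant is sketched below. This is only an illustration of the idea, not a proposed API: AtomicSlot and update_atomic are hypothetical names, references are modelled as plain usize addresses, and SeqCst ordering is used simply because the LXR snippet above uses it.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical slot that other threads may overwrite during GC.
pub struct AtomicSlot(pub AtomicUsize);

impl AtomicSlot {
    /// Like `update`, but the store is skipped if the slot no longer holds
    /// the value that was passed to the updater.
    /// Returns true only if this call rewrote the slot.
    pub fn update_atomic<F>(&self, updater: F) -> bool
    where
        F: FnOnce(usize) -> Option<usize>,
    {
        let old = self.0.load(Ordering::SeqCst);
        if old == 0 {
            return false; // Assumed: zero means no reference.
        }
        match updater(old) {
            Some(new) => self
                .0
                .compare_exchange(old, new, Ordering::SeqCst, Ordering::SeqCst)
                .is_ok(),
            None => false,
        }
    }
}

/// No concurrent writer: the forwarded reference is installed.
pub fn quiet_update() -> (bool, usize) {
    let slot = AtomicSlot(AtomicUsize::new(0x1000));
    let updated = slot.update_atomic(|old| Some(old + 0x1000));
    (updated, slot.0.load(Ordering::SeqCst))
}

/// Simulates a mutator overwriting the slot while the updater is running.
pub fn racing_update() -> (bool, usize) {
    let slot = AtomicSlot(AtomicUsize::new(0x1000));
    let updated = slot.update_atomic(|old| {
        slot.0.store(0x5000, Ordering::SeqCst); // The "other thread" wins the race.
        Some(old + 0x1000)
    });
    (updated, slot.0.load(Ordering::SeqCst))
}
```

In the racing case the compare_exchange fails, so the value written by the other party is kept, which matches the "don't do the store if the original is already overwritten" behaviour in process_remset_edge.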