Closed k-sareen closed 7 months ago
Again reinforcing that transitively pinning is still unsupported for the StickyImmix nursery GC (though it should be possible to support it). The main question is how to treat objects in the modbuffer.
@wks Can you try this PR out for Ruby to see if it allows objects to move during the nursery GC?
Again reinforcing that transitively pinning is still unsupported for the StickyImmix nursery GC (though it should be possible to support it). The main question is how to treat objects in the modbuffer.
@wks Can you try this PR out for Ruby to see if it allows objects to move during the nursery GC?
I confirmed that with this PR (and the suggested changes I mentioned in the last comment), Ruby is able to pin object during nursery GC, and the GC is also able to move objects during GC. I am not sure whether it works perfectly because the Ruby binding itself has not been fully tested against moving GC, yet.
Done @wks
But now with the fix, transitively pinning is actually unsupported (given the issue regarding modbuffer) but MMTk will allow a VM developer to proceed regardless. This can blowup in glorious ways.
But now with the fix, transitively pinning is actually unsupported (given the issue regarding modbuffer) but MMTk will allow a VM developer to proceed regardless. This can blowup in glorious ways.
I don't think it conflicts with transitive pinning. They are orthogonal.
The fields of the objects in the modbuf should be considered non-pinning roots (unless the VM has something like "potential pinning parents" like Ruby). If an object is added to the modbuf, it simply means the object has been modified before the current GC, and some of its children may point to young objects. Being in the modbuf doesn't mean the field of the object cannot be updated. But if the same young object is also reachable from pinning roots, the object will be pinned; if it's reachable from transitive pinning roots, it and its children will all be pinned.
I think the current implementation of ProcessModBuf
is fine:
if is_nursery_gc() {
// Scan objects in the modbuf and forward pointers
let modbuf = std::mem::take(&mut self.modbuf);
GCWork::do_work(
&mut ScanObjects::<E>::new(modbuf, false, WorkBucketStage::Closure), // Put into the Closure bucket
worker,
mmtk,
)
}
As long as the ScanObjects
work packet is put into the Closure
bucket, it will be executed after all the transitive pinning roots and their children are traced, and also after all non-transitive pinning roots are traced. It will still keep the children of objects in the modbuf alive, but whether they are pinned is determined by other (transitively pinning or non-transitively pinning) roots.
Hm. I see. It'd be better if Julia could actually test this PR out to ensure that nothing breaks since neither ART nor Ruby use the transitively pinning roots.
@qinsoon @wks The JikesRVM test failure was due to a timeout. Have we seen something like this before? I am not sure why this PR would cause a timeout.
@qinsoon @wks The JikesRVM test failure was due to a timeout. Have we seen something like this before? I am not sure why this PR would cause a timeout.
I think we have seen timeouts, but not frequently.
JikesRVM just seems like it's a flaky CI. This code doesn't even touch the GCs that JikesRVM supports. I don't know why we get the failing instruction not in RVM address error. Should I investigate further?
Idk why the ready-to-merge check timed out. I just re-ran it.
https://github.com/mmtk/mmtk-core/actions/runs/8657791364/job/23740512154 It looks like it timed out while running GenImmix. This is different from https://github.com/mmtk/mmtk-core/issues/1103 which is caused by a race condition in MarkSweep.
It may be an unknown bug.
Idk why the ready-to-merge check timed out. I just re-ran it.
@qinsoon I think the ready-to-merge check is broken when we have the binding tests. They're called "extended-tests-\
https://github.com/mmtk/mmtk-core/actions/runs/8657791364/job/23740512154 It looks like it timed out while running GenImmix. This is different from #1103 which is caused by a race condition in MarkSweep.
It may be an unknown bug.
Yikes. OpenJDK shouldn't time out since this should only change how the pinning/transitively pinning roots work. I'll investigate locally when I have the time.
OR This could be the result of the new change where we removed the coordinator. It would be ideal to be able to ssh to the machine and then inspect the deadlocked process.
OR This could be the result of the new change where we removed the coordinator. ...
Yes. This is possible. By removing the coordinator, the mechanism for triggering GC has changed. There could be some corner cases that are not handled properly, but has not been discovered, yet.
@qinsoon I'm not sure if the ready to merge check was fixed properly. It seems like it's still waiting on something
@qinsoon I'm not sure if the ready to merge check was fixed properly. It seems like it's still waiting on something
Not sure what is happening there. The problem of wrong ignored action names was fixed.
@qinsoon I'm not sure if the ready to merge check was fixed properly. It seems like it's still waiting on something
Yeah. It was actually a mistake. https://github.com/mmtk/mmtk-core/pull/1119 should fix it. The action name for OpenJDK was changed due to the reused workflow. So the merge check was actually waiting for the OpenJDk tests to finish (as it was not ignored).
This PR allows StickyImmix to properly deal with transitive pinning trace. Before this PR, StickyImmix may move objects in the transitive pinning trace, as it simply uses
GenNurseryProcessEdges
which allows moving. With this PR, for transitive pinning trace, StickyImmix usesGenNurseryProcessEdges<..., TRACE_KIND_TRANSITIVE_PIN>
. This PR fixes https://github.com/mmtk/mmtk-core/issues/1097.