Implements a heap-local candidate (a.k.a. suspect) set of objects that might contain a down-pointer. Each candidate has a bit in its object header, which is set by the write barrier when a down-pointer is created. This bit is then used to accelerate the read barrier for entanglement detection:
Read barrier fast path: if the object is not a candidate, then skip the entanglement check.
Read barrier slow path: if the object is a candidate, do a full entanglement check (requiring a call into the runtime and an SP-maintenance query, i.e., a computation-graph query).
The fast path is supported by the compiler: in ssa2-to-rssa, we generate the code for the fast path, avoiding a runtime call in the case of a non-candidate.
Whenever a heap becomes a leaf, we clear the candidates within that heap (because at this point, those objects are guaranteed to no longer have down-pointers).
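The mechanism above can be sketched as follows. This is an illustrative model only, not MPL's actual runtime code: the header layout, the choice of bit, and names such as `write_barrier_mark`, `read_barrier_entangled`, and `full_entanglement_check` are assumptions made for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object header: one bit reserved for the candidate
 * ("suspect") mark. The bit position is illustrative. */
#define CANDIDATE_BIT ((uint64_t)1 << 63)

typedef struct {
  uint64_t header;
} object_t;

/* Write barrier: when a down-pointer to obj is created, mark obj
 * as a candidate. */
static inline void write_barrier_mark(object_t *obj) {
  obj->header |= CANDIDATE_BIT;
}

static inline bool is_candidate(const object_t *obj) {
  return (obj->header & CANDIDATE_BIT) != 0;
}

/* Stub for the expensive slow path (in the real system this is a
 * call into the runtime that performs an SP-maintenance /
 * computation-graph query). */
static bool full_entanglement_check(const object_t *obj) {
  (void)obj;
  return true; /* conservatively report entanglement here */
}

/* Read barrier: the fast path is the compiler-inlined non-candidate
 * check; only candidates fall through to the runtime call. */
static inline bool read_barrier_entangled(const object_t *obj) {
  if (!is_candidate(obj))
    return false; /* fast path: skip the entanglement check */
  return full_entanglement_check(obj); /* slow path */
}

/* When a heap becomes a leaf, its objects can no longer have
 * down-pointers, so all candidate bits in that heap are cleared. */
static void clear_candidates(object_t *objs, size_t n) {
  for (size_t i = 0; i < n; i++)
    objs[i].header &= ~CANDIDATE_BIT;
}
```

In this model, the fast path is a single load, mask, and branch on the object header, which is why the compiler can inline it (in ssa2-to-rssa) and avoid the runtime call entirely for non-candidates.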
Performance Improvement
The fast path works incredibly well. Here are some results from our recent experiments, measuring the improvement it provides.
Running time improvements are as much as 4x at scale.
Notice also:
Number of graph queries (full entanglement checks) reduced to 0 in most cases.
Number of candidate marks is small.
Number of candidate marks = number of times an object was marked as a candidate by the write barrier.
This is an upper bound on the number of distinct candidate objects, since the same object can be marked multiple times.
Takeaway: in disentangled programs, down-pointers are either rare or highly consolidated (e.g. a single object could be responsible for all down-pointers).
Overall performance
Due to fast path improvements, the overall cost of entanglement detection is now essentially zero. Space overhead appears to be negligible across the board. In terms of time overhead, we measured approximately 1% on average across 23 benchmarks, and a max of 7%. A majority of benchmarks (18 out of 23) incur less than 2% time overhead.