The cycle GC takes about 10% of the runtime on our benchmark suite.
This is already far too high, and with NoGIL it will get much worse.
Currently, we perform an entire heap scan every N collections (currently N == 100).
Scanning the entire heap is bad for throughput and even worse for latency.
So instead of scanning the entire heap at once, we should scan the old generation incrementally.
Here's how it could work (this is untested and unproven):
- Two generations: `young` and `old`.
- Split the old generation into two spaces: `pending` and `scanned`.
- Perform alternating young and old collections.[*]
- Survivors of the young collection are moved to the end of the `scanned` section.

The old collection works as follows:
```python
objects_per_collection = 10_000  # This will need to adapt to the rate at which objects survive the young collection
if not pending:
    pending, scanned = scanned, pending
    if not pending:
        return
work = [pending.popleft()]
objects_per_collection -= 1
while True:
    for obj in work:  # Do not check for mutation, as we want to mutate work.
        for child in obj.tp_visit:
            if child in scanned:
                continue
            DECREF(child)
            if child not in work:
                work.append(child)
                objects_per_collection -= 1
    if not pending or objects_per_collection <= 0:
        break
    work.append(pending.popleft())
    objects_per_collection -= 1
reachable = clear_cycles_and_restore_refcounts(work)
scanned.extend(reachable)
```
`clear_cycles_and_restore_refcounts(work)` is already part of the cycle GC, so we aren't really adding much complexity.
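As a sanity check, the scan loop above can be modeled with ordinary deques. This is an illustrative toy only, not CPython code: `Obj.children` stands in for `tp_visit`, the refcount bookkeeping (`DECREF`, restoring refcounts) is omitted, and the clear-cycles step is replaced by simply keeping every visited object.

```python
from collections import deque

class Obj:
    """Toy heap object; `children` stands in for tp_visit."""
    def __init__(self):
        self.children = []

def incremental_collect(pending, scanned, budget):
    """Scan roughly `budget` objects from `pending`, moving everything
    transitively reached into `scanned`. Returns the updated spaces."""
    if not pending:
        pending, scanned = scanned, pending  # swap spaces, start a new cycle
        if not pending:
            return pending, scanned
    work = [pending.popleft()]
    budget -= 1
    while True:
        for obj in work:  # `work` grows while we iterate, deliberately
            for child in obj.children:
                if child in scanned or child in work:
                    continue
                if child in pending:
                    pending.remove(child)  # it is being scanned in this pass
                work.append(child)
                budget -= 1
        if not pending or budget <= 0:
            break
        work.append(pending.popleft())
        budget -= 1
    scanned.extend(work)  # a real GC would clear unreachable cycles first
    return pending, scanned

# Demo: a -> b -> c, all three start in the pending space.
a, b, c = Obj(), Obj(), Obj()
a.children.append(b)
b.children.append(c)
pending, scanned = incremental_collect(deque([a, b, c]), deque(), budget=2)
```

Note how the budget is a soft limit: once `a` is picked up, its transitive children are pulled into the same pass even though that overshoots `budget`, which is exactly the worst-case behaviour described below.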
The above algorithm has (I believe) two key properties:

- It is guaranteed to collect all cycles eventually.
- It visits no more objects than the current algorithm: in the worst case, where the entire object graph is reachable from the first object looked at, all objects are visited, but the current old-space collection does that anyway.
[*] Instead of doing an incremental collection every 2 collections, we could do it every N collections. If the incremental collections visit too many objects, i.e. much more than `work_per_collection`, we can increase N.
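The footnote's tuning rule might look something like the sketch below. The doubling factor, the "2x the budget" threshold for "too many objects", and the cap of 100 (matching the current full-heap interval) are all illustrative assumptions, not tuned values.

```python
def adjust_interval(n, objects_visited, work_per_collection):
    """Return the new number of collections between incremental old passes.

    Back off (scan the old space less often) when the last incremental
    pass visited far more objects than its work budget.
    """
    if objects_visited > 2 * work_per_collection:  # "much more than" the budget
        return min(n * 2, 100)  # never less often than the current N == 100
    return n

# e.g. a pass that visited 50k objects against a 10k budget doubles N.
```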