controller.reapAllVats() could reset reapCountdown

warner commented 10 months ago

What is the Problem Being Solved?

Vats are normally GCed (dispatch.bringOutYourDead()) by a per-vat timer named reapCountdown. This is initialized to reapInterval (set to 1000 on mainnet), and decremented for each delivery to the vat. When it hits zero, a BOYD is performed, and the counter is reset.

We recently landed controller.reapAllVats(), which schedules an immediate (well, at the start of the next controller.run() BOYD to all vats. However this doesn't change the "organic" BOYD timing: reapCountdown is not affected, so the vat will still do the usual BOYD when it hits zero.

For purposes of our benchmarks, we'd like to reduce the variation caused by organically-scheduled BOYDs. We can do a reapAllVats() just before starting the benchmark, and again just afterwards (so the object counts are accurate), but it would be less noisy if we could reduce the frequency of BOYDs in the middle. We should consider configuring the benchmark environment to use defaultReapInterval: 'never', however it would also make a certain amount of sense if reapAllVats() were to reset the counters back to reapInterval.

(Now I'm on the fence about whether this is a good idea.. using defaultReapInterval: 'never' seems best, and if we use that, then it doesn't matter whether we reset the counters. But I'll capture the idea for further discussion).

Description of the Design

If implemented, we would:

add vatKeeper.resetReapCountdown()
change kernel.reapAllVats() to call resetReapCountdown just after enqueuing the BOYD

Security Considerations

none

Scaling Considerations

none

Test Plan

Update the unit test to inspect the kvStore after calling reapAllVats(), to assert that the counters have all been reset.

Upgrade Considerations

none

cc @FUDCo

warner commented 10 months ago

It might also be useful to change reapAllVats to look at this countdown, and skip reaping the vats which haven't received any deliveries since the last one. We might then change the name to reapAllDirtyVats.

In my ghost-replay testing, I'm doing a cleanup pass before submitting the new operation (so I can tell what DB changes are a result of my injected action, versus GC consequences that were just waiting around for a BOYD), and I've found that I need to make two reapAllVats() calls to clean everything up. This is the usual "one hop per BOYD" issue, where vatA gets reaped before vatB, but vatB releases some things that were coming from vatA, and you need a BOYD on vatA to let the cleanup process terminate. (The worst case would be a linked list that travels through multiple vats: you could only free one link per reapAllVats()). Each reapAllVats() must reload all vats to delivery the BOYD, which requires a transcript replay, and the vat-warehouse LRU cache gets thrashed, now that we have more vats than its maxVats size.

Making reapAllVats() free for untouched vats would make it a lot cheaper to do several of them, both avoiding the replay and not knocking out the still-active vats.

warner commented 10 months ago

We might even want this to also trigger a heap snapshot for all vats (which, really, would mean removing the enqueue-BOYD step and just follow the "time to make a heap snapshot" path, since that path also does a BOYD). While testing large GC operations, I'm noticing a significant amount of time being spent replaying a vat which just completed a large BOYD. The unfortunate LRU-thrashing property of reapAllVats, combined with needing to do multiple ones (to handle multiple hops through the object graph), means:

we start the first reapAllVats()
- this triggers a large BOYD in v9, which is duly executed and added to the transcript
- we BOYD all the other vats, which knocks v9 out of the active set
we start the second reapAllVats()
- when this reaches v9 (to do the second BOYD), we have to page v9 back in
- so we re-execute the previous large BOYD before we can new-execute the second one

This results in duplicate work, some of which is super-expensive (millions of syscalls, 10 minutes of wallclock).

Of course, this would also be handled elegantly by @mhofman 's suggestion to schedule heap snapshots based on computrons, rather than simply counting deliveries. The large BOYD would trigger a snapshot. #6786

We should also change the snapshot trigger to look at (or remember) the last delivery, and refrain from a new BOYD if that was the last thing we did in that vat.

FUDCo commented 10 months ago

It might also be useful to change reapAllVats to look at this countdown, and skip reaping the vats which haven't received any deliveries since the last one. We might then change the name to reapAllDirtyVats.

This makes a lot of sense to me. I don't think we need to rename the operation, we just note that reaping vats that haven't had any activity in them since the last time goes really fast.

FUDCo commented 10 months ago

The large BOYD would trigger a snapshot

We'd have this weird situation where BOYD triggers a snapshot and snapshot triggers a BOYD, so it seems like it might mainly be a question of which goes first. Perhaps the alternate route is to reframe reapAllVats to directly initiate a snapshot rather than a BOYD, though it seems weird to initiate snapshots on vats that are currently swapped out because they got swapped out in a post-BOYD state to begin with. The observation is that it's the BOYD of other vats that causes stuff to be freed, in which world swapping a vat in just to immediate swap it out again wouldn't necessarily be so crazy, since if significant GC happens the second snapshot could be smaller.

The other thing that might be worth considering is having the BOYD-them-all logic to somehow be in cahoots with the cache, perhaps by using knowledge of what's currently swapped out vs. what's swapped in to tweak the order of sweeps so they don't interact with the cache so pathologically. I have memories of Alan Karp talking about big numerical calculations at IBM where MRU instead of LRU gave optimal cache benefit.

mhofman commented 10 months ago

It might also be useful to change reapAllVats to look at this countdown, and skip reaping the vats which haven't received any deliveries since the last one. We might then change the name to reapAllDirtyVats.

Yup I suggested something similar in https://github.com/Agoric/agoric-sdk/issues/8626#issuecomment-1847819729, and on slack but using the vatPos instead. reapCountdown would be a lot cleaner.

this would also be handled elegantly by @mhofman 's suggestion to schedule heap snapshots based on computrons, rather than simply counting deliveries. The large BOYD would trigger a snapshot. #6786

We should also change the snapshot trigger to look at (or remember) the last delivery, and refrain from a new BOYD if that was the last thing we did in that vat.

As I mention in that issue, I think we should only run BOYD and snapshotting together, triggered by the computrons of previous deliveries. As such the schedule would not count the cost of BOYD and snapshotting towards the interval, but I think that's fine. If anything, a costly BOYD should increase the interval at which you perform it (which could be captured as adding the computron cost of the last BYOD to the "next reap meter" instead of decreasing from it)

it seems weird to initiate snapshots on vats that are currently swapped out because they got swapped out in a post-BOYD state to begin with

I don't understand. If BOYD and snapshotting is an atomic operation, how can that happen?

Perhaps the alternate route is to reframe reapAllVats to directly initiate a snapshot rather than a BOYD

That doesn't seem right. Snapshotting is conceptually an internal detail of some types of workers, where BOYD is a concept applying to all vats. If anything snapshots should be an implicit effect of BOYD for those workers, and never explicitly performed by the controller.

Agoric / agoric-sdk