Agoric / agoric-sdk

monorepo for the Agoric Javascript smart contract platform
Apache License 2.0

break large GC syscalls up into smaller pieces #8389

Open warner opened 12 months ago

warner commented 12 months ago

What is the Problem Being Solved?

I've been investigating some performance issues on mainnet, and one possible consequence of a fix is that a BOYD might GC something like 50,000 imports all at once. That might be more than can fit comfortably inside a single syscall.

We'll need to simulate this to measure how long it would take and which limits might be hit. The most important limit is probably the netstring length limit on worker messages, which is about 10MB. The syscall.dropImports messages consume at most about 20 bytes per entry, so 50,000 entries comes to roughly a 1MB netstring, leaving a 10x safety margin.
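The margin estimate above can be checked with some quick arithmetic (the 20-bytes-per-entry and 10MB figures are the rough estimates from this comment, not measured constants):

```javascript
// Rough worst-case sizing for a single BOYD's dropImports traffic.
// All three constants are estimates from the discussion, not tuned values.
const NETSTRING_LIMIT = 10 * 1024 * 1024; // ~10MB worker-message limit
const BYTES_PER_VREF = 20; // approximate cost of one dropImports entry
const WORST_CASE_VREFS = 50_000; // imports GCed at once by the fix

const messageSize = WORST_CASE_VREFS * BYTES_PER_VREF; // ~1MB
const safetyMargin = NETSTRING_LIMIT / messageSize; // ~10x

console.log(`${messageSize} bytes, ~${Math.round(safetyMargin)}x margin`);
```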

But I don't think I ever considered really large numbers of vrefs passing through the GC code at once. There might be some algorithmic slowdowns (in particular, the way we slice off the next GC Action to perform might be O(N^2)). We might or might not be able to dodge these slowdowns by submitting the syscalls in smaller chunks. (Note that I think we still need to submit all the vrefs by the end of the BOYD, to avoid the risk of later confusion with vrefs being re-imported, which limits how much we can dodge.)

Description of the Design

In liveslots.js we currently deliver all the dropped vrefs in a single call: https://github.com/Agoric/agoric-sdk/blob/ef466d33286661655a3e296ef02f94fc95cf48ea/packages/swingset-liveslots/src/liveslots.js#L341-L343

I think we'd just break this up into groups of like 1k at a time. We'd do the same for all three calls.
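A minimal sketch of what that chunking might look like (the `CHUNK_SIZE` value, the `chunked` helper, and the `emitGCSyscalls` wrapper are hypothetical; the real change would live in liveslots.js around the linked lines):

```javascript
// Hypothetical sketch: emit GC syscalls in bounded chunks instead of
// one giant call. CHUNK_SIZE is illustrative, not a tuned value.
const CHUNK_SIZE = 1000;

function* chunked(items, size) {
  for (let i = 0; i < items.length; i += size) {
    yield items.slice(i, i + size);
  }
}

// All chunks are still emitted before the BOYD delivery ends, so the
// kernel sees every drop before any vref could be re-imported.
function emitGCSyscalls(syscall, { dropped, retiredImports, retiredExports }) {
  for (const chunk of chunked(dropped, CHUNK_SIZE)) {
    syscall.dropImports(chunk);
  }
  for (const chunk of chunked(retiredImports, CHUNK_SIZE)) {
    syscall.retireImports(chunk);
  }
  for (const chunk of chunked(retiredExports, CHUNK_SIZE)) {
    syscall.retireExports(chunk);
  }
}
```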

Security Considerations

None: the same work is being done, just spread out over multiple calls.

Scaling Considerations

We need to investigate how the kernel reacts to large numbers of vrefs being GCed at the same time, see whether there's a way to spread the work out, and look for algorithmic slowdowns.

Test Plan

Unit tests with a synthetic GC framework that creates more than 1k vrefs to drop at the same time, then examines the syscalls to make sure they are size-limited.
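The size-check portion of such a test might look roughly like this (the recorded-syscall shape and `checkSyscallSizes` helper are illustrative, not the actual test harness):

```javascript
// Illustrative shape of the planned assertion: record the syscalls made
// during a simulated BOYD and verify none carries more than LIMIT vrefs.
const LIMIT = 1000; // illustrative chunk limit

// recordedSyscalls: array of [syscallName, vrefsArray] pairs
function checkSyscallSizes(recordedSyscalls) {
  for (const [name, vrefs] of recordedSyscalls) {
    if (vrefs.length > LIMIT) {
      throw new Error(`${name} carried ${vrefs.length} vrefs, limit is ${LIMIT}`);
    }
  }
}
```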

Probably manual tests of the kernel GC code to investigate potential perf problems with large GC operations.

Upgrade Considerations

This fix will go into liveslots, which means it will only help vats that have been upgraded. Fortunately, the fixes I'm looking at for allowing vats to drop their existing references would require a vat upgrade anyway. So the main sequencing requirement is that the kernel be re-launched (re-bundling liveslots with this smaller-pieces fix) before we attempt to upgrade the vats (with the change that releases those references).

mhofman commented 12 months ago

Interesting, but somewhat in conflict with my wish to move workers to a model where there is a single result at the end of the delivery containing all "distributed object operations" (once we move vatStore in-process and make device calls async-only). Maybe we should consider a streaming result when we get there, or perhaps by then we'll have moved to parallelized workers already, removing any IPC costs.

warner commented 12 months ago

I think we can accommodate that; we just need to explicitly allow a large number of dropImport+retireImport items in those "distributed object operations". We're currently expressing them as syscalls, which are sent over messages, which can't get too big. We'll still be doing something with the operation results, but we need to deliver them in a container big enough to handle dropping a whole lot at once.

(maybe this will turn into a protocol where the worker knows its current references, and the worker+kernel go through some sort of interactive reconciliation process to let the kernel know what can be dropped: we might allow that process to take multiple steps, limiting the sizes of the messages it uses. Like, we might somehow incorporate both after-the-fact "you've just upgraded, what are you still using?" things, and cycle-breaking "are you only holding onto X because I'm holding onto Y?" questions, into a better protocol, and we should design that protocol to handle large changes without size-limit problems)

mhofman commented 12 months ago

That GC protocol is orthogonal and partially described in https://github.com/Agoric/agoric-sdk/issues/6793

But yeah, in general I want an async "emit event" from vat to kernel that doesn't need acking by the kernel. Then how these events are sent doesn't matter, whether streamed during execution or bundled up at result time.

warner commented 12 months ago

FWIW, the syscall.dropImports()/etc messages don't have responses. However the kernel must receive them promptly, and update the c-lists and refcounts before doing anything else, in particular before another crank happens (which might re-introduce the recently-deleted/decremented entries). If that part becomes async, we'll need some more sophisticated protocols that can handle overlapping changes, like the ones we have in comms (which of course cannot do synchronous processing of remote messages). One form of this is to send a decref instead of a drop, and keep counters on both sides.
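The decref-with-counters idea can be sketched roughly as follows (the `makeRefcounter` name and shape are hypothetical, loosely modeled on the comms-style counting mentioned above, not an existing kernel API):

```javascript
// Hypothetical sketch of decref-based GC: instead of a single binary
// "drop" message, each side keeps a per-vref counter. Decrefs can then
// overlap with re-introductions: the holder only frees a vref once its
// count actually reaches zero.
function makeRefcounter() {
  const counts = new Map();
  return {
    // Called when a vref is (re-)introduced, e.g. appears in a delivery.
    incref(vref, n = 1) {
      counts.set(vref, (counts.get(vref) || 0) + n);
    },
    // Called when a decref message arrives. Returns true when the count
    // reaches zero and the entry can be deleted (e.g. from the c-list).
    decref(vref, n = 1) {
      const c = (counts.get(vref) || 0) - n;
      if (c > 0) {
        counts.set(vref, c);
        return false;
      }
      counts.delete(vref);
      return true;
    },
    count(vref) {
      return counts.get(vref) || 0;
    },
  };
}
```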

mhofman commented 12 months ago

By "async" I mean the vat doesn't need to wait for an ack from the syscall, just as it shouldn't have to for any other "distributed object operation". I very much want to avoid the effects of a delivery appearing outside of the delivery window.