Open rowgraus opened 2 years ago
@dtribble Is this needed for Mainnet-1?
@dtribble is this a MN-1 thing?
@warner For proper project planning and tracking, this needs an area label covered by one of our weekly planning meetings. Please pick the appropriate one from: agd, agoric-cli, agoric-cosmos, amm, core economy, cosmic-swingset, endo, ertp, getrun, governance, installation-bundling, metering, oracle, pegasus, run-protocol, ses, staking, swingset, swingset-runner, tc39, token economy, tooling, ui, wallet, xsnap, zoe, zoe contract
next step: have a meeting to figure out a design
@rowgraus can we make this a regular issue? I believe I don't have the rights to do that.
@mhofman I was able to convert it to a regular issue.
We discussed this in the kernel meeting today. The summary of the discussion:
In a start-after-halt scenario, we'll need the swingset low-priority handling to start disabled.
We'll also likely want any low-priority messages to be rejected at the cosmos layer until the kernel is ready to process low-priority messages again, which argues for an explicit lever for that mechanism.
An economic contract that relies on price oracles could decide it is ready once it has received a second price update: the first update it acknowledges may have been stale, and by then the oracle will have sent an up-to-date quote.
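The "ready after the second price update" rule from the summary above can be sketched in a few lines. This is only an illustration: `makeReadinessTracker`, `recordPriceUpdate`, and `isReady` are hypothetical names, not part of any existing contract or oracle API, and the threshold of two updates is taken directly from the discussion.

```javascript
// Hypothetical sketch: names here are illustrative, not an existing Agoric API.
// A contract treats itself as "ready" once it has seen `requiredUpdates`
// price updates since restart, because the first one may be stale.
function makeReadinessTracker(requiredUpdates = 2) {
  let seen = 0; // number of price updates received since restart
  return {
    // Record one price update; returns whether the contract is now ready.
    recordPriceUpdate() {
      seen += 1;
      return seen >= requiredUpdates;
    },
    isReady() {
      return seen >= requiredUpdates;
    },
  };
}
```

With the default threshold, the first call to `recordPriceUpdate()` leaves the contract not ready, and the second one flips it to ready.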
@mhofman Can we remove the in-design label from this one?
What this means to me is:
The fix we talked about was to do a "soft restart", in which the chain is told that it is restarting, and for the first minute or so, it does not accept any messages other than economy-critical price-oracle signals.
We'd implement this with the #5334 backpressure mechanism, which controls ingress at the mempool/txn level, to exclude non-oracle-signed transactions from blocks during the restart window, plus some code in the new version that knows when this window starts and ends. If the chain halted just after block 100, such that the next block executed will be 101, then our replacement/upgraded software should have something in `cosmic-swingset` that does:

to give roughly 60 seconds for the economic engine to get prepared for user requests. We'd also need to ensure that the oracle price signals, etc. can be delivered during that window, even if user requests are flooding the RPC servers.
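As a rough illustration of the block-height window described above, here is a minimal sketch. Everything in it is an assumption made for the example: the 6-second block time, the helper name `makeSoftStartFilter`, and the `isOracleTx` predicate are hypothetical and do not come from `cosmic-swingset` or #5334.

```javascript
// Hypothetical sketch: none of these names exist in cosmic-swingset.
const ASSUMED_BLOCK_TIME_SECONDS = 6; // assumption, not a measured chain parameter
const SOFT_START_SECONDS = 60; // "roughly 60 seconds" from the discussion

// Number of blocks the soft-start window lasts, rounded up.
const softStartBlocks = Math.ceil(SOFT_START_SECONDS / ASSUMED_BLOCK_TIME_SECONDS);

// Build an admission predicate for transactions.
// `restartHeight` is the first block executed by the upgraded software
// (101 in the example above); `isOracleTx` decides whether a transaction
// is an oracle price signal.
function makeSoftStartFilter(restartHeight, isOracleTx) {
  // First block height at which normal admission resumes.
  const windowEnd = restartHeight + softStartBlocks;
  return function admit(blockHeight, tx) {
    const inWindow = blockHeight >= restartHeight && blockHeight < windowEnd;
    // Inside the window only oracle transactions pass; outside it, everything does.
    return !inWindow || isOracleTx(tx);
  };
}
```

Under these assumptions the window covers blocks 101 through 110, and ordinary user transactions are admitted again from block 111 onward.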
We might consider making this more explicit: let the vats that manage vaults give a signal when they believe they're up to date, and disable non-economic messages until that point. That might mean control over the non-economic admissibility should be made available to userspace, which would be... exciting. It would also need a way for the `cosmic-swingset` layer to signal to those economy vats that we'd entered soft-start mode, and that the vats are responsible for exiting it when they're ready.

@rowgraus points out that this feature could easily consume more effort than it warrants, and/or could expose more of an attack surface than it addresses, and I agree. I think we'll need to invoke our economist friends for advice too.
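For discussion purposes, the explicit-signal variant (economy vats report when they are up to date, and non-economic messages stay blocked until then) might look something like the sketch below. It is not a proposed API: `makeSoftStartController`, `reportReady`, `admits`, and the `economic` flag on messages are all hypothetical, and the sketch deliberately ignores how such a signal would be authorized.

```javascript
// Hypothetical sketch; not an Agoric or SwingSet API.
function makeSoftStartController(economyVatIds) {
  // Vats that have not yet reported that they are up to date.
  const pending = new Set(economyVatIds);
  let softStart = pending.size > 0;
  return {
    inSoftStart: () => softStart,
    // Called on behalf of an economy vat when it believes it is ready.
    // Returns whether the chain is still in soft-start mode afterwards.
    reportReady(vatId) {
      pending.delete(vatId);
      if (pending.size === 0) {
        softStart = false;
      }
      return softStart;
    },
    // Admission check: economic messages always pass;
    // everything else passes only once soft start has ended.
    admits(message) {
      return !softStart || message.economic === true;
    },
  };
}
```

Note that `reportReady` is exactly the kind of userspace-reachable control that raises the attack-surface concern above, so who may call it would need careful design.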