Open mhofman opened 4 months ago
We have a lot of checks to guard against agoric-sdk mismatches already. This is a nice-to-have but is probably only going to catch seriously weird things, so we're assigning a low priority.
It may overlap with the worker-v1 ideas, where the tagged/released/NPM-uploaded versions of the supervisor package should have a 1:1 relationship with the bundle contents and hash.
What is the Problem Being Solved?
When upgrading chain software, validators re-install and rebuild the SDK. If any step does not go as expected, we should prevent the chain software from starting rather than let it fail later. See https://github.com/Agoric/agoric-sdk/issues/8471
One place where we lack this early check is the supervisor and lockdown bundles. The bundles are only sampled from source when a vat is created or upgraded, at which point they are inserted into the DB. If the bundle is not the expected one, that insertion (and the resulting vat heap snapshot) causes the state to diverge. This divergence happens after a commit and is very expensive to recover from (it requires restoring from a previous snapshot).
Description of the Design
Include the expected hash of each bundle in the source tree, and add a check that verifies the current bundle's hash matches the expected value when the chain software starts.
This would result in a package version bump for every transitive dependency change that affects the bundle, which is a benefit (hash change without version change is somewhat surprising).
It would make dependency changes in agoric-sdk more costly, as it is one more thing in the source that needs to be updated (we already have some test output snapshots that change in these cases). This can be addressed with maintainer instructions and scripts.
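As a rough illustration of the startup check described in this section, here is a minimal TypeScript sketch. Everything named here is hypothetical: the `ExpectedBundleHashes` record, the `hashBundle` and `assertBundlesMatch` helpers, and the choice of SHA-256 over the serialized bundle are assumptions for illustration, not the existing agoric-sdk API.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical record committed to the source tree: bundle name -> hex digest.
type ExpectedBundleHashes = Record<string, string>;

// Assumption: the hash is SHA-256 over the serialized bundle contents.
function hashBundle(bundleBytes: Uint8Array | string): string {
  return createHash('sha256').update(bundleBytes).digest('hex');
}

// Throws before the chain software starts if any bundle is missing or
// does not match the hash recorded in source control.
function assertBundlesMatch(
  expected: ExpectedBundleHashes,
  actual: Record<string, Uint8Array | string>,
): void {
  const mismatches: string[] = [];
  for (const [name, expectedHash] of Object.entries(expected)) {
    const bytes = actual[name];
    if (bytes === undefined) {
      mismatches.push(`${name}: bundle not found`);
      continue;
    }
    const actualHash = hashBundle(bytes);
    if (actualHash !== expectedHash) {
      mismatches.push(`${name}: expected ${expectedHash}, got ${actualHash}`);
    }
  }
  if (mismatches.length > 0) {
    throw new Error(`bundle hash mismatch, refusing to start:\n${mismatches.join('\n')}`);
  }
}
```

Collecting every mismatch before throwing means a validator sees all out-of-sync bundles in one error rather than fixing them one at a time.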
Security Considerations
None
Scaling Considerations
None
Test Plan
While existing integration tests would trigger this new check, we need a new targeted test that verifies the built bundle matches the hash in source control, in order to raise a clear and early error when the hash gets out of sync.
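A sketch of how the targeted test and the accompanying maintainer script might share logic is shown below. The names `HashRecord`, `computeHashRecord`, and `findHashDrift` are hypothetical, and SHA-256 over the serialized bundle is an assumption. The idea is that a maintainer script regenerates the committed record, while the test rebuilds the bundles and reports any drift with a message that names the affected bundle.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical committed record: bundle name -> hex digest.
type HashRecord = Record<string, string>;

const sha256Hex = (bytes: Uint8Array | string): string =>
  createHash('sha256').update(bytes).digest('hex');

// Maintainer-script side: regenerate the record to commit after a
// dependency change alters a bundle.
function computeHashRecord(
  built: Record<string, Uint8Array | string>,
): HashRecord {
  const record: HashRecord = {};
  for (const [name, bytes] of Object.entries(built)) {
    record[name] = sha256Hex(bytes);
  }
  return record;
}

// Test side: return one message per bundle whose freshly built hash differs
// from the committed one (including bundles present on only one side).
function findHashDrift(
  committed: HashRecord,
  built: Record<string, Uint8Array | string>,
): string[] {
  const fresh = computeHashRecord(built);
  const names = Array.from(
    new Set([...Object.keys(committed), ...Object.keys(fresh)]),
  ).sort();
  const drift: string[] = [];
  for (const name of names) {
    if (committed[name] !== fresh[name]) {
      drift.push(
        `${name}: committed ${committed[name] ?? '(none)'}, built ${fresh[name] ?? '(none)'}`,
      );
    }
  }
  return drift;
}
```

A test would then assert that `findHashDrift(committedRecord, builtBundles)` returns an empty array, and print the returned messages on failure so the error points directly at the bundle whose hash needs updating.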
Upgrade Considerations
This adds a new check to prevent a misapplied upgrade.