Open conorsch opened 2 months ago
There's a spike on an overhaul of the smoke-test logic here https://github.com/penumbra-zone/penumbra/pull/4324 which provides a nice shape to extend into migration-testing.
We'll need to be careful about endpoint compatibility: we cannot run an older version of pd and run the most recent smoke tests against it, because the view server implementations will not be compatible.
Can you elaborate on this a little? Isn't our expectation that clients should work across upgrade boundaries?
If you try to run a client from current main against a public testnet endpoint, you'll see an incompatibility message related to ongoing auction work:
❯ git rev-parse HEAD
7854a5fc561e2e3f514421c3ea97c80cea5a673e
❯ cargo run -q --release --bin pcli -- view sync
Error: proto response missing auction params
Those same dependencies carry over into the integration tests.
Pushed a draft PR with a spike on local testing of the migration logic, which can be promoted to a CI job once it's solid. I was surprised by a proto incompatibility error that may be spurious, so I'm going to run through the upgrade process manually to sanity-check that the scripting order is sound.
@conorsch https://github.com/penumbra-zone/penumbra/pull/4339 will simplify things a great deal
Got it. I was assuming we would run the smoke test script from the original tag and then, post-upgrade, run the smoke test script from the new HEAD.
Made substantial progress on this front. There are notably two types of testing going on:
The former is potentially suitable for per-PR runs, although so far the runtime is quite long: ~20m or so. We recently shaved a ton of per-PR CI runtime off with #4324, so it'd be a shame to knock it back up again, but for assurance it'd be worth it. This type of testing is great for catching problems like #4430.
The latter case is more intensive, and isn't yet automated end-to-end. Because its setup reuses the same architecture as the public testnet, it's able to catch more subtle bugs, like #4443. For now, I'll continue to use this setup as part of pre-release QA.
Is your feature request related to a problem? Please describe. When preparing a chain upgrade, manual testing of upgrades is an arduous process.
Describe the solution you'd like We should have an integration test that runs a devnet based on the currently-active testnet, via the most recently released tag, runs smoke tests against it to generate txs, then stops the network, runs the migration, restarts the network, and reruns the smoke tests.
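To make the proposed ordering concrete, here is a minimal Python sketch of a phase runner. It is not the project's actual test harness: the phase names and the `run_upgrade_test` function are hypothetical, and each step is supplied by the caller as a plain callable, so no real `pd` or `pcli` invocations are assumed.

```python
from typing import Callable, Dict, List

# Hypothetical phase labels mirroring the steps described above, in order.
PHASES = [
    "start_devnet_from_release_tag",
    "run_smoke_tests_pre_upgrade",
    "stop_network",
    "run_migration",
    "restart_network",
    "run_smoke_tests_post_upgrade",
]


def run_upgrade_test(steps: Dict[str, Callable[[], None]]) -> List[str]:
    """Run every phase in order and return the list of completed phases.

    `steps` maps each phase label to a caller-supplied callable. The first
    phase that raises aborts the run, so later phases never execute against
    a network in an unknown state.
    """
    missing = [phase for phase in PHASES if phase not in steps]
    if missing:
        raise KeyError(f"no step provided for: {missing}")

    completed: List[str] = []
    for phase in PHASES:
        try:
            steps[phase]()
        except Exception as exc:
            # Name the failing phase so CI logs point at the right step.
            raise RuntimeError(f"phase {phase!r} failed") from exc
        completed.append(phase)
    return completed
```

In a real job each callable would shell out to the binaries from the relevant tag; keeping the ordering separate from the commands makes the sequence itself easy to check.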
Describe alternatives you've considered Alternatives are manual testing, which is both slow and error-prone. Longer-term we want a capable "sudo mode" for testing upgrades, which is tracked in #4265.
Additional context We'll need to be careful about endpoint compatibility: we cannot run an older version of pd and run the most recent smoke tests against it, because the view server implementations will not be compatible. We can sidestep this by running the smoke tests from the tagged release. Ideally, we'd be able to swap out the path to binaries within the tests via an env var, to make it a bit easier to test prior versions without rebuilding from source every time.
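One way the env-var idea could look, as a rough Python sketch rather than anything that exists in the repo today: the variable name `PENUMBRA_BIN_DIR` and the helper `resolve_binary` are assumptions made up for illustration. If the variable is set, the binary must come from that directory; otherwise the helper falls back to whatever is on `PATH`.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def resolve_binary(name: str, env_var: str = "PENUMBRA_BIN_DIR") -> Optional[str]:
    """Return the path of `name`, preferring the directory named by `env_var`.

    `PENUMBRA_BIN_DIR` is a hypothetical variable name. When it is set, the
    binary must exist there and be executable; silently falling back to PATH
    would risk testing the wrong version without noticing.
    """
    bin_dir = os.environ.get(env_var)
    if bin_dir:
        candidate = Path(bin_dir) / name
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return str(candidate)
        raise FileNotFoundError(f"{name} not found or not executable in {bin_dir}")
    # No override: use whatever is on PATH (None if it is not installed).
    return shutil.which(name)
```

A test could then call `resolve_binary("pd")` and `resolve_binary("pcli")` once at startup, so pointing the suite at prebuilt binaries from a tagged release needs only one exported variable.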