Closed evan-forbes closed 3 months ago
the goal of this test is to ensure that a given branch is capable of syncing from scratch. In theory, we could do this with just unit tests, but there are so many weird edge cases that we don't yet know what all of them are.
for example, we could execute one transaction per block, for every possible transaction type with the v1 binary, save those roots somewhere, and then compare the roots against whatever the result is of the other branch.
we could maybe also run some deterministic fuzzing script (same inputs for each version) on each version, and save those results. The branch under test would have to produce identical ProcessProposal results.
Just noting the OP proposes two distinct tests so we could split this into two issues:
IMO the first one is more valuable than the second because I expect we'll be able to run the first one in CI and make it a required check prior to PR merge. I expect the second test will take much longer to run and may only be able to run via the nightly (or weekly 😨) CI jobs. That means we'll get delayed signal via the second test. If we're committing to single binary syncs then IMO we still need the second test. But ideally the first test gives us confidence to merge PRs and the second test gives us confidence that new release candidates still preserve the single binary feature.
> for example, we could execute one transaction per block, for every possible transaction type with the v1 binary, save those roots somewhere, and then compare the roots against whatever the result is of the other branch.
I really like this idea.
We discussed this in a sync and came to the conclusion that the first test:

- Create a test that compares the results of processing and executing txs on the main branch for each app version.

is a nice to have, while the second test:

- Create a significantly longer test that syncs mainnet from scratch using a block data archive node.

is release blocking, but it can be performed manually via the mainnet.sh script.
Evan and I paired on a prototype for the first test: https://github.com/celestiaorg/celestia-app/tree/rp/pair-with-evan
If we take the route of expanding the existing knuu infrastructure, we could run this test directly after a PR is merged instead of nightly.
Spun out https://github.com/celestiaorg/celestia-app/issues/3490 so that this issue can focus on the first test in the OP:
> Create a test that compares the results of processing and executing txs on the main branch for each app version.
Notes:
Ideas:
My take on this: we already have a script for syncing a node from genesis, which we should run before every release. If we want it to be faster, we can run it on a beefy machine and spin up the new node on the same machine (to skip the p2p overhead). Alternatively, if we wanted this to be a unit test, we could persist only the block and state stores, run the application over the set of transactions (kind of like a local block sync), and compare the app hashes.
The second thing I would advocate for is adding the main docker image to TestMinorVersionCompatibility. That way we have main (using v1) and all our other v1 tags working together. If we feel txsim is not doing a sufficient job, we can expand the variety of messages it sends.
When we upgrade to v2, we will need to modify TestMinorVersionCompatibility to move from the v1 state machine to the v2 state machine for future v2 minor releases. We could create more elaborate tests, but I want to review our single binary sync approach before continuing.
Description
Write a non-determinism unit test for the state machine. The test will involve creating a set of transactions, executing them, and comparing the resulting app hash with an expected value.
Acceptance Criteria