EspressoSystems / espresso-sequencer


Attempt to improve runtime of slow tests #1865

Open tbro opened 3 months ago

tbro commented 3 months ago

Review the tests identified as slow, and see if there is a way to lower their runtime.

You can see which ones are slow in CI. For example: https://github.com/EspressoSystems/espresso-sequencer/actions/runs/10387105099/job/28759858781

alxiong commented 1 month ago

A related discussion on Zulip about improving the readability and reliability of the slow_dev_node_multiple_lc_providers_test() function:

Suggested change

I'd update this part of the logic, add heavy code comments, and modify it like this:

            for AltChainInfo {
                provider_url,
                light_client_address,
                chain_id,
                ..
            } in dev_info.alt_chains
            {
                tracing::info!("checking hotshot commitment for {chain_id}");

                let signer = init_signer(&provider_url, TEST_MNEMONIC, 0).await.unwrap();
                let light_client = LightClient::new(light_client_address, Arc::new(signer.clone()));

                // The light client provers are running and posting updates via `newFinalizedState()` to the light client contract;
                // the next call ensures those updates are accessible in a sliding window of historical HotShot blocks.
                while light_client
                    .get_hot_shot_commitment(U256::from(1))
                    .call()
                    .await
                    .is_err()
                {
                    tracing::info!("waiting for commitment");
                    sleep(Duration::from_secs(3)).await;
                }

                let liveness_failure_height = signer.get_block_number().await.unwrap().as_u64();
                let (_, l1_height_of_last_hotshot_block) = light_client
                    .state_history_commitments(light_client.get_state_history_count().await? - 1)
                    .await?;
                // *Simulate* a HotShot liveness failure by toggling the flag in the mock light client contract.
                // Under the hood, both L1 and HotShot keep progressing: `stateHistoryCommitments` in the contract
                // is still appended with new HotShot block commitments and new L1 block heights,
                // BUT `lag_over_escape_hatch_threshold()` will compute against the frozen `l1_height_of_last_hotshot_block`.
                dev_node_client
                    .post::<()>("api/set-hotshot-down")
                    .body_json(&SetHotshotDownReqBody {
                        chain_id: Some(chain_id),
                        height: liveness_failure_height,
                    })
                    .unwrap()
                    .send()
                    .await
                    .unwrap();
                // sanity check
                assert!(liveness_failure_height >= l1_height_of_last_hotshot_block);
                assert!(
                    !light_client
                        .lag_over_escape_hatch_threshold(
                            U256::from(liveness_failure_height + 1),
                            U256::from(
                                liveness_failure_height - l1_height_of_last_hotshot_block
                            ),
                        )
                        .call()
                        .await?
                );
                assert!(
                    light_client
                        .lag_over_escape_hatch_threshold(
                            U256::from(liveness_failure_height),
                            U256::from(liveness_failure_height - l1_height_of_last_hotshot_block),
                        )
                        .call()
                        .await?
                );

                // To detect that HotShot is down, we test that L1 made progress but the light client contract didn't.
                // The while-loop condition evaluates to false once the L1 height increases beyond `liveness_failure_height`.
                // TODO: maybe send a dummy tx to artificially increase the L1 block height here; otherwise we are waiting
                // for the light client prover to generate proofs, which takes 2 min in CI.
                while !light_client
                    .lag_over_escape_hatch_threshold(
                        U256::from(signer.get_block_number().await?), // current L1 block height
                        U256::from(liveness_failure_height - l1_height_of_last_hotshot_block),
                    )
                    .call()
                    .await
                    .unwrap_or(false)
                {
                    tracing::info!("waiting for setting hotshot down");
                    sleep(Duration::from_secs(3)).await;
                }

                // *Simulate* HotShot regaining liveness by toggling the flag in the mock light client contract.
                // During all the steps above, `newFinalizedState` updates kept arriving, because the failure is only simulated.
                dev_node_client
                    .post::<()>("api/set-hotshot-up")
                    .body_json(&SetHotshotUpReqBody { chain_id })
                    .unwrap()
                    .send()
                    .await
                    .unwrap();

                // Detect HotShot restoring liveness by ensuring that the gap between `liveness_failure_height` and
                // the (back online, thus updated) `l1_height_of_last_hotshot_block` has decreased.
                // Note that in the sanity check above, right after we shut HotShot down, the same expression as in this
                // while-loop condition evaluates to true; only once the light client gets new updates will
                // `lagOver()` return false.
                while light_client
                    .lag_over_escape_hatch_threshold(
                        U256::from(liveness_failure_height),
                        U256::from(liveness_failure_height - l1_height_of_last_hotshot_block),
                    )
                    .call()
                    .await
                    .unwrap_or(true)
                {
                    tracing::info!("waiting for setting hotshot up");
                    sleep(Duration::from_secs(3)).await;
                }
            }
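A side note on the three `while ... sleep(3s)` loops above: none of them has an upper bound, so if the contract never reaches the expected state the test hangs until CI kills the job. Below is a minimal sketch of a bounded polling helper. The name `poll_until` and its signature are hypothetical and not part of the codebase, and it is written synchronously with only the standard library so that it stands alone; in the actual async test the same shape would use the async `sleep` already present in the snippet.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper (not from the repository): call `check` every
/// `interval` until it returns true or until `timeout` has elapsed.
/// Returns the number of attempts on success, or the elapsed time on timeout.
fn poll_until<F>(mut check: F, interval: Duration, timeout: Duration) -> Result<u32, Duration>
where
    F: FnMut() -> bool,
{
    let start = Instant::now();
    let mut attempts = 0u32;
    loop {
        attempts += 1;
        if check() {
            return Ok(attempts);
        }
        // Give up with an error instead of spinning forever.
        if start.elapsed() >= timeout {
            return Err(start.elapsed());
        }
        sleep(interval);
    }
}
```

With something like this, a stuck condition fails fast with an explicit error, which also makes it easier to see in CI which of the waits is the slow one.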

Then modify LightClientMock.sol:

function setHotShotDown() public {
    hotShotDown = true;
    frozenL1Height = stateHistoryCommitments[stateHistoryCommitments.length - 1].l1BlockHeight;
}

^^ I have made minor changes to the suggested snippet based on @alysiahuggins's point about a potential underflow in my original post on Zulip. But again, this code has not been tested and will probably need at least some small tweaks.
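On the underflow point: the thread does not say which subtraction was meant, but the snippet has two candidates, `get_state_history_count() - 1` and `liveness_failure_height - l1_height_of_last_hotshot_block`. A minimal sketch of making both explicit with checked arithmetic follows. The helper names are hypothetical, and plain `u64` is used here in place of the contract's `U256` values to keep the example self-contained.

```rust
/// Hypothetical helper: index of the most recent entry in the state history,
/// or `None` if the history is still empty (where `count - 1` would underflow).
fn last_history_index(count: u64) -> Option<u64> {
    count.checked_sub(1)
}

/// Hypothetical helper: the threshold passed to the lag check, or `None` if
/// the last recorded L1 height is already past the failure height.
fn escape_hatch_threshold(
    liveness_failure_height: u64,
    l1_height_of_last_hotshot_block: u64,
) -> Option<u64> {
    liveness_failure_height.checked_sub(l1_height_of_last_hotshot_block)
}
```

Returning `Option` forces the test to decide what an empty history or an inverted height pair means, instead of panicking or wrapping silently.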

cc @imabdulbasit @ImJeremyHe