EspressoSystems / HotShot

http://hotshot.docs.espressosys.com/

[CX-Marketplace] - Develop Mock Solver API for Testing Suite #3412

Open jparr721 opened 3 days ago

jparr721 commented 3 days ago

Closes #3373

This PR:

HotShot needs an API that mimics the behavior of the Solver in a controllable way. Specifically, we need to construct payloads that simulate what the Solver might return, as well as special payloads that fail, are too large, or never show up at all. This enables more resilient testing across a range of situations.

This PR introduces such a construct and devises a way to build the type so that it can be used in both logical tests and integration tests. Integration tests call the endpoint, whereas logical tests can construct any payload they like to guarantee that specific code paths are hit:

/// The test auction results type is used to mimic the results from the Solver.
#[derive(Clone, Debug, Default)]
pub struct TestAuctionResultsProvider {
    /// We intentionally allow for the results to be pre-cooked for the unit test to guarantee a
    /// particular outcome is met.
    pub solver_results: Vec<TestAuctionResult>,

    /// A canned type to ensure that an error is thrown in absence of a true fault-injectable
    /// system for logical tests. This will guarantee that `fetch_auction_result` always throws an
    /// error.
    pub should_return_err: bool,

    /// The broadcast URL that the solver is running on. This type allows for the url to be
    /// optional, where `None` means to just return whatever `solver_results` contains, and `Some`
    /// means that we have a `FakeSolver` instance available to query.
    pub broadcast_url: Option<Url>,
}

#[async_trait]
impl<TYPES: NodeType> AuctionResultsProvider<TYPES> for TestAuctionResultsProvider {
    type AuctionResult = TestAuctionResult;

    /// Mock fetching the auction results, with optional error injection to simulate failure cases
    /// in the solver.
    async fn fetch_auction_result(
        &self,
        view_number: TYPES::Time,
    ) -> Result<Vec<Self::AuctionResult>> {
        if let Some(url) = &self.broadcast_url {
            let resp =
                reqwest::get(url.join(&format!("/v0/api/auction_results/{}", *view_number))?)
                    .await?
                    .json::<Vec<TestAuctionResult>>()
                    .await?;

            Ok(resp)
        } else {
            if self.should_return_err {
                bail!("Something went wrong")
            }

            // Otherwise, return our pre-made results
            Ok(self.solver_results.clone())
        }
    }
}
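
As a rough illustration of how a logical test might pre-cook results or force an error, here is a minimal, synchronous sketch that uses only the standard library. `MockAuctionResult` and `MockProvider` are hypothetical stand-ins for `TestAuctionResult` and `TestAuctionResultsProvider`. The `broadcast_url` branch, the async trait, and the `anyhow` error type are deliberately left out, so this models only the `None` case above and is not the code in this PR.

```rust
/// Hypothetical stand-in for `TestAuctionResult` (assumed to carry a builder URL).
#[derive(Clone, Debug, PartialEq)]
pub struct MockAuctionResult {
    pub url: String,
}

/// Hypothetical, simplified stand-in for `TestAuctionResultsProvider`.
/// Only the pre-cooked results and the error flag are modeled.
#[derive(Clone, Debug, Default)]
pub struct MockProvider {
    /// Results the test wants the "solver" to return.
    pub solver_results: Vec<MockAuctionResult>,
    /// When true, every fetch fails.
    pub should_return_err: bool,
}

impl MockProvider {
    /// Mirrors the `else` branch of `fetch_auction_result`: fail if asked to,
    /// otherwise hand back a copy of the pre-made results.
    pub fn fetch_auction_result(
        &self,
        _view_number: u64,
    ) -> Result<Vec<MockAuctionResult>, String> {
        if self.should_return_err {
            return Err("Something went wrong".to_string());
        }
        Ok(self.solver_results.clone())
    }
}
```

A test would then build a `MockProvider` with whatever `solver_results` it needs, or set `should_return_err` to exercise the failure path, without starting any server.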

Since HotShot communicates via the trait, we can hide whatever behavior we'd like behind it. We define a custom server type so that integration tests handle realistic results, including failures, instead of just the happy path. This is accomplished via the Fake Solver API:

    /// If a random fault event happens, what fault should we send?
    #[must_use]
    pub fn should_fault(&self) -> Option<FakeSolverFaultType> {
        if rand::random::<f32>() < self.error_pct {
            // Spin a random number over the fault types
            if rand::random::<f32>() < 0.5 {
                return Some(FakeSolverFaultType::InternalServerFault);
            }

            return Some(FakeSolverFaultType::TimeoutFault);
        }

        None
    }

    /// Dumps back the builders, with non-deterministic faults if the `error_pct` field
    /// is nonzero.
    ///
    /// # Errors
    /// Returns an error if `should_fault` selects `FakeSolverFaultType::InternalServerFault`.
    pub fn dump_builders(&self) -> Result<Vec<TestAuctionResult>, ServerError> {
        if let Some(fault) = self.should_fault() {
            match fault {
                FakeSolverFaultType::InternalServerFault => {
                    return Err(ServerError {
                        status: tide_disco::StatusCode::INTERNAL_SERVER_ERROR,
                        message: "Internal Server Error".to_string(),
                    });
                }
                FakeSolverFaultType::TimeoutFault => {
                    // Sleep for the preconfigured 1 second timeout interval
                    thread::sleep(SOLVER_MAX_TIMEOUT_S);
                }
            }
        }

        // Now just send the builder urls
        Ok(self
            .available_builders
            .iter()
            .map(|url| TestAuctionResult { url: url.clone() })
            .collect())
    }

Here we define an arbitrary fault rate. When it is hit, we choose between a timeout (which would exceed the HotShot timeout limit) and an internal server error. Both are considered feasible error paths, but we handle errors in a blanket way that simply returns some type of anyhow::Error. That error signals HotShot to move on and propose an empty block, since no Solver result is available for the requested view.
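
The fault selection and the blanket error handling described above can be sketched with the standard library alone. This is a hypothetical variation, not what the PR implements: `FaultKind`, `pick_fault`, and `builders_or_empty` are illustrative names, the random draws are passed in as arguments instead of coming from `rand::random`, and "propose an empty block" is modeled here simply as an empty list of builder URLs.

```rust
/// Hypothetical mirror of `FakeSolverFaultType`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FaultKind {
    InternalServerFault,
    TimeoutFault,
}

/// Same decision logic as `should_fault`, but the two random draws are
/// parameters (both expected in `[0.0, 1.0)`), so a test can force a branch.
pub fn pick_fault(error_pct: f32, fault_roll: f32, kind_roll: f32) -> Option<FaultKind> {
    if fault_roll < error_pct {
        // A fault fires; split evenly between the two fault kinds.
        if kind_roll < 0.5 {
            return Some(FaultKind::InternalServerFault);
        }
        return Some(FaultKind::TimeoutFault);
    }
    None
}

/// Caller-side blanket handling: any error is treated as "no solver result
/// for this view", modeled as an empty list (an assumption of this sketch).
pub fn builders_or_empty<E>(result: Result<Vec<String>, E>) -> Vec<String> {
    result.unwrap_or_default()
}
```

Passing the draws in keeps the decision a pure function, so both fault branches can be covered deterministically; the PR's version keeps the randomness inside `should_fault`.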

This PR does not:

Key places to review: