near / near-workspaces-rs

Write tests once, run them both on NEAR TestNet and a controlled NEAR Sandbox local environment via Rust
82 stars 48 forks source link

Potential issues with `clone` and parallel execution. #359

Open ruseinov opened 3 months ago

ruseinov commented 3 months ago

I have tried parallelizing transaction execution in two different manners.

  1. By executing a set of transactions that requires cloning Account and Contract.

use near_gas::NearGas; use near_primitives_core::types::Gas; use near_workspaces::types::GasMeter; use near_workspaces::Account; use serde::{Deserialize, Serialize}; use std::collections::HashMap; use strum::IntoEnumIterator; use strum_macros::EnumIter; use tokio::join; use tokio::task::JoinSet;

[derive(Serialize, Deserialize, EnumIter, Copy, Clone)]

[serde(crate = "near_sdk::serde")]

pub enum Collection { IterableSet, IterableMap, UnorderedSet, UnorderedMap, LookupMap, LookupSet, TreeMap, Vector, }

[derive(Serialize, Deserialize)]

[serde(crate = "near_sdk::serde")]

pub enum Op { Insert(u32), Remove(u32), Flush, Contains(u32), Iter(usize), }

[tokio::test]

// TODO: Parallelize async fn combined_test() -> anyhow::Result<()> { let worker = near_workspaces::sandbox().await?; let contract = worker.dev_deploy(include_bytes!("test-contracts/store/res/store.wasm")).await?; let account: Account = worker.dev_create_account().await?;

let mut collection_set = JoinSet::new();
// insert
for col in Collection::iter() {
    let account = account.clone();
    let contract = contract.clone();
    collection_set.spawn(async move {
        let mut total_gas: u64 = 0;
        let mut futures = JoinSet::new();

        for _ in 0..=16 {
            let txn =
                account.call(contract.id(), "exec").args_json((col, Op::Insert(1))).transact();
            futures.spawn(txn);
        }
        while let Some(res) = futures.join_next().await {
            total_gas += res??.total_gas_burnt.as_gas();
        }
        Ok::<_, anyhow::Error>(total_gas)
    });
}

while let Some(total_gas) = collection_set.join_next().await {
    let total_gas = total_gas??;
    assert!(total_gas < NearGas::from_tgas(100).as_gas(), "performance regression");
    assert!(
        total_gas > NearGas::from_tgas(90).as_gas(),
        "not enough Tgas consumed, adjust the number of iterations to spot regressions"
    )
}

Ok(())

}


This breaks and yields:

Error: unable to broadcast the transaction to the network

Caused by: handler error: [An error happened during transaction execution: InvalidNonce { tx_nonce: 12000019, ak_nonce: 12000033 }]

I have a suspicion that this has to do with the fact that `account` is cloned, but it does not really support parallel calls as expected. However, since it does not require a mutable reference - it could just be too many requests per second. In any case this behaviour is not expected and needs to be investigated.

2. By executing a set of transactions without the need to clone `Account` and `Contract`. 
```rust
use near_gas::NearGas;
use near_primitives_core::types::Gas;
use near_workspaces::types::GasMeter;
use near_workspaces::Account;
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use strum::IntoEnumIterator;
use strum_macros::EnumIter;
use tokio::join;
use tokio::task::JoinSet;

#[derive(Serialize, Deserialize, EnumIter, Copy, Clone)]
#[serde(crate = "near_sdk::serde")]
pub enum Collection {
    IterableSet,
    IterableMap,
    UnorderedSet,
    UnorderedMap,
    LookupMap,
    LookupSet,
    TreeMap,
    Vector,
}

#[derive(Serialize, Deserialize)]
#[serde(crate = "near_sdk::serde")]
pub enum Op {
    Insert(u32),
    Remove(u32),
    Flush,
    Contains(u32),
    Iter(usize),
}
#[tokio::test]
// TODO: Parallelize
async fn combined_test() -> anyhow::Result<()> {
    let worker = near_workspaces::sandbox().await?;
    let contract = worker.dev_deploy(include_bytes!("test-contracts/store/res/store.wasm")).await?;
    let account: Account = worker.dev_create_account().await?;

    // let mut collection_set = JoinSet::new();
    // insert
    for col in Collection::iter() {
        let account = account.clone();
        let contract = contract.clone();
        // collection_set.spawn(async move {
        let mut total_gas: u64 = 0;
        let mut futures = JoinSet::new();

        for _ in 0..=16 {
            let txn =
                account.call(contract.id(), "exec").args_json((col, Op::Insert(1))).transact();
            futures.spawn(txn);
        }
        while let Some(res) = futures.join_next().await {
            total_gas += res??.total_gas_burnt.as_gas();
        }
        assert!(total_gas < NearGas::from_tgas(100).as_gas(), "performance regression");
        assert!(
            total_gas > NearGas::from_tgas(90).as_gas(),
            "not enough Tgas consumed, adjust the number of iterations to spot regressions"
        );
        // Ok::<_, anyhow::Error>(total_gas)
        // });
    }
    //
    // while let Some(total_gas) = collection_set.join_next().await {
    //     let total_gas = total_gas??;
    //     assert!(total_gas < NearGas::from_tgas(100).as_gas(), "performance regression");
    //     assert!(
    //         total_gas > NearGas::from_tgas(90).as_gas(),
    //         "not enough Tgas consumed, adjust the number of iterations to spot regressions"
    //     )
    // }

    Ok(())
}

This works perfectly fine.

frol commented 3 months ago

Yeah, it is clearly a nonce collision which is a common thing in all the tooling we currently have in NEAR when you need to submit several concurrent transactions from a single account, you have to sign them with difference access keys to avoid such errors:

The best solution would be to fix it on the tooling side instead of letting developers to bump into these issues and rediscover them this way. There is also a proposal to solve it on the protocol level: https://github.com/near/NEPs/pull/522