near / near-workspaces-rs

Write tests once, run them both on NEAR TestNet and a controlled NEAR Sandbox local environment via Rust
84 stars 48 forks source link

invalid nonce on parallel tests via `transact()` #254

Open Tarnadas opened 1 year ago

Tarnadas commented 1 year ago

Hey,

in my integration tests I have couple of tests that send very many transactions in parallel. It's in fact so many, that I got an OS error, because it could not make that many requests in parallel, so I at most send 1k tx now in parallel and then wait until some of them have finished to send the next batch.

I found some issues (#163, #228), but am not sure how they relate. I use transact() instead of transact_async(). Honestly I have noticed the _async function too late, which is why I still use transact().

The problem is that the bug is not easily reproducible, because it only happens about 1 out of 10 times maybe when running one of these long running tests. The error looks as follows:

Error: handler error: [An error happened during transaction execution: InvalidNonce { tx_nonce: 17002007, ak_nonce: 17002007 }]

To make it run in parallel I spawn async tasks via tokio and put them into a task queue like this:

    let mut tasks: Vec<JoinHandle<anyhow::Result<()>>> = vec![];
    for item in items {
        tasks.push(tokio::spawn(async move {
            // ...
        }));

        if tasks.len() >= 1_000 {
            for task in tasks.drain(..10) {
                task.await??;
            }
        }
    }

    for task in tasks {
        task.await??;
    }

I might be able to give you the full repo soon™, since we're about to open source it at some point.

If you wan't to I can try to make it run with transact_async(), but it's quite some refactoring that I need to do for this to work, because in my integration tests it checks dynamically whether a transaction can be async or not based on a scenario I give it as an input, which is defined in a struct like so:

pub struct TestCase<'a> {
    pub base_token: Token<'a>,
    pub quote_token: Token<'a>,
    pub accounts: HashMap<AccountId, TestAccount>,
    pub orders: Vec<ExecutableOrderWithOutcome>,
}

pub struct ExecutableOrderWithOutcome {
    pub sender_id: AccountId,
    pub order: ExecutableOrder,
    pub outcome: Option<ExpectedOutcome>,
}

If ExecutableOrderWithOutcome#outcome is None, then it can run in parallel, otherwise it waits until all tx have finished up until that point

ChaoticTempest commented 1 year ago

So a couple questions first:

Also, the following processes transactions sequentially anyways for tokio spawned tasks:

    for task in tasks {
        task.await??;
    }

You can try using futures::future::join_all(tasks).await; instead for polling the tasks in intervals, and see if that helps. It's kinda weird that bug is presenting itself for sequential tasks

Tarnadas commented 1 year ago

I'm still @workspaces 0.6.1, so that might possibly fix it. Sorry for not checking that I'm on the latest version before opening a ticket.

I'm on Linux Manjaro 22.

I did not specify worker threads, so it should be 16, 1 for each CPU.

The tasks are not run sequentially, since they are spawned via tokio::spawn or am I missing something? Otherwise a test would last forever.

ChaoticTempest commented 1 year ago

I'm still @workspaces 0.6.1, so that might possibly fix it. Sorry for not checking that I'm on the latest version before opening a ticket.

No worries, let me know if that fixes things. If not I'll take a deeper look when i get the chance.

The tasks are not run sequentially, since they are spawned via tokio::spawn or am I missing something? Otherwise a test would last forever.

nevermind about this one. They should be running in the background.

Tarnadas commented 1 year ago

So with workspaces >= 0.7 this error should never be displayed where tx_nonce and ak_nonce are the same? Because I just got this failure after workspaces update

Error: unable to broadcast the transaction to the network

Caused by:
    handler error: [An error happened during transaction execution: InvalidNonce { tx_nonce: 68000016, ak_nonce: 68000016 }]
ChaoticTempest commented 1 year ago

@Tarnadas hmm, now that's quite peculiar. If you don't mind, can I take a look at your repo to see what's up? And this is still for calls to transact correct?

Tarnadas commented 1 year ago

Hey, sorry for late reply. I'm still looking into making this open source and will keep you updated once it's done.

Yes this is using transact

ghost commented 10 months ago

Does this issue still persist @Tarnadas ?