skalenetwork / skaled

Running more than 20 production blockchains, SKALED is an Ethereum-compatible, high-performance C++ Proof-of-Stake client with tools and libraries. Uses SKALE consensus as its blockchain consensus core. Includes a dynamic Oracle. Implements file storage and retrieval as an EVM extension.
https://skale.network
GNU General Public License v3.0

Call to the latest block throws AttemptToReadFromStateInThePast error #1956

Open DmytroNazarenko opened 2 months ago

DmytroNazarenko commented 2 months ago

Description

An eth_call against the latest block returned the error Invalid RPC parameters:

>>> curl https://mainnet.skalenodes.com/v1/light-vast-diphda -X POST -H "Content-Type: application/json"   --data '{"method":"eth_call","params":[{"to":"0xd2AAa00100000000000000000000000000000000","data":"0x999ab9aa000000000000000000000000000000000000000000000000000000000000002000000000000000000000000000000000000000000000000000000000000000166861756e74696e672d6465766f7465642d64656e656200000000000000000000"}, "latest"],"id":1,"jsonrpc":"2.0"}'
<<< {"error":{"code":-32004,"message":"Invalid RPC parameters."},"id":1,"jsonrpc":"2.0"}

The Invalid RPC parameters error was caused by an exception thrown inside skaled, AttemptToReadFromStateInThePast:

  | Aug 9, 2024 @ 13:11:44.907 | 2024-08-09 12:11:44.506256   Current state version is 4409055 but stored version is 4409063
  | Aug 9, 2024 @ 13:11:44.907 | Dynamic exception type: boost::exception_detail::clone_impl<skale::error::AttemptToReadFromStateInThePast> 
  | Aug 9, 2024 @ 13:11:44.907 | 2024-08-09 12:11:44.506378   exception in client call(2):/home/s5/actions-runner-3/_work/skaled/skaled/libskale/State.cpp(442): Throw in function dev::eth::Account* skale::State::account(const Address&) 

Need to investigate and fix the problem

Environment

Logs

Discover search [2024-08-09T13_50_21.265+01_00].csv

dimalit commented 2 months ago

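Root cause: a race condition between these two lines. If a block is committed after preSeal() returns but before the read-only copy is created, the copy's version no longer matches the stored version: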

        Block temp = preSeal();
        State readStateForLock = temp.mutableState().createStateReadOnlyCopy();
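
The failure mode can be illustrated with a toy model. The sketch below is not skaled code: ToyDb, ToyState and StaleReadError are hypothetical names, and it assumes (as the log message suggests) that a read compares the version captured by the state handle with the version currently stored. It replays the problematic interleaving sequentially, so it does not depend on thread timing.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for skale::error::AttemptToReadFromStateInThePast.
struct StaleReadError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Toy shared storage: account nonces plus a version bumped on every commit.
struct ToyDb {
    std::mutex mutex;
    std::uint64_t storedVersion = 0;
    std::map<std::string, std::uint64_t> nonces;
};

// Toy state handle: remembers the stored version it was created against.
class ToyState {
public:
    explicit ToyState(std::shared_ptr<ToyDb> db) : db_(std::move(db)) {
        std::lock_guard<std::mutex> lock(db_->mutex);
        currentVersion_ = db_->storedVersion;
    }

    // Writer path: modify the data and bump the stored version.
    void incNonceAndCommit(const std::string& addr) {
        std::lock_guard<std::mutex> lock(db_->mutex);
        ++db_->nonces[addr];
        currentVersion_ = ++db_->storedVersion;
    }

    // Reader path: refuse to read if a commit landed after this handle was created.
    std::uint64_t getNonce(const std::string& addr) const {
        std::lock_guard<std::mutex> lock(db_->mutex);
        if (currentVersion_ != db_->storedVersion) {
            throw StaleReadError("Current state version is " +
                                 std::to_string(currentVersion_) +
                                 " but stored version is " +
                                 std::to_string(db_->storedVersion));
        }
        auto it = db_->nonces.find(addr);
        return it == db_->nonces.end() ? 0 : it->second;
    }

private:
    std::shared_ptr<ToyDb> db_;
    std::uint64_t currentVersion_ = 0;
};

// Replays the interleaving: handle created, commit lands, then the handle reads.
bool staleReadThrows() {
    auto db = std::make_shared<ToyDb>();
    ToyState reader(db);              // version captured here
    ToyState writer(db);
    writer.incNonceAndCommit("aa");   // a transaction commits inside the window
    try {
        reader.getNonce("aa");
    } catch (const StaleReadError&) {
        return true;
    }
    return false;
}
```

In this model staleReadThrows() returns true, while a handle created after the commit reads normally. The real State class has more moving parts, so this only shows the shape of the race.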

Way to reproduce:

  1. Add a sleep between the two lines:

        Block temp = preSeal();
        LOG( m_loggerDetail ) << "got preSeal: version = " << *( temp.state().m_storedVersion );
    
        this_thread::sleep_for( 2s );
    
        // TODO there can be race conditions between prev and next line!
        State readStateForLock = temp.mutableState().createStateReadOnlyCopy();
  2. Send an eth_call, then immediately send a transaction:
    
    2024-08-14 12:08:44.504968   Current state version is 2 but stored version is 3
    2024-08-14 12:08:44.505041   exception in client call(2):/home/dimalit/skaled/libskale/State.cpp(457): Throw in function dev::eth::Account* skale::State::account(const Address&)
    Dynamic exception type: boost::exception_detail::clone_impl<skale::error::AttemptToReadFromStateInThePast>

    2024-08-14 12:08:44.505370 http://127.0.0.1:55250 <<< {"error":{"code":-32004,"message":"Invalid RPC parameters."},"id":1,"jsonrpc":"2.0"}
    2024-08-14 12:08:44.505459 Performance warning: 2.002512 seconds execution time for eth_call call with id=1 when called from origin http://127.0.0.1:55250 through server with index=0

dimalit commented 2 months ago

Unit test that reproduces the problem:


BOOST_AUTO_TEST_CASE( RaceOnVersion ) {
    Address addr{"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"};
    State base( 0 );

    State copy = base;

    thread writer([&base, addr](){
        State w = base.createStateModifyCopy();
        w.incNonce( addr );
        w.commit(dev::eth::CommitBehaviour::RemoveEmptyAccounts );
    });

    this_thread::sleep_for(1s);
    BOOST_REQUIRE_THROW( copy.getNonce(addr), skale::error::AttemptToReadFromStateInThePast );
    writer.join();
}

(StateUnitTests.cpp)

dimalit commented 2 months ago

Test with snapshot-based State:

BOOST_AUTO_TEST_CASE( RaceOnVersion ) {
    Address addr{ "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" };
    State base( 0, "/tmp/state", h256() );
    base.createReadOnlyStateDBSnap( 0 );

    State copy = base.createReadOnlySnapBasedCopy();

    thread writer( [&base, addr]() {
        base.incNonce( addr );
        base.commit( dev::eth::CommitBehaviour::RemoveEmptyAccounts );
    } );

    this_thread::sleep_for( 1s );
    copy.getNonce( addr );
    writer.join();
}

valgrind --tool=helgrind ./testeth -t StateUnitTests/RaceOnVersion -- --verbosity 3 --express

(no errors reported)
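
Why a snapshot-based copy would not hit the version check can be shown with another toy model. This is only a sketch under assumptions: LiveStore and Snapshot are hypothetical names, and the copy-on-write map stands in for whatever createReadOnlyStateDBSnap actually does in skaled. The point it illustrates is that the data and the version are captured together in one critical section, so later commits cannot make the reader inconsistent.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

using NonceMap = std::map<std::string, std::uint64_t>;

// Toy live store: every commit publishes a new immutable map (copy-on-write).
class LiveStore {
public:
    // Immutable view: data and version were captured at the same instant.
    struct Snapshot {
        std::shared_ptr<const NonceMap> data;
        std::uint64_t version;

        std::uint64_t getNonce(const std::string& addr) const {
            auto it = data->find(addr);
            return it == data->end() ? 0 : it->second;
        }
    };

    // Writer path: build the next map, then publish it and bump the version.
    void incNonceAndCommit(const std::string& addr) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto next = std::make_shared<NonceMap>(*current_);
        ++(*next)[addr];
        current_ = std::move(next);
        ++version_;
    }

    // Reader path: take the data pointer and the version under one lock.
    Snapshot snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return Snapshot{current_, version_};
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const NonceMap> current_ = std::make_shared<NonceMap>();
    std::uint64_t version_ = 0;
};
```

A snapshot taken before a commit keeps returning the old nonce and the old version, and a snapshot taken afterwards sees the new ones. Neither needs a version check at read time.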

DmytroNazarenko commented 2 months ago

Per @dimalit: the problem should be fixed by https://github.com/skalenetwork/skaled/issues/1545