near / nearcore

Reference client for NEAR Protocol
https://near.org
GNU General Public License v3.0

MissingTrieValue in memtries-to-disktries switch nayduck test #11702

Open tayfunelmas opened 3 days ago

tayfunelmas commented 3 days ago

This happens when running the new nayduck test in https://github.com/near/nearcore/pull/11676, which does the following: 1) Start nodes with memtries enabled. 2) Restart nodes with memtries disabled. 3) Restart nodes with memtries enabled. At the last step, some nodes hit the following failure:

2024-07-02T12:11:04.717115Z  INFO neard: version="trunk" build="1.36.1-859-gd2ee23dca-modified" latest_protocol=82
2024-07-02T12:11:04.718228Z  INFO config: Validating Config, extracted from config.json...
2024-07-02T12:11:04.718774Z  WARN genesis: Skipped genesis validation
2024-07-02T12:11:04.718804Z  INFO config: Validating Genesis config and records. This could take a few minutes...
2024-07-02T12:11:04.719217Z  INFO config: All validations have passed!
2024-07-02T12:11:04.723420Z  INFO neard: Changing the config "/home/elmas/.near/test2/log_config.json". config=LogConfig { rust_log: None, verbose_module: None, opentelemetry: None }
2024-07-02T12:11:04.723711Z  INFO config: Validating Config, extracted from config.json...
2024-07-02T12:11:04.723758Z  INFO neard: Hot loading validator key /home/elmas/.near/test2/validator_key.json.
2024-07-02T12:11:04.723823Z  INFO near_o11y::reload: Updated the logging layer according to `log_config.json`
2024-07-02T12:11:04.723858Z  INFO db_opener: Opening NodeStorage path="/home/elmas/.near/test2/data" cold_path="none"
2024-07-02T12:11:04.723873Z  INFO db: Opened a new RocksDB instance. num_instances=1
2024-07-02T12:11:04.729873Z  INFO db: Closed a RocksDB instance. num_instances=0
2024-07-02T12:11:04.729892Z  INFO db_opener: The database exists. path=/home/elmas/.near/test2/data
2024-07-02T12:11:04.729900Z DEBUG db_opener: Ensure db kind is correct and set. path=/home/elmas/.near/test2/data archive=false which="Hot"
2024-07-02T12:11:04.729912Z  INFO db: Opened a new RocksDB instance. num_instances=1
2024-07-02T12:11:05.005512Z  INFO db: Closed a RocksDB instance. num_instances=0
2024-07-02T12:11:05.005554Z DEBUG db_opener: Ensure db version path=/home/elmas/.near/test2/data
2024-07-02T12:11:05.005578Z  INFO db: Opened a new RocksDB instance. num_instances=1
2024-07-02T12:11:05.008453Z  INFO db: Closed a RocksDB instance. num_instances=0
2024-07-02T12:11:05.008492Z  INFO db: Opened a new RocksDB instance. num_instances=1
2024-07-02T12:11:05.046372Z DEBUG metrics: Spawning the db metrics loop.
2024-07-02T12:11:05.046712Z DEBUG metrics: Spawning the trie metrics loop.
2024-07-02T12:11:05.046721Z DEBUG metrics: Starting the db metrics loop.
2024-07-02T12:11:05.047125Z DEBUG metrics: Starting the spawn metrics loop.
2024-07-02T12:11:05.047465Z DEBUG vm: path=/home/elmas/.near/test2/data/contracts opened a contract executable cache directory
2024-07-02T12:11:05.087239Z DEBUG runtime: The state snapshot is not available. err=STATE_SNAPSHOT_KEY
2024-07-02T12:11:05.087290Z DEBUG cold_store: Not spawning cold store because TrieChanges are not saved
2024-07-02T12:11:05.107504Z DEBUG chain: Computed genesis congestion info. shard_id=0 state_root=11111111111111111111111111111111 congestion_info=V1(CongestionInfoV1 { delayed_receipts_gas: 0, buffered_receipts_gas: 0, receipt_bytes: 0, allowed_shard: 0 })
2024-07-02T12:11:05.107572Z DEBUG chain: Computed genesis congestion info. shard_id=1 state_root=11111111111111111111111111111111 congestion_info=V1(CongestionInfoV1 { delayed_receipts_gas: 0, buffered_receipts_gas: 0, receipt_bytes: 0, allowed_shard: 1 })
2024-07-02T12:11:05.107701Z DEBUG chain: Computed genesis congestion info. shard_id=2 state_root=8YijkEqArew3cMs7MoeZbbnqPEP8LbCUN85qVb668V6i congestion_info=V1(CongestionInfoV1 { delayed_receipts_gas: 0, buffered_receipts_gas: 0, receipt_bytes: 0, allowed_shard: 2 })
2024-07-02T12:11:05.107832Z ERROR chain: Failed to get the genesis congestion infos. err=StorageError(MissingTrieValue(TrieStorage, ATp7pz7jQYAJm233fNfgX8ZNLadATGYiMAsVfxtLUfw8))
thread 'main' panicked at neard/src/cli.rs:564:14:
start_with_config: Storage Error: MissingTrieValue(TrieStorage, ATp7pz7jQYAJm233fNfgX8ZNLadATGYiMAsVfxtLUfw8)

Caused by:
    MissingTrieValue(TrieStorage, ATp7pz7jQYAJm233fNfgX8ZNLadATGYiMAsVfxtLUfw8)
stack backtrace:
   0: rust_begin_unwind
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:652:5
   1: core::panicking::panic_fmt
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/panicking.rs:72:14
   2: core::result::unwrap_failed
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/result.rs:1654:5
   3: core::result::Result<T,E>::expect
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/result.rs:1034:23
   4: neard::cli::RunCmd::run::{{closure}}
             at /home/elmas/near/GitHub/tayfunelmas/nearcore/neard/src/cli.rs:558:17
   5: <tokio::task::local::RunUntil<T> as core::future::future::Future>::poll::{{closure}}
             at /home/elmas/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.29.1/src/task/local.rs:923:42
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
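
For reference, the three-phase restart sequence described at the top of this comment can be sketched roughly as below. This is only an illustration, not the test from the PR: the helper names `with_memtries` and `restart_plan` are hypothetical, and it assumes memtries are toggled through a `load_mem_tries_for_tracked_shards` flag under `store` in `config.json`. Starting and stopping the nodes is left out.

```python
import copy

# Assumption: memtries are toggled via this key under "store" in config.json.
MEMTRIE_KEY = "load_mem_tries_for_tracked_shards"


def with_memtries(config: dict, enabled: bool) -> dict:
    """Return a copy of a node config with the memtrie flag set (hypothetical helper)."""
    updated = copy.deepcopy(config)
    updated.setdefault("store", {})[MEMTRIE_KEY] = enabled
    return updated


def restart_plan(base_config: dict) -> list[dict]:
    """Configs for the three phases: memtries on, then off, then on again."""
    return [with_memtries(base_config, enabled) for enabled in (True, False, True)]


if __name__ == "__main__":
    # Print the flag value a node would be (re)started with in each phase.
    for step, cfg in enumerate(restart_plan({"store": {}}), start=1):
        print(step, cfg["store"][MEMTRIE_KEY])
```

The failure above shows up in the third phase, when a node is restarted with the flag switched back on.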

tayfunelmas commented 2 days ago

Stack trace:

2024-07-02T16:02:54.873041Z DEBUG node_runtime::congestion_control: Loading DelayedReceiptQueue
thread 'main' panicked at core/store/src/trie/trie_storage.rs:560:10:
called `Option::unwrap()` on a `None` value
stack backtrace:
   0: rust_begin_unwind
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:652:5
   1: core::panicking::panic_fmt
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/panicking.rs:72:14
   2: core::panicking::panic
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/panicking.rs:146:5
   3: core::option::unwrap_failed
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/option.rs:1984:5
   4: core::option::Option<T>::unwrap
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/option.rs:932:21
   5: near_store::trie::trie_storage::read_node_from_db
             at /home/elmas/near/GitHub/tayfunelmas/nearcore/core/store/src/trie/trie_storage.rs:557:15
   6: near_store::trie::trie_storage::TrieCachingStorage::read_from_db
             at /home/elmas/near/GitHub/tayfunelmas/nearcore/core/store/src/trie/trie_storage.rs:566:9
   7: near_store::trie::trie_storage::TrieCachingStorage::read_for_shard_cache_miss
             at /home/elmas/near/GitHub/tayfunelmas/nearcore/core/store/src/trie/trie_storage.rs:441:20
   8: <near_store::trie::trie_storage::TrieCachingStorage as near_store::trie::trie_storage::TrieStorage>::retrieve_raw_bytes
             at /home/elmas/near/GitHub/tayfunelmas/nearcore/core/store/src/trie/trie_storage.rs:515:25
   9: near_store::trie::accounting_cache::TrieAccountingCache::retrieve_raw_bytes_with_accounting
             at /home/elmas/near/GitHub/tayfunelmas/nearcore/core/store/src/trie/accounting_cache.rs:120:24
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.