near / nearcore

Reference client for NEAR Protocol
https://near.org
GNU General Public License v3.0

[ProjectTracking] Recovery of archival node missing data #11895

Open Trisfald opened 1 month ago

Trisfald commented 1 month ago

Summary

In early 2024, archival nodes failed to persist a subset of data from hot to cold storage. As a consequence, after five epochs that data was deleted from the DB and lost.

The failure has two main root causes: erroneous manual operations on nodes and issues during the two resharding procedures.

Action plan

Trisfald commented 1 month ago

Problematic heights

Operational issue(s): block 109913255

First resharding: block 114580308

Second resharding: block 115185108

Known failing queries

Height 109913260

JSON query:

curl -X POST https://archival-rpc.mainnet.near.org \
        -H "Content-Type: application/json" \
        -H "Referer: https://beta.rpc.mainnet.near.org" \
        -d '
        { "id": "dontcare", "jsonrpc": "2.0", "method": "query", "params": { "account_id": "b001b461c65aca5968a0afab3302a5387d128178c99ff5b2592796963407560a", "block_id": 109913260, "request_type": "view_account" } }'

Storage query:

./neard view-state -t cold view-trie --shard-id 2 --shard-version 1 --max-depth 1000 --hash 36SkUU8tgetUtVL2a5JPwKB6F29yKBFjF5PFukZ8HRFH --from b001aea591ef68681e59a4149b1ab8bc56d8f22e34be24 --to b001c0de4c6929c5289b65044249830466ffea27680bc1 --format pretty --record-type account

Height 114580308

JSON query:

curl -X POST https://archival-rpc.mainnet.near.org \
        -H "Content-Type: application/json" \
        -H "Referer: https://beta.rpc.mainnet.near.org" \
        -d '
        { "id": "dontcare", "jsonrpc": "2.0", "method": "query", "params": { "account_id": "token2.near", "block_id": 114580308, "request_type": "view_account" } }'

Storage query:

./neard view-state -t cold view-trie --shard-id 4 --shard-version 2 --max-depth 1000 --hash Fe7oLHaqNq5kWnNkDdZatWRY8CRHzBvBKbeACt8JKQsr --from token1.near --to token3.near --format pretty --record-type account

Height 115185110

JSON query:

curl -X POST https://archival-rpc.mainnet.near.org \
        -H "Content-Type: application/json" \
        -H "Referer: https://beta.rpc.mainnet.near.org" \
        -d '
        { "id": "dontcare", "jsonrpc": "2.0", "method": "query", "params": { "account_id": "timpanic.tg", "block_id": 115185110, "request_type": "view_account" } }'

Storage query:

./neard view-state -t cold view-trie --shard-id 5 --shard-version 3 --max-depth 1000 --hash HucDNVVACPC59SQW9SSmfao5tNjqiaFgZ5mvUrW4xVr3 --from timp8b4kqpff.users.kaiching --to timpanium.tg --format pretty --record-type account
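
The three view_account queries above can be replayed in one pass to check whether a given node still fails on them. The following is a minimal sketch, not part of the original report: the function names build_payload, classify_response, and replay_all are hypothetical, and treating any response body that contains an "error" member as a failure is an assumed heuristic rather than a full JSON-RPC parser.

```shell
#!/usr/bin/env bash
# Sketch: replay the known failing view_account queries listed in this issue.
# RPC_URL defaults to the archival endpoint used in the report.
RPC_URL="${RPC_URL:-https://archival-rpc.mainnet.near.org}"

# Build the JSON-RPC body for a view_account query at a given block height.
build_payload() {
  local account_id="$1" block_id="$2"
  printf '{"id":"dontcare","jsonrpc":"2.0","method":"query","params":{"account_id":"%s","block_id":%s,"request_type":"view_account"}}' \
    "$account_id" "$block_id"
}

# Heuristic (assumption): a body containing an "error" member counts as FAIL.
classify_response() {
  case "$1" in
    *'"error"'*) echo "FAIL" ;;
    *) echo "OK" ;;
  esac
}

# Send each account/height pair from the report and print one status line per query.
replay_all() {
  local entry account_id block_id body
  for entry in \
    "b001b461c65aca5968a0afab3302a5387d128178c99ff5b2592796963407560a:109913260" \
    "token2.near:114580308" \
    "timpanic.tg:115185110"; do
    account_id="${entry%%:*}"
    block_id="${entry##*:}"
    body="$(curl -s -X POST "$RPC_URL" \
      -H "Content-Type: application/json" \
      -H "Referer: https://beta.rpc.mainnet.near.org" \
      -d "$(build_payload "$account_id" "$block_id")")"
    echo "$block_id $account_id $(classify_response "$body")"
  done
}
```

Nothing is sent until replay_all is called explicitly, for example after sourcing the file. Setting RPC_URL beforehand points the same checks at a different node.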

Other failing queries

curl -X POST https://archival-rpc.mainnet.near.org \
        -H "Content-Type: application/json" \
        -H "Referer: https://beta.rpc.mainnet.near.org" \
        -d '
{ "jsonrpc": "2.0", "id": "dontcare", "method": "query", "params": { "request_type": "call_function", "finality": "final", "account_id": "bisontrails.poolv1.near", "block_id": 114580308, "method_name": "get_reward_fee_fraction", "args_base64": "" }}'

curl -X POST https://archival-rpc.mainnet.near.org \
        -H "Content-Type: application/json" \
        -H "Referer: https://beta.rpc.mainnet.near.org" \
        -d '
{ "jsonrpc": "2.0", "id": "dontcare", "method": "query", "params": { "request_type": "call_function", "account_id": "consensus_finoa_00.poolv1.near", "block_id": 120308219, "method_name": "get_account", "args_base64": "eyJhY2NvdW50X2lkIjoicmVzdGFrZS5uZWFyIn0=" }}'

walnut-the-cat commented 1 week ago

Aug 30th report: