eqlabs / pathfinder

A Starknet full node written in Rust
https://eqlabs.github.io/pathfinder/

Node X at height Y is missing #2170

Open fracek opened 4 weeks ago

fracek commented 4 weeks ago

Hello, I'm syncing a new node from a fresh install and it eventually stops with the following error. This seems related to #2110 but in this case there is no reorg.

I had the same issue syncing a node starting from the snapshot provided.

2024-08-14T03:58:22  INFO 🏁 Starting node. version="v0.14.1"
2024-08-14T03:58:22  INFO No database migrations required current_revision=62
2024-08-14T03:58:22  INFO Merkle trie pruning enabled history_kept=20
2024-08-14T03:58:22  INFO Database migrated. location="/data/mainnet.sqlite"
2024-08-14T03:58:22  INFO Cleaning up state trie
2024-08-14T03:58:22  INFO 📡 HTTP-RPC server started on: 0.0.0.0:9545
2024-08-14T03:58:23  INFO L1 sync updated to block 668107
2024-08-14T03:58:23 ERROR Sync consumer task terminated with an error reason=Update L2 state to 38912

Caused by:
    0: Updating Starknet state
    1: Update contract storage tree
    2: Node 4367934 at height 18 is missing
2024-08-14T03:58:23  INFO Channel closed, exiting latest poll task
2024-08-14T03:58:23 ERROR Sync process ended unexpected with: Err(Sync process terminated)
Error: Unexpected shutdown

I'm running pathfinder 0.14.1 on Kubernetes as a StatefulSet. The only flags I set are the Ethereum URL and the gateway API key.

kkovaacs commented 4 weeks ago

> I had the same issue syncing a node starting from the snapshot provided.

Do you mean that, starting from our most recent snapshot (mainnet_0.13.0_649680_pruned.sqlite.zst right now), you got the same "Node missing" error after starting the sync?

Does it always fail at the exact same block height?
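
One way to check this across runs is to pull the failing block number and the missing node out of each log and compare them. A minimal sketch, assuming the log format shown in the report above; `extract_failure` is a hypothetical helper for illustration, not part of pathfinder, and other versions may word these lines differently:

```python
import re

# Patterns match the error lines as they appear in the log above.
BLOCK_RE = re.compile(r"reason=Update L2 state to (\d+)")
NODE_RE = re.compile(r"Node (\d+) at height (\d+) is missing")


def extract_failure(log_text):
    """Return (block, node, trie_height) for the first failure found, else None."""
    block = BLOCK_RE.search(log_text)
    node = NODE_RE.search(log_text)
    if block is None or node is None:
        return None
    return int(block.group(1)), int(node.group(1)), int(node.group(2))


sample = """\
2024-08-14T03:58:23 ERROR Sync consumer task terminated with an error reason=Update L2 state to 38912

Caused by:
    0: Updating Starknet state
    1: Update contract storage tree
    2: Node 4367934 at height 18 is missing
"""

print(extract_failure(sample))  # (38912, 4367934, 18)
```

If the tuple is identical on every attempt, the failure is deterministic for that block; if it changes between runs, that points more toward the environment or storage.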

kkovaacs commented 4 weeks ago

I tried to reproduce this but it seems that I'm not really able to. I've started syncing mainnet from scratch and am now at block 54k without any issues (only Ethereum URL / GW API key was set).

I'm wondering if this might somehow be related to your Kubernetes setup. Could you try reproducing this in a VM? There must be some difference in how we're running pathfinder, and it would be great to find out what it is.

fracek commented 4 weeks ago

I will try to sync a node again with RUST_LOG=pathfinder=debug and report back. I had the same issue both when restoring from the snapshot and when syncing from scratch, so maybe it is related to my setup.

kkovaacs commented 3 weeks ago

@fracek BTW: what exactly is your setup? What VM / storage are you running on?

kkovaacs commented 3 weeks ago

Also: is the behavior you're experiencing specific to Pathfinder 0.14.1?