near / stakewars-iv

Can't sync back to state on statelessnet #63

Closed gritsly closed 6 months ago

gritsly commented 6 months ago

Bug Report

Overview

I guess I must have missed a protocol update, as I got 0 chunks in the epoch that began on 14 March. I then updated and started getting these errors:

Mar 15 10:34:46 near03 neard[4196]: 2024-03-15T10:34:46.293768Z  INFO stats: State 3zT2P4gvcQwxxLaNHexPPSGshmn7PBoP1aqPhED32pyX[0: header][1: header][2: header][3: header][4: header] 36 peers ⬇ 455 kB/s ⬆ 82.2 kB/s 0.00 bps 0 gas/s CPU: 30%, Mem: 737 MB
Mar 15 10:34:52 near03 neard[4196]: 2024-03-15T10:34:52.767287Z ERROR metrics: Error when exporting postponed receipts count DB Not Found Error: BLOCK: 82wGTQMWaybqgeU8UrNPU9BwH4B4oJzPZhpsccYzuevS.
Mar 15 10:34:56 near03 neard[4196]: 2024-03-15T10:34:56.168572Z  WARN sync: State sync didn't download the state, sending StateRequest again shard_id=0 timeout_sec=60
Mar 15 10:39:03 near03 neard[4196]: 2024-03-15T10:39:03.122272Z ERROR client: Received an invalid block during state sync err=DBNotFoundErr("epoch block: FCwWorj9qH3T85twRGpc6UBV8FjtLHXhhgVULgMu5UY6") block_hash=J9MiVjmHpbGMUAfwqA1LbN4yzQvrXEpEG3drD5bg66Xr
Mar 15 10:39:03 near03 neard[4196]: 2024-03-15T10:39:03.122378Z ERROR client: Received an invalid block during state sync err=DBNotFoundErr("epoch block: FCwWorj9qH3T85twRGpc6UBV8FjtLHXhhgVULgMu5UY6") block_hash=J9MiVjmHpbGMUAfwqA1LbN4yzQvrXEpEG3drD5bg66Xr
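For anyone comparing their own node against this report, the symptoms in the excerpt above can be tallied from a saved log. This is only a rough sketch: the message fragments are copied from the lines above, while the function name and the symptom labels are made up for illustration and are not part of neard.

```python
from collections import Counter

# Message fragments copied verbatim from the log excerpt in this report.
# The dictionary keys are hypothetical labels chosen for this sketch.
SYMPTOMS = {
    "state_request_retry": "State sync didn't download the state",
    "invalid_block": "Received an invalid block during state sync",
    "db_not_found": "DB Not Found Error",
}


def count_symptoms(lines):
    """Count how many log lines contain each known symptom fragment."""
    counts = Counter()
    for line in lines:
        for name, fragment in SYMPTOMS.items():
            if fragment in line:
                counts[name] += 1
    return counts
```

Feeding it the lines of a log file (for example `count_symptoms(open(path))`, with `path` pointing at wherever your logs are stored) gives a quick view of whether the StateRequest retries and invalid-block errors keep recurring.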

Affected parties

My validator (gritsly), though I saw in the group that other people are experiencing the same issue.

Impact

No block/chunk production, can't sync back to state.

Reproduction steps

I tried pulling a fresh snapshot and restarting the node 3 times, but it ends up in the same situation every time.

I did wait for over 8h once and the situation did not improve.

Removing the ledger and pulling a snapshot while running the latest branch may reproduce this issue.

Question

Even though I built the software from the latest commit of the statelessnet branch, the version it reports is this:

near@near03:~/logs$ neard --version
neard (release trunk) (build 1.36.1-298-g984f6ad71) (rustc 1.76.0) (protocol 83) (db 38)

Is that ok?
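To make a version line like the one above easier to compare across nodes, it can be split into its parenthesised fields. This is a minimal sketch that assumes every field has the form `(key value)` as in the output shown; `parse_neard_version` is a hypothetical helper, not something neard provides. The build string resembles `git describe` output (a tag, a commit count after that tag, then `g` plus an abbreviated commit hash), which would be expected for a binary built from a branch rather than from a tagged release, though the thread does not confirm this.

```python
import re


def parse_neard_version(line):
    """Split the '(key value)' groups of a `neard --version` line into a dict."""
    fields = {}
    for group in re.findall(r"\(([^)]*)\)", line):
        # The first word is treated as the field name, the rest as its value.
        key, _, value = group.partition(" ")
        fields[key] = value
    return fields


version = parse_neard_version(
    "neard (release trunk) (build 1.36.1-298-g984f6ad71) "
    "(rustc 1.76.0) (protocol 83) (db 38)"
)
print(version["build"], version["protocol"], version["db"])
```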

gritsly commented 6 months ago

Setting state_sync_enabled: false and pulling a fresh snapshot made the node sync.
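For reference, the flag can also be flipped with a small script instead of by hand. This is a sketch under stated assumptions: that the node's settings live in a JSON file (commonly config.json in the neard home directory) and that state_sync_enabled is a top-level key in it. Neither detail is spelled out in this thread, so check your own file first. The helper name is hypothetical.

```python
import json
from pathlib import Path


def set_state_sync_enabled(config_path, enabled):
    """Set state_sync_enabled in a JSON config file and return the old value.

    Assumes the key is top-level in the file; adjust if your layout differs.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    previous = config.get("state_sync_enabled")
    config["state_sync_enabled"] = enabled
    # Rewrite the whole file with the updated value.
    path.write_text(json.dumps(config, indent=2) + "\n")
    return previous
```

A call such as `set_state_sync_enabled("/path/to/config.json", False)` (the path is a placeholder) would apply the workaround described here. The node needs a restart to pick up the change, and making a backup copy of the file beforehand is sensible.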

telezhnaya commented 6 months ago

This should be fixed now. @gritsly, could you please set state_sync_enabled: true again and confirm everything is fine?

gritsly commented 6 months ago

I put the flag back and restarted, and the node is working fine.

I'm not sure this is the right test, though: if my node was at the tip during the restart, does state sync even trigger?

Anyway, the node is validating, so I guess we can close this.