NethermindEth / nethermind

A robust execution client for Ethereum node operators.
https://nethermind.io/nethermind-client
GNU General Public License v3.0
1.27k stars 438 forks source link

State sync stuck at 100% #4458

Closed dcposch closed 8 months ago

dcposch commented 2 years ago

Describe the bug

I'm unable to get my Nethermind node to sync properly.

History:

It appears to be stuck at 100% completion on state sync?

Aug 24 01:15:52 dce-1T Nethermind.Runner[33442]: 2022-08-24 01:15:52.0927|State Sync 00.21:11:52 | ~100.00% | 109388.08MB / ~99660.00MB | branches: 0.00% | kB/s:     0 | accounts 188272098 | nodes 1214166525 | diagnostics: 0.145.19ms

Zooming out from the sync bug--it would be very helpful if Nethermind had a master status that prints, say, every 10 seconds. If synced and healthy, print that. If not, print exactly what it's currently attempting to do to get in sync.

To Reproduce Steps to reproduce the behavior:

  1. Rent a server with 1T SSD.
  2. Install Ubuntu 22.04
  3. Install latest Nethermind via apt per these instructions.

Expected behavior Syncs mainnet.

Version

dcposch commented 2 years ago

Still stuck at block 14990417.

Changing state Disconnected to None at processed:14990417|state:14990417|block:15411005|header:15411240|chain difficulty:57195585512940781640941|peer block:15379000|peer total difficulty:56814286654281214516990

Context:

Aug 26 00:03:10 dce-1T Nethermind.Runner[94098]: 2022-08-26 00:03:10.6900|Waiting for peers... 21s
Aug 26 00:03:11 dce-1T Nethermind.Runner[94098]: 2022-08-26 00:03:11.6894|Waiting for peers... 22s
Aug 26 00:03:11 dce-1T Nethermind.Runner[94098]: 2022-08-26 00:03:11.8168|Changing state Disconnected to None at processed:14990417|state:14990417|block:15411005|header:15411240|chain difficulty:57195585512940781640941|peer block:15379000|peer total difficulty:56814286654281214516990
Aug 26 00:03:11 dce-1T Nethermind.Runner[94098]: 2022-08-26 00:03:11.8168|Sync mode changed from Disconnected to None
dcposch commented 2 years ago

It looks like a number of nodes are stuck at 14990417

Very strange.

Nethermind does look like it's successfully downloading newer blocks, then downloading state for those newer blocks.

1701341 Aug 29 11:27:18 dce-1T Nethermind.Runner[200183]: 2022-08-29 11:27:18.1986|Branch sync progress (do not extrapolate): 100.00% of block 15433302
1701342 Aug 29 11:27:18 dce-1T Nethermind.Runner[200183]: ++1+++++++++1++++++1+1+++++++++1+1+++++++1+++++++++++++1++++++++
1701343 Aug 29 11:27:18 dce-1T Nethermind.Runner[200183]: ++1+++++++++1++111++++++++11++++++++++++++++++++++++++1+++++++++
1701344 Aug 29 11:27:18 dce-1T Nethermind.Runner[200183]: +++++++++++11+++++++++++++++++++++++1++++++++1++++++1++++++11+++
1701345 Aug 29 11:27:18 dce-1T Nethermind.Runner[200183]: ++++++++++++++++++++++++1+++++++++++++++++++1++++++1++++++++++++
1701346 Aug 29 11:27:18 dce-1T Nethermind.Runner[200183]: 2022-08-29 11:27:18.1986|Saving root 0x3dc50f0c3d4cb82039361502955d6de6861f309f8395e02d845c368cdf4804a4 of 15433302
1701347 Aug 29 11:27:18 dce-1T Nethermind.Runner[200183]: 2022-08-29 11:27:18.2108|Setting state sync state root to 15433302 0x3dc50f0c3d4cb82039361502955d6de6861f309f8395e02d845c36
1701348 Aug 29 11:27:18 dce-1T Nethermind.Runner[200183]: 2022-08-29 11:27:18.8576|Changing state StateNodes to Full at processed:14990417|state:15433302|block:15433268|header:15433
1701349 Aug 29 11:27:18 dce-1T Nethermind.Runner[200183]: 2022-08-29 11:27:18.8576|Sync mode changed from StateNodes to Full
1701350 Aug 29 11:27:32 dce-1T Nethermind.Runner[200183]: 2022-08-29 11:27:32.8208|Peers | with known best block: 99 | all: 99 |

However, it still reports block 14990417 as the tip.

What's strange is a number of peers are also stuck on this block. Those peers appear to be running geth, not nethermind.

Highlighted below:

image

If you need RPC (or other) access to my node for debugging, let me know.

MarekM25 commented 2 years ago

Hmmm I'm wondering if it is possible that it is connected to gray glacier: https://blog.ethereum.org/2022/06/16/gray-glacier-announcement

Would you mind verifying your chainspec (setting Init.ChainSpecPath)?

dcposch commented 2 years ago

I didn't override that setting. Looking at /usr/share/nethermind/configs/mainnet.cfg, I see that it's defaulting to chainspec/foundation.json.

Looking at that file, I can see it has the terminalTotalDifficulty set, which means it must be a recent configuration corresponding to my recent Nethermind version (see exact version string above).

$ less chainspec/foundation.json
...
    "terminalTotalDifficulty": "C70D808A128D7380000"
dcposch commented 2 years ago

Still stuck at 14990417.

My chainspec looks correct to me. The path is /chainspec/foundation.json:

image
dcposch commented 2 years ago

@MarekM25 I'm going to delete my deata directory and sync from scratch. I've saved logs in case you want them for inspection.

Nethermind is receiving new blocks, but unable to progress past block 14990417. It seems state sync is failing:

Sep 12 19:24:46 dce-1T Nethermind.Runner[776631]: 2022-09-12 19:24:44.7461|Discovered new block 15522104 19:24:30 (0x431977...aca4d5), tx count: 127 miner 0x8b4de256180cfec54c436a470af50f9ee2813dbb, sent by [Peer|eth66|15522104| 100.37.162.177:30303]
Sep 12 19:24:46 dce-1T Nethermind.Runner[776631]: 2022-09-12 19:24:45.9916|State Sync 06.07:13:41 | ~100.00% | 159778.43MB / ~101060.00MB | branches: 0.00% | kB/s:   104 | accounts 205258631 | nodes 1390523791 | diagnostics: 8.281.30ms

Not sure what the perennial branches: 0.00% is about.