paritytech / substrate

Substrate: The platform for blockchain innovators
Apache License 2.0
8.39k stars 2.65k forks source link

Discarded block hash returned by getBlockHash #9660

Closed crystalin closed 3 years ago

crystalin commented 3 years ago

Issue

(This is for parachain and might be cumulus related)

Block hash returned by some nodes doesn't match the block on chain. (The block was best and finalized (long) before the rpc call was made.)

For Moonriver block height 427541, the block hash should be 0x3c376028c4f9b1b3f3fad730e8fd0a2965869724cacd0317686f72a7dccc56cd polkadotjs , but some nodes also return 0x5c2058a7083edfd6037cb866f1e2fd73fd903f62986e94d57aa2fb169d5233e5 polkadotjs

Querying the public node:

└────╼ curl 'https://rpc.moonriver.moonbeam.network' -d '{"id":1,"jsonrpc":"2.0","method":"chain_getBlockHash","params":[427541]'} -H 'Content-Type: application/json'
{"jsonrpc":"2.0","result":"0x5c2058a7083edfd6037cb866f1e2fd73fd903f62986e94d57aa2fb169d5233e5","id":1}

└────╼ curl 'https://rpc.moonriver.moonbeam.network' -d '{"id":1,"jsonrpc":"2.0","method":"chain_getBlock","params":["0x5c2058a7083edfd6037cb866f1e2fd73fd903f62986e94d57aa2fb169d5233e5"]'} -H 'Content-Type: application/json'
{"jsonrpc":"2.0","result":{"block":{"extrinsics":["...],"header":{"number":"0x68615","digest":{"logs":[...]},"extrinsicsRoot":"0xbb9d2c37c4a0d4d860db24f63592ceb8b4326e83b8c870d10390e806a9a8ae64","parentHash":"0x4d6419ffa9bc491edfaaff046e6444ae6b91a075acb3e6cf1bbbfd0133f56ada","stateRoot":"0xce084c8ce336291f3d809d117fe2b4d2630a4b59948c4107b266d52876e9804d"}},"justifications":null},"id":1}

└────╼ curl 'https://rpc.moonriver.moonbeam.network' -d '{"id":1,"jsonrpc":"2.0","method":"chain_getBlock","params":["0x3c376028c4f9b1b3f3fad730e8fd0a2965869724cacd0317686f72a7dccc56cd"]'} -H 'Content-Type: application/json'
{"jsonrpc":"2.0","result":{"block":{"extrinsics":[...],"header":{"number":"0x68615","digest":{"logs":[...]},"extrinsicsRoot":"0x28d2a3193513c3f8f23e9ec4bb98e4ffaec3a1121793eebd54e706c1d17123b6","parentHash":"0x4d6419ffa9bc491edfaaff046e6444ae6b91a075acb3e6cf1bbbfd0133f56ada","stateRoot":"0xe79958aae9ceaf0f94a28c7e597e42f811914299cbbca5ce5ec1e71ab10fc6ef"}},"justifications":null},"id":1} 

(Both blocks are available for height 0x68615 (427541))

On a freshly synced node:

└────╼ curl 'http://localhost:56991' -d '{"id":1,"jsonrpc":"2.0","method":"chain_getBlockHash","params":[427541]'} -H 'Content-Type: application/json'
{"jsonrpc":"2.0","result":"0x3c376028c4f9b1b3f3fad730e8fd0a2965869724cacd0317686f72a7dccc56cd","id":1}

└────╼ curl 'http://localhost:56991' -d '{"id":1,"jsonrpc":"2.0","method":"chain_getBlock","params":["0x3c376028c4f9b1b3f3fad730e8fd0a2965869724cacd0317686f72a7dccc56cd"]'} -H 'Content-Type: application/json'
{"jsonrpc":"2.0","result":{"block":{"extrinsics":[...],"header":{"number":"0x68615","digest":{"logs":[...]},"extrinsicsRoot":"0x28d2a3193513c3f8f23e9ec4bb98e4ffaec3a1121793eebd54e706c1d17123b6","parentHash":"0x4d6419ffa9bc491edfaaff046e6444ae6b91a075acb3e6cf1bbbfd0133f56ada","stateRoot":"0xe79958aae9ceaf0f94a28c7e597e42f811914299cbbca5ce5ec1e71ab10fc6ef"}},"justifications":null},"id":1} 

└────╼ curl 'http://localhost:56991' -d '{"id":1,"jsonrpc":"2.0","method":"chain_getBlock","params":["0x3c376028c4f9b1b3f3fad730e8fd0a2965869724cacd0317686f72a7dccc56cd"]'} -H 'Content-Type: application/json'
{"jsonrpc":"2.0","result":{"block":{"extrinsics":[...],"header":{"number":"0x68615","digest":{"logs":[...]},"extrinsicsRoot":"0x28d2a3193513c3f8f23e9ec4bb98e4ffaec3a1121793eebd54e706c1d17123b6","parentHash":"0x4d6419ffa9bc491edfaaff046e6444ae6b91a075acb3e6cf1bbbfd0133f56ada","stateRoot":"0xe79958aae9ceaf0f94a28c7e597e42f811914299cbbca5ce5ec1e71ab10fc6ef"}},"justifications":null},"id":1} 

Context

Events

This happens when querying through the RPC, the getBlockHash for a given height returns a different hash than expected.

From the logs, we can see that multiple blocks were imported for the given height, one of them being the one returned by the node, but not matching the one chosen to be on chain.

Logs filtering the block 427541 for one of the faulty node

``   | Aug 30, 2021 @ 08:36:38.794 | rpcnode1 | 2021-08-30 12:36:38 [🌗] 💤 Idle (47 peers), best: #427543 (0x6e31…b287), finalized #427541 (0x3c37…56cd), ⬇ 31.5kiB/s ⬆ 20.8kiB/s   | Aug 30, 2021 @ 08:36:18.793 | rpcnode1 | 2021-08-30 12:36:18 [🌗] 💤 Idle (47 peers), best: #427541 (0x3c37…56cd), finalized #427540 (0x4d64…6ada), ⬇ 1.7kiB/s ⬆ 2.5kiB/s   | Aug 30, 2021 @ 08:36:13.792 | rpcnode1 | 2021-08-30 12:36:13 [🌗] 💤 Idle (47 peers), best: #427541 (0x3c37…56cd), finalized #427540 (0x4d64…6ada), ⬇ 3.6kiB/s ⬆ 8.2kiB/s   | Aug 30, 2021 @ 08:36:08.792 | rpcnode1 | 2021-08-30 12:36:08 [🌗] 💤 Idle (48 peers), best: #427541 (0x3c37…56cd), finalized #427539 (0x82b5…8b93), ⬇ 37.5kiB/s ⬆ 255.2kiB/s   | Aug 30, 2021 @ 08:35:55.392 | rpcnode1 | 2021-08-30 12:35:55 [🌗] ✨ Imported #427541 (0x3c37…56cd)   | Aug 30, 2021 @ 08:35:55.307 | rpcnode1 | 2021-08-30 12:35:55 [🌗] ✨ Imported #427541 (0x5c20…33e5)   | Aug 30, 2021 @ 08:35:55.223 | rpcnode1 | 2021-08-30 12:35:55 [🌗] ✨ Imported #427541 (0x2fee…7a48)   | Aug 30, 2021 @ 08:35:54.545 | rpcnode1 | 2021-08-30 12:35:54 [Relaychain] ✨ Imported #9006619 (0x4c52…1547) | Aug 30, 2021 @ 08:35:54.489 | rpcnode1 | 2021-08-30 12:35:54 [Relaychain] ✨ Imported #9006619 (0x46c7…408c)   | Aug 30, 2021 @ 08:35:54.455 | rpcnode1 | 2021-08-30 12:35:54 [Relaychain] ✨ Imported #9006619 (0xc4e8…ed93)

bkchr commented 3 years ago

@arkpar any idea? Sounds like a broken db?

crystalin commented 3 years ago

@bkchr As the blocks (the canonical and non-canonical ones) are accessible, I'd lead more into the "best" selection system in cumulus more than a db failure no ?

arkpar commented 3 years ago

This does look like a database issue indeed. On reorg the mappings for enacted blocks should be updated here. Cumulus still sets the best block by re-importing the header, right? We should add more logging to reorg handling.

crystalin commented 3 years ago

@arkpar , is there anything we can do to help here? It is very scary to have the node returning the wrong blocks :/

JoshOrndorff commented 3 years ago

Cumulus still sets the best block by re-importing the header, right?

Is that true even when best changes because of a reorg @bkchr?

arkpar commented 3 years ago

@crystalin Full logs around 427541 and a few following blocks would be nice. There should be a record of a re-org there.

notlesh commented 3 years ago

Here are logs from my own node which exhibited the same problem:

2021-08-30 06:35:37 [🌗] 💤 Idle (40 peers), best: #427538 (0x4774…e1b5), finalized #427536 (0x1368…3639), ⬇ 27.9kiB/s ⬆ 13.8kiB/s
2021-08-30 06:35:37 [Relaychain] ✨ Imported #9006616 (0x0c81…b914)
2021-08-30 06:35:39 [Relaychain] 🔍 Discovered new external address for our node: /ip4/10.110.16.29/tcp/30334/ws/p2p/12D3KooWQNrFAsFiia3wWXyCHKBFjfbBBfany5AVqypXttQFVQee
2021-08-30 06:35:39 [Relaychain] 💤 Idle (25 peers), best: #9006616 (0x0c81…b914), finalized #9006612 (0xe9bb…5aae), ⬇ 262.1kiB/s ⬆ 159.5kiB/s
2021-08-30 06:35:42 [🌗] 💤 Idle (39 peers), best: #427538 (0x4774…e1b5), finalized #427537 (0x60f3…8d54), ⬇ 2.2kiB/s ⬆ 8.2kiB/s
2021-08-30 06:35:42 [Relaychain] ✨ Imported #9006617 (0x9c68…5383)
2021-08-30 06:35:43 [🌗] ✨ Imported #427540 (0x4d64…6ada)
2021-08-30 06:35:44 [Relaychain] 💤 Idle (26 peers), best: #9006617 (0x9c68…5383), finalized #9006613 (0x9d0b…7508), ⬇ 284.5kiB/s ⬆ 290.9kiB/s
2021-08-30 06:35:47 [🌗] 💤 Idle (39 peers), best: #427539 (0x82b5…8b93), finalized #427537 (0x60f3…8d54), ⬇ 15.5kiB/s ⬆ 7.0kiB/s
2021-08-30 06:35:48 [Relaychain] ✨ Imported #9006618 (0xa415…dc2f)
2021-08-30 06:35:49 [Relaychain] 💤 Idle (26 peers), best: #9006618 (0xa415…dc2f), finalized #9006615 (0x393d…1af8), ⬇ 216.4kiB/s ⬆ 114.5kiB/s
2021-08-30 06:35:52 [🌗] 💤 Idle (39 peers), best: #427539 (0x82b5…8b93), finalized #427538 (0x4774…e1b5), ⬇ 1.2kiB/s ⬆ 2.0kiB/s
2021-08-30 06:35:54 [Relaychain] 💤 Idle (27 peers), best: #9006618 (0xa415…dc2f), finalized #9006615 (0x393d…1af8), ⬇ 168.1kiB/s ⬆ 88.4kiB/s
2021-08-30 06:35:54 [Relaychain] ✨ Imported #9006619 (0xc4e8…ed93)
2021-08-30 06:35:55 [Relaychain] ✨ Imported #9006619 (0x4c52…1547)
2021-08-30 06:35:55 [Relaychain] ✨ Imported #9006619 (0x46c7…408c)
2021-08-30 06:35:56 [🌗] ✨ Imported #427541 (0x5c20…33e5)
2021-08-30 06:35:56 [🌗] ✨ Imported #427541 (0x3c37…56cd)
2021-08-30 06:35:56 [🌗] ✨ Imported #427541 (0x2fee…7a48)
2021-08-30 06:35:57 [🌗] 💤 Idle (39 peers), best: #427540 (0x4d64…6ada), finalized #427538 (0x4774…e1b5), ⬇ 38.0kiB/s ⬆ 4.9kiB/s
2021-08-30 06:35:59 [Relaychain] 💤 Idle (28 peers), best: #9006619 (0xc4e8…ed93), finalized #9006615 (0x393d…1af8), ⬇ 183.7kiB/s ⬆ 193.9kiB/s
2021-08-30 06:36:01 [Relaychain] ✨ Imported #9006620 (0x495b…150a)
2021-08-30 06:36:01 [Relaychain] ✨ Imported #9006620 (0x4703…30a5)
2021-08-30 06:36:01 [Relaychain] ✨ Imported #9006620 (0xa16a…1af0)
2021-08-30 06:36:02 [🌗] 💤 Idle (39 peers), best: #427540 (0x4d64…6ada), finalized #427538 (0x4774…e1b5), ⬇ 10.0kiB/s ⬆ 7.4kiB/s
2021-08-30 06:36:04 [Relaychain] 💤 Idle (28 peers), best: #9006620 (0x495b…150a), finalized #9006616 (0x0c81…b914), ⬇ 206.3kiB/s ⬆ 104.8kiB/s
2021-08-30 06:36:07 [🌗] 💤 Idle (39 peers), best: #427540 (0x4d64…6ada), finalized #427538 (0x4774…e1b5), ⬇ 9.0kiB/s ⬆ 2.2kiB/s
2021-08-30 06:36:07 [Relaychain] ♻️  Reorg on #9006620,0x495b…150a to #9006621,0xefcf…09da, common ancestor #9006618,0xa415…dc2f
2021-08-30 06:36:07 [Relaychain] ✨ Imported #9006621 (0xefcf…09da)
2021-08-30 06:36:07 [🌗] ✨ Imported #427542 (0x9caa…b046)
2021-08-30 06:36:08 [Relaychain] ♻️  Reorg on #9006621,0xefcf…09da to #9006621,0x5fe5…b9b7, common ancestor #9006618,0xa415…dc2f
2021-08-30 06:36:08 [Relaychain] ✨ Imported #9006621 (0x5fe5…b9b7)
2021-08-30 06:36:08 [🌗] ♻️  Reorg on #427541,0x3c37…56cd to #427541,0x5c20…33e5, common ancestor #427540,0x4d64…6ada
2021-08-30 06:36:08 [🌗] ✨ Imported #427542 (0x64cf…19be)
2021-08-30 06:36:09 [Relaychain] 💤 Idle (28 peers), best: #9006621 (0x5fe5…b9b7), finalized #9006617 (0x9c68…5383), ⬇ 197.9kiB/s ⬆ 151.5kiB/s
2021-08-30 06:36:12 [🌗] 💤 Idle (39 peers), best: #427541 (0x3c37…56cd), finalized #427539 (0x82b5…8b93), ⬇ 36.6kiB/s ⬆ 5.1kiB/s
2021-08-30 06:36:13 [Relaychain] ✨ Imported #9006622 (0x21d7…e28f)
2021-08-30 06:36:14 [Relaychain] 💤 Idle (28 peers), best: #9006622 (0x21d7…e28f), finalized #9006618 (0xa415…dc2f), ⬇ 147.2kiB/s ⬆ 141.4kiB/s
2021-08-30 06:36:17 [🌗] 💤 Idle (39 peers), best: #427541 (0x3c37…56cd), finalized #427539 (0x82b5…8b93), ⬇ 0.9kiB/s ⬆ 1.0kiB/s
2021-08-30 06:36:19 [Relaychain] ✨ Imported #9006622 (0x3ec1…4471)
2021-08-30 06:36:19 [Relaychain] 💤 Idle (26 peers), best: #9006622 (0x21d7…e28f), finalized #9006618 (0xa415…dc2f), ⬇ 170.6kiB/s ⬆ 33.9kiB/s
2021-08-30 06:36:22 [🌗] 💤 Idle (39 peers), best: #427541 (0x3c37…56cd), finalized #427540 (0x4d64…6ada), ⬇ 1.1kiB/s ⬆ 2.4kiB/s
2021-08-30 06:36:24 [Relaychain] 💤 Idle (27 peers), best: #9006619 (0x4c52…1547), finalized #9006619 (0x4c52…1547), ⬇ 142.2kiB/s ⬆ 124.9kiB/s
2021-08-30 06:36:26 [Relaychain] ♻️  Reorg on #9006622,0x21d7…e28f to #9006623,0xccec…7077, common ancestor #9006618,0xa415…dc2f
2021-08-30 06:36:26 [Relaychain] ✨ Imported #9006623 (0xccec…7077)
2021-08-30 06:36:26 [🌗] ♻️  Reorg on #427541,0x5c20…33e5 to #427542,0x9caa…b046, common ancestor #427540,0x4d64…6ada
2021-08-30 06:36:26 [🌗] ✨ Imported #427543 (0x6e31…b287)
2021-08-30 06:36:26 [Relaychain] 🔍 Discovered new external address for our node: /ip4/100.125.160.64/tcp/30334/ws/p2p/12D3KooWQNrFAsFiia3wWXyCHKBFjfbBBfany5AVqypXttQFVQee
2021-08-30 06:36:27 [Relaychain] 🔍 Discovered new external address for our node: /ip4/10.2.97.21/tcp/30334/ws/p2p/12D3KooWQNrFAsFiia3wWXyCHKBFjfbBBfany5AVqypXttQFVQee
2021-08-30 06:36:27 [🌗] 💤 Idle (39 peers), best: #427542 (0x9caa…b046), finalized #427540 (0x4d64…6ada), ⬇ 19.0kiB/s ⬆ 12.3kiB/s
2021-08-30 06:36:28 [Relaychain] 🔍 Discovered new external address for our node: /ip4/10.0.1.30/tcp/30334/ws/p2p/12D3KooWQNrFAsFiia3wWXyCHKBFjfbBBfany5AVqypXttQFVQee
2021-08-30 06:36:29 [Relaychain] 🔍 Discovered new external address for our node: /ip4/10.110.16.6/tcp/30334/ws/p2p/12D3KooWQNrFAsFiia3wWXyCHKBFjfbBBfany5AVqypXttQFVQee
2021-08-30 06:36:29 [Relaychain] 🔍 Discovered new external address for our node: /ip4/100.112.129.192/tcp/30334/ws/p2p/12D3KooWQNrFAsFiia3wWXyCHKBFjfbBBfany5AVqypXttQFVQee
2021-08-30 06:36:29 [Relaychain] 💤 Idle (28 peers), best: #9006623 (0xccec…7077), finalized #9006620 (0x4703…30a5), ⬇ 404.7kiB/s ⬆ 158.0kiB/s
2021-08-30 06:36:30 [Relaychain] ✨ Imported #9006624 (0x361c…3c7c)
2021-08-30 06:36:32 [🌗] 💤 Idle (39 peers), best: #427542 (0x9caa…b046), finalized #427540 (0x4d64…6ada), ⬇ 81.5kiB/s ⬆ 26.0kiB/s
2021-08-30 06:36:34 [Relaychain] 💤 Idle (28 peers), best: #9006624 (0x361c…3c7c), finalized #9006620 (0x4703…30a5), ⬇ 501.6kiB/s ⬆ 264.8kiB/s
2021-08-30 06:36:36 [Relaychain] ✨ Imported #9006625 (0x563a…41e7)
2021-08-30 06:36:37 [🌗] 💤 Idle (40 peers), best: #427543 (0x6e31…b287), finalized #427541 (0x3c37…56cd), ⬇ 26.6kiB/s ⬆ 390.7kiB/s

In addition, my node responds to getBlockHash like so:

427540: 0x4d6419ffa9bc491edfaaff046e6444ae6b91a075acb3e6cf1bbbfd0133f56ada
427541: 0x5c2058a7083edfd6037cb866f1e2fd73fd903f62986e94d57aa2fb169d5233e5
427542: 0x9caa779d514cac6d60e51797f88f05a2d5cb839bb6695f039e0810cf21c7b046
427543: 0x6e317f7f8ba00b68d6fec45c9f711fd5c2b6b703dfda43e11b21e71fdb7ab287

Of these, we believe only 427541 is incorrect.

I can query block 0x5c2058a7083edfd6037cb866f1e2fd73fd903f62986e94d57aa2fb169d5233e5 and get a block, querying 0x9caa779d514cac6d60e51797f88f05a2d5cb839bb6695f039e0810cf21c7b046 results in the following error:

RPC-CORE: getBlock(hash?: BlockHash): SignedBlock:: -32603: Unknown error occurred: Client, Application, NotInFinalizedChain
notlesh commented 3 years ago

I have also made a backup of my node's data in case anyone wants to see it.

xlc commented 3 years ago

We have slightly different issue. We don't see different block hash between different nodes, but having finalize blocks got discarded. But I suspect the root cause is similar, relates to handling of reorg or relaychain and finality handling.

Additional logs here https://github.com/AcalaNetwork/Acala/issues/1382#issuecomment-909741211

You can see we both had a relaychian reorg at #9006621 and causes issue on parachain.

JoshOrndorff commented 3 years ago

I made a diagram of the kusama and moonriver blocks around this point in the chain, and showed the re-orgs that @notlesh 's node reports. There are three re-orgs on the relay chain and two on the parachain. (I would be curious to see how it looks on Karura too. @xlc could you provide some polkadot js links to karura blocks around that time especially if there were forks on Karura?)

Here's the original google drawing. Each block is linked to polkadot-js. Feel free to play around with it.

Incorrect block hash

Some of the Kusama blocks at the bottom, I only know about from the posted node logs. I can't find them or their full hash. Does anyone know any more information about these Kusama blocks?

#9,006,620: 0x495b…150a
#9,006,621: 0x5fe5…b9b7
#9,006,622: 0x21d7…e28f
crystalin commented 3 years ago

I've done a scan of multiple nodes on Moonriver, and only block 427541 reports the wrong block hash, and 427542 returns an error.

xlc commented 3 years ago

https://polkadot.js.org/apps/?rpc=wss%3A%2F%2Fkarura-rpc-1.aca-api.network#/explorer/query/438151 this is the one we having issue

Some relevant logs

2021-08-30 12:36:19.947  INFO tokio-runtime-worker substrate: [Parachain] 💤 Idle (55 peers), best: #438150 (0x483f…3334), finalized #438149 (0x75e3…07b5), ⬇ 1.6kiB/s ⬆ 2.5kiB/s
2021-08-30 12:36:19.875  INFO tokio-runtime-worker substrate: [Relaychain] ✨ Imported #9006622 (0x3ec1…4471)
2021-08-30 12:36:19.875  INFO tokio-runtime-worker sc_informant: [Relaychain] ♻️  Reorg on #9006621,0x5fe5…b9b7 to #9006622,0x3ec1…4471, common ancestor #9006618,0xa415…dc2f
2021-08-30 12:36:18.855  WARN tokio-runtime-worker sync: [Relaychain] 💔 Error importing block 0x21d772b96ee22c51a7d2b54ab294ae6ad11f6ac4c22fca1c7068ffc79b1ce28f: Err(Other(ClientImport(""Import failed: Potential long-range attack: block not in finalized chain."")))
NotInFinalizedChain
2021-08-30 12:36:18.854  WARN tokio-runtime-worker sc_service::client::client: [Relaychain] Block import error:
2021-08-30 12:36:18.854  WARN tokio-runtime-worker sc_client_db: [Relaychain] Potential safety failure: reverting finalized block (9006619, 0x4c52a63e37833c9ef115a8e4245abf94b4b8922b9b516464ec3032201c781547)
2021-08-30 12:36:18.507  INFO tokio-runtime-worker substrate: [Relaychain] 💤 Idle (4 peers), best: #9006619 (0x4c52…1547), finalized #9006619 (0x4c52…1547), ⬇ 315.5kiB/s ⬆ 168.6kiB/s
2021-08-30 12:36:18.068  WARN tokio-runtime-worker sync: [Relaychain] 💔 Error importing block 0x21d772b96ee22c51a7d2b54ab294ae6ad11f6ac4c22fca1c7068ffc79b1ce28f: Err(Other(ClientImport(""Import failed: Potential long-range attack: block not in finalized chain."")))
NotInFinalizedChain
2021-08-30 12:36:18.068  WARN tokio-runtime-worker sc_service::client::client: [Relaychain] Block import error:
2021-08-30 12:36:18.068  WARN tokio-runtime-worker sc_client_db: [Relaychain] Potential safety failure: reverting finalized block (9006619, 0x4c52a63e37833c9ef115a8e4245abf94b4b8922b9b516464ec3032201c781547)
2021-08-30 12:36:17.385  WARN tokio-runtime-worker sync: [Relaychain] 💔 Error importing block 0x21d772b96ee22c51a7d2b54ab294ae6ad11f6ac4c22fca1c7068ffc79b1ce28f: Err(Other(ClientImport(""Import failed: Potential long-range attack: block not in finalized chain."")))
NotInFinalizedChain
2021-08-30 12:36:17.385  WARN tokio-runtime-worker sc_service::client::client: [Relaychain] Block import error:
2021-08-30 12:36:17.385  WARN tokio-runtime-worker sc_client_db: [Relaychain] Potential safety failure: reverting finalized block (9006619, 0x4c52a63e37833c9ef115a8e4245abf94b4b8922b9b516464ec3032201c781547)
2021-08-30 12:36:16.978  WARN tokio-runtime-worker sync: [Relaychain] 💔 Error importing block 0x21d772b96ee22c51a7d2b54ab294ae6ad11f6ac4c22fca1c7068ffc79b1ce28f: Err(Other(ClientImport(""Import failed: Potential long-range attack: block not in finalized chain."")))
NotInFinalizedChain
2021-08-30 12:36:16.978  WARN tokio-runtime-worker sc_service::client::client: [Relaychain] Block import error:
2021-08-30 12:36:16.977  WARN tokio-runtime-worker sc_client_db: [Relaychain] Potential safety failure: reverting finalized block (9006619, 0x4c52a63e37833c9ef115a8e4245abf94b4b8922b9b516464ec3032201c781547)
2021-08-30 12:36:16.464  WARN tokio-runtime-worker sync: [Relaychain] 💔 Error importing block 0x21d772b96ee22c51a7d2b54ab294ae6ad11f6ac4c22fca1c7068ffc79b1ce28f: Err(Other(ClientImport(""Import failed: Potential long-range attack: block not in finalized chain."")))
NotInFinalizedChain
2021-08-30 12:36:16.464  WARN tokio-runtime-worker sc_service::client::client: [Relaychain] Block import error:
2021-08-30 12:36:16.463  WARN tokio-runtime-worker sc_client_db: [Relaychain] Potential safety failure: reverting finalized block (9006619, 0x4c52a63e37833c9ef115a8e4245abf94b4b8922b9b516464ec3032201c781547)
2021-08-30 12:36:15.777  INFO tokio-runtime-worker sub-libp2p: [Relaychain] 🔍 Discovered new external address for our node: /ip4/10.0.1.30/tcp/30336/ws/p2p/12D3KooWQdVseJeYZSxMKFsjiFWmaftcT4aZQQaqBpot5Rxh9qiH
2021-08-30 12:36:15.777  INFO tokio-runtime-worker sub-libp2p: [Relaychain] 🔍 Discovered new external address for our node: /ip4/10.0.1.30/tcp/30335/p2p/12D3KooWQdVseJeYZSxMKFsjiFWmaftcT4aZQQaqBpot5Rxh9qiH
2021-08-30 12:36:15.444  WARN tokio-runtime-worker sync: [Relaychain] 💔 Error importing block 0x21d772b96ee22c51a7d2b54ab294ae6ad11f6ac4c22fca1c7068ffc79b1ce28f: Err(Other(ClientImport(""Import failed: Potential long-range attack: block not in finalized chain."")))
NotInFinalizedChain
2021-08-30 12:36:15.444  WARN tokio-runtime-worker sc_service::client::client: [Relaychain] Block import error:
2021-08-30 12:36:15.444  WARN tokio-runtime-worker sc_client_db: [Relaychain] Potential safety failure: reverting finalized block (9006619, 0x4c52a63e37833c9ef115a8e4245abf94b4b8922b9b516464ec3032201c781547)
2021-08-30 12:36:15.031  WARN tokio-runtime-worker sync: [Relaychain] 💔 Error importing block 0x21d772b96ee22c51a7d2b54ab294ae6ad11f6ac4c22fca1c7068ffc79b1ce28f: Err(Other(ClientImport(""Import failed: Potential long-range attack: block not in finalized chain."")))
NotInFinalizedChain
2021-08-30 12:36:15.031  WARN tokio-runtime-worker sc_service::client::client: [Relaychain] Block import error:
2021-08-30 12:36:15.030  WARN tokio-runtime-worker sc_client_db: [Relaychain] Potential safety failure: reverting finalized block (9006619, 0x4c52a63e37833c9ef115a8e4245abf94b4b8922b9b516464ec3032201c781547)
2021-08-30 12:36:14.947  INFO tokio-runtime-worker substrate: [Parachain] 💤 Idle (55 peers), best: #438150 (0x483f…3334), finalized #438149 (0x75e3…07b5), ⬇ 2.3kiB/s ⬆ 5.0kiB/s
2021-08-30 12:36:14.840  WARN tokio-runtime-worker sync: [Relaychain] 💔 Error importing block 0x21d772b96ee22c51a7d2b54ab294ae6ad11f6ac4c22fca1c7068ffc79b1ce28f: Err(Other(ClientImport(""Import failed: Potential long-range attack: block not in finalized chain."")))
NotInFinalizedChain
2021-08-30 12:36:14.840  WARN tokio-runtime-worker sc_service::client::client: [Relaychain] Block import error:
2021-08-30 12:36:14.839  WARN tokio-runtime-worker sc_client_db: [Relaychain] Potential safety failure: reverting finalized block (9006619, 0x4c52a63e37833c9ef115a8e4245abf94b4b8922b9b516464ec3032201c781547)
2021-08-30 12:36:13.696  WARN tokio-runtime-worker sync: [Relaychain] 💔 Error importing block 0x21d772b96ee22c51a7d2b54ab294ae6ad11f6ac4c22fca1c7068ffc79b1ce28f: Err(Other(ClientImport(""Import failed: Potential long-range attack: block not in finalized chain."")))
NotInFinalizedChain
2021-08-30 12:36:13.695  WARN tokio-runtime-worker sc_service::client::client: [Relaychain] Block import error:
2021-08-30 12:36:13.695  WARN tokio-runtime-worker sc_client_db: [Relaychain] Potential safety failure: reverting finalized block (9006619, 0x4c52a63e37833c9ef115a8e4245abf94b4b8922b9b516464ec3032201c781547)
2021-08-30 12:36:13.505  INFO tokio-runtime-worker substrate: [Relaychain] 💤 Idle (41 peers), best: #9006619 (0x4c52…1547), finalized #9006619 (0x4c52…1547), ⬇ 714.3kiB/s ⬆ 837.4kiB/s
2021-08-30 12:36:13.441  WARN tokio-runtime-worker sync: [Relaychain] 💔 Error importing block 0x21d772b96ee22c51a7d2b54ab294ae6ad11f6ac4c22fca1c7068ffc79b1ce28f: Err(Other(ClientImport(""Import failed: Potential long-range attack: block not in finalized chain."")))
NotInFinalizedChain
2021-08-30 12:36:13.441  WARN tokio-runtime-worker sc_service::client::client: [Relaychain] Block import error:
2021-08-30 12:36:13.440  WARN tokio-runtime-worker sc_client_db: [Relaychain] Potential safety failure: reverting finalized block (9006619, 0x4c52a63e37833c9ef115a8e4245abf94b4b8922b9b516464ec3032201c781547)
2021-08-30 12:36:13.057  WARN tokio-runtime-worker sync: [Relaychain] 💔 Error importing block 0x21d772b96ee22c51a7d2b54ab294ae6ad11f6ac4c22fca1c7068ffc79b1ce28f: Err(Other(ClientImport(""Import failed: Potential long-range attack: block not in finalized chain."")))
NotInFinalizedChain
2021-08-30 12:36:13.054  WARN tokio-runtime-worker sc_service::client::client: [Relaychain] Block import error:
2021-08-30 12:36:13.053  WARN tokio-runtime-worker sc_client_db: [Relaychain] Potential safety failure: reverting finalized block (9006619, 0x4c52a63e37833c9ef115a8e4245abf94b4b8922b9b516464ec3032201c781547)
2021-08-30 12:36:09.946  INFO tokio-runtime-worker substrate: [Parachain] 💤 Idle (55 peers), best: #438150 (0x483f…3334), finalized #438148 (0xe50f…2eb5), ⬇ 32.7kiB/s ⬆ 11.9kiB/s
2021-08-30 12:36:08.503  INFO tokio-runtime-worker substrate: [Relaychain] 💤 Idle (45 peers), best: #9006621 (0x5fe5…b9b7), finalized #9006617 (0x9c68…5383), ⬇ 344.9kiB/s ⬆ 532.7kiB/s
2021-08-30 12:36:07.214  INFO tokio-runtime-worker substrate: [Parachain] ✨ Imported #438151 (0x6e68…37f6)
2021-08-30 12:36:07.205  INFO tokio-runtime-worker substrate: [Parachain] ✨ Imported #438151 (0xcb7f…ba9c)
2021-08-30 12:36:07.122  INFO tokio-runtime-worker sc_informant: [Parachain] ♻️  Reorg on #438150,0x483f…3334 to #438150,0x9832…9fc8, common ancestor #438149,0x75e3…07b5
2021-08-30 12:36:07.122  INFO tokio-runtime-worker substrate: [Relaychain] ✨ Imported #9006621 (0x5fe5…b9b7)
2021-08-30 12:36:07.121  INFO tokio-runtime-worker sc_informant: [Relaychain] ♻️  Reorg on #9006621,0xefcf…09da to #9006621,0x5fe5…b9b7, common ancestor #9006618,0xa415…dc2f
2021-08-30 12:36:07.023  INFO tokio-runtime-worker substrate: [Relaychain] ✨ Imported #9006621 (0xefcf…09da)
2021-08-30 12:36:04.946  INFO tokio-runtime-worker substrate: [Parachain] 💤 Idle (55 peers), best: #438149 (0x75e3…07b5), finalized #438148 (0xe50f…2eb5), ⬇ 1.7kiB/s ⬆ 3.6kiB/s
2021-08-30 12:36:03.500  INFO tokio-runtime-worker substrate: [Relaychain] 💤 Idle (45 peers), best: #9006620 (0x4703…30a5), finalized #9006617 (0x9c68…5383), ⬇ 589.2kiB/s ⬆ 587.2kiB/s
2021-08-30 12:36:00.883  INFO tokio-runtime-worker substrate: [Relaychain] ✨ Imported #9006620 (0x495b…150a)
2021-08-30 12:36:00.810  INFO tokio-runtime-worker substrate: [Relaychain] ✨ Imported #9006620 (0x4703…30a5)
2021-08-30 12:36:00.810  INFO tokio-runtime-worker sc_informant: [Relaychain] ♻️  Reorg on #9006620,0xa16a…1af0 to #9006620,0x4703…30a5, common ancestor #9006618,0xa415…dc2f
2021-08-30 12:36:00.530  INFO tokio-runtime-worker substrate: [Relaychain] ✨ Imported #9006620 (0xa16a…1af0)
2021-08-30 12:35:59.946  INFO tokio-runtime-worker substrate: [Parachain] 💤 Idle (55 peers), best: #438149 (0x75e3…07b5), finalized #438147 (0x09e5…c3af), ⬇ 46.4kiB/s ⬆ 17.2kiB/s
crystalin commented 3 years ago

Extra information. Node (in that specific case collator) running with state-db-cache 4 (pretty much disabling state db cache) also had the bad block hash returned when querying it

notlesh commented 3 years ago

I ran an experiment this morning where I spun up a fresh node and allowed it to sync with only my bugged node (no other peers). The results are a bit surprising.

When I look up block 0x3c376028c4f9b1b3f3fad730e8fd0a2965869724cacd0317686f72a7dccc56cd on the two nodes, I get different results. I can actually trace this down to a single nibble being different, as this screenshot depicts (notice the b being different from the e after the highlighted d0d3):

image

EDIT: Wow, I must have stumbled across a copy/paste bug (https://github.com/neovim/neovim/issues/13431). I've actually replicated this several times, each time it ends up dropping the b mentioned previously. Upon inspection of the original data in the browser, there is no discrepancy. Sorry to mislead with this info. The following info is still relevant.

Other differences after this sync:

  1. Height 427541 shows:

Maybe relevant: my node fully synced the parachain, but hasn't caught up on the relaychain yet. I intend to let it finish syncing the relay chain and see if the results differ, but that will take some time.

notlesh commented 3 years ago

I wanted to investigate the raw database files on my disk, but I'm not sure how I can do so. I tried sst_dump, but it expects a different version:

sst_dump --file=polkadot/chains/ksmcc3/db/1936792.sst --command=scan
from [] to []
Process polkadot/chains/ksmcc3/db/1936792.sst
Sst file format: block-based
polkadot/chains/ksmcc3/db/1936792.sst: Corruption: Unknown Footer version. Maybe this file was created with newer version of RocksDB?

Is there some other tool I could use?

arkpar commented 3 years ago

This issue has been fixed in substrate for a while. Specifically in https://github.com/paritytech/substrate/pull/9403

xlc commented 3 years ago

Why #9403 is labeled as B0-silent? This is something should be back ported and have parachain nodes updated. Also how can we repair the bad nodes? Do we have to resync?

arkpar commented 3 years ago

The fix was done for cumulus specifically so I guess it should have been mentioned in the cumulus release notes.

Also how can we repair the bad nodes? Do we have to resync?

Resync or copy the database from a recently synced node.

xlc commented 3 years ago

There are no Cumulus release notes. At least not anything that I am aware of.