ontio / ontology

Official Go implementation of the Ontology protocol. https://dev-docs.ont.io/#/
GNU Lesser General Public License v3.0
829 stars 290 forks source link

Ontology Testnet not syncing past block 16613129 #1410

Open juliajang opened 2 years ago

juliajang commented 2 years ago

What version of ontology are you using (ontology --version)?

v2.3.7

Does this issue reproduce with the latest release?

Yes

What operating system are you using (such as macOS, Linux and Windows)?

Linux

What did you do?

Started up a testnet validator that was syncing but stopped at block 16613129.

What did you expect to see?

Expected to the validator continue syncing to the top of the chain (current at around 16.8 million)

What did you see instead?

Error (level: WARN) message: saveBlock Height:16613130 AddBlock error:saveBlock error state merkle root mismatch. expected: d20081eb949e3d9e5c57347c0a5b3e083173821032519d215dfa57ed1a61a283, got: bb847547558081ae2929f9f376cd9d4bd20ba652f903950a768ee44138dc8ba7

Latest logs:

2022/06/13 19:06:09.286355 [INFO ] GID 2600, [block-sync] delete node: 9eddec6d3b4a0214cf453d10075fe9beffc98d00
2022/06/13 19:06:09.286369 [WARN ] GID 2600, [block-sync] saveBlock Height:16613130 AddBlock error:saveBlock error state merkle root mismatch. expected: d20081eb949e3d9e5c57347c0a5b3e083173821032519d215dfa57ed1a61a283, got: bb847547558081ae2929f9f376cd9d4bd20ba652f903950a768ee44138dc8ba7
2022/06/13 19:06:10.475977 [INFO ] GID 204, server 4294967295 statemgr update, current state: 1, from peer: 9, peercnt: 7, v1: 3524, v2: 3499
2022/06/13 19:06:11.430156 [INFO ] GID 411, [p2p]error read from 54.234.1.252:20338 :EOF
2022/06/13 19:06:11.430213 [INFO ] GID 411, remove peer 000000000000000000000000eb3bcf45d8f11514 from net server
2022/06/13 19:06:11.430251 [INFO ] GID 411, closing connection: peer 000000000000000000000000eb3bcf45d8f11514, address: 54.234.1.252:20338
2022/06/13 19:06:11.430332 [INFO ] GID 2633, [block-sync] delete node: 000000000000000000000000eb3bcf45d8f11514
2022/06/13 19:06:11.430436 [INFO ] GID 2632, inbound peer 54.234.1.252:46922 connected, id=000000000000000000000000eb3bcf45d8f11514, version=v1.9.0-0-g1d4e4f57
2022/06/13 19:06:11.894141 [INFO ] GID 204, server 4294967295 statemgr update, current state: 1, from peer: 7, peercnt: 7, v1: 3524, v2: 3499
2022/06/13 19:06:12.349278 [INFO ] GID 298, CurrentBlockHeight = 16613129
2022/06/13 19:06:17.835968 [INFO ] GID 204, server 4294967295 statemgr update, current state: 1, from peer: 3, peercnt: 7, v1: 3524, v2: 3499
2022/06/13 19:06:18.343362 [INFO ] GID 204, server 4294967295 statemgr update, current state: 1, from peer: 5, peercnt: 7, v1: 3524, v2: 3499
2022/06/13 19:06:18.349275 [INFO ] GID 298, CurrentBlockHeight = 16613129
2022/06/13 19:06:19.835990 [INFO ] GID 204, server 4294967295 statemgr update, current state: 1, from peer: 1, peercnt: 7, v1: 3524, v2: 3499
2022/06/13 19:06:21.555548 [INFO ] GID 204, server 4294967295 statemgr update, current state: 1, from peer: 6, peercnt: 7, v1: 3524, v2: 3499
2022/06/13 19:06:21.585138 [INFO ] GID 204, server 4294967295 statemgr update, current state: 1, from peer: 4, peercnt: 7, v1: 3524, v2: 3499
2022/06/13 19:06:24.349246 [INFO ] GID 298, CurrentBlockHeight = 16613129
2022/06/13 19:06:30.349290 [INFO ] GID 298, CurrentBlockHeight = 16613129
2022/06/13 19:06:30.476282 [INFO ] GID 204, server 4294967295 statemgr update, current state: 1, from peer: 9, peercnt: 7, v1: 3524, v2: 3499
2022/06/13 19:06:31.894060 [INFO ] GID 204, server 4294967295 statemgr update, current state: 1, from peer: 7, peercnt: 7, v1: 3524, v2: 3499
2022/06/13 19:06:36.349284 [INFO ] GID 298, CurrentBlockHeight = 16613129
2022/06/13 19:06:37.836099 [INFO ] GID 204, server 4294967295 statemgr update, current state: 1, from peer: 3, peercnt: 7, v1: 3524, v2: 3499
2022/06/13 19:06:38.343475 [INFO ] GID 204, server 4294967295 statemgr update, current state: 1, from peer: 5, peercnt: 7, v1: 3524, v2: 3499
2022/06/13 19:06:39.836049 [INFO ] GID 204, server 4294967295 statemgr update, current state: 1, from peer: 1, peercnt: 7, v1: 3524, v2: 3499
2022/06/13 19:06:41.555446 [INFO ] GID 204, server 4294967295 statemgr update, current state: 1, from peer: 6, peercnt: 7, v1: 3524, v2: 3499
2022/06/13 19:06:41.585306 [INFO ] GID 204, server 4294967295 statemgr update, current state: 1, from peer: 4, peercnt: 7, v1: 3524, v2: 3499
2022/06/13 19:06:42.349285 [INFO ] GID 298, CurrentBlockHeight = 16613129
2022/06/13 19:06:45.708095 [INFO ] GID 612, server 4294967295 received consensus msg, blk 16861191, type: 0 from 5
2022/06/13 19:06:45.711400 [INFO ] GID 509, server 4294967295 received consensus msg, blk 16861191, type: 1 from 3
2022/06/13 19:06:45.790831 [INFO ] GID 1269, server 4294967295 received consensus msg, blk 16861191, type: 1 from 9
2022/06/13 19:06:45.792457 [INFO ] GID 1269, server 4294967295 received consensus msg, blk 16861191, type: 2 from 9
...

Additional context

The validator was started at a much lower block height and after syncing with the network for several days it stopped gaining additional block height at block 16613129 with the error (see logs) above. My observability dashboard shows that the validator stopped gaining block height late 6/10, early 6/11.

laizy commented 2 years ago

thanks for your report, can you try use v2.4.0-alpha?

laizy commented 2 years ago

@juliajang we use v2.3.7 to sync blocks from 16611150 but can not reproduce, maybe your node's database is corrupted. Logs:

2022/06/14 09:01:46.626672 [INFO ] GID 1, Ledger init success
2022/06/14 09:01:46.628233 [INFO ] GID 1, tx pool: the current local gas price is 2500
2022/06/14 09:01:46.628771 [INFO ] GID 1, TxPool init success
2022/06/14 09:01:46.628877 [INFO ] GID 11, txpool actor started and be ready to receive txPool msg
2022/06/14 09:01:57.715923 [INFO ] GID 1, [p2p] init peer ID to 44467e27850baf386b73aebf5d282975bf13a263
2022/06/14 09:01:57.716065 [INFO ] GID 1, [p2p]start listen on sync port 20338
2022/06/14 09:01:57.716092 [INFO ] GID 1, [p2p]Wait for minimum connection...
2022/06/14 09:01:57.716109 [INFO ] GID 1, P2P init success
2022/06/14 09:01:57.721598 [INFO ] GID 1, Rpc init success
2022/06/14 09:01:57.727336 [INFO ] GID 1, Eth Rpc init success
2022/06/14 09:01:58.089450 [INFO ] GID 92, outbound peer 45.43.63.117:20338 connected. id=dc17915ae6aad2792ea687360e80f70cdbaa79b5, version=v2.4.0-0-ge70d1c4
2022/06/14 09:01:58.090247 [INFO ] GID 95, outbound peer 128.1.38.193:20338 connected. id=66f107feb62b43b5229d3eea44f770f5b7c21abc, version=v2.4.0-0-ge70d1c4
2022/06/14 09:01:58.519331 [INFO ] GID 94, outbound peer 49.51.52.200:20338 connected. id=9c80a161417a17789929981503010eb2e6ab6a4f, version=v2.4.0-0-ge70d1c4
2022/06/14 09:01:58.524231 [INFO ] GID 93, outbound peer 49.51.35.159:20338 connected. id=932d3e34d183fb76e1648d2a318c8d859decd126, version=v2.4.0-0-ge70d1c4
2022/06/14 09:01:58.717028 [INFO ] GID 108, Header sync request height:16611150
2022/06/14 09:02:00.182258 [INFO ] GID 111, Header receive height:16611150 - 16611649
2022/06/14 09:02:00.359693 [INFO ] GID 111, Header sync request height:16611650
2022/06/14 09:02:01.564293 [INFO ] GID 681, Header receive height:16611650 - 16612149
2022/06/14 09:02:01.748660 [INFO ] GID 681, Header sync request height:16612150
2022/06/14 09:02:02.814302 [INFO ] GID 1085, Header receive height:16612150 - 16612649
2022/06/14 09:02:03.003959 [INFO ] GID 1085, Header sync request height:16612650
2022/06/14 09:02:03.727803 [INFO ] GID 98, CurrentBlockHeight = 16612140
2022/06/14 09:02:04.458781 [INFO ] GID 1510, Header receive height:16612650 - 16613149
2022/06/14 09:02:04.679785 [INFO ] GID 1510, Header sync request height:16613150
2022/06/14 09:02:05.004198 [INFO ] GID 1419, outbound peer 18.117.152.118:20338 connected. id=9b8d738c12a728e5e2dfb2c49d34cae2dfcb916e, version=v2.3.7-0-g67f36a3
2022/06/14 09:02:06.046503 [INFO ] GID 2209, Header receive height:16613150 - 16613649
2022/06/14 09:02:06.226053 [INFO ] GID 2209, Header sync request height:16613650
2022/06/14 09:02:08.135022 [INFO ] GID 3314, outbound peer 43.156.12.30:20338 connected. id=e38ee788689593c7d088f5505cb4c651b126dda0, version=v1.11.0-154-gdfecc908
2022/06/14 09:02:08.479747 [INFO ] GID 3315, outbound peer 63.33.70.25:20338 connected. id=00000000000000000000000066475028a2ebbaad, version=v1.9.0-0-g1d4e4f57
2022/06/14 09:02:09.727532 [INFO ] GID 98, CurrentBlockHeight = 16613649
2022/06/14 09:02:10.022312 [INFO ] GID 2913, Header receive height:16613650 - 16614149
2022/06/14 09:02:10.202090 [INFO ] GID 2913, Header sync request height:16614150
2022/06/14 09:02:10.672115 [INFO ] GID 3349, Header receive height:16613650 - 16614149
2022/06/14 09:02:13.311526 [INFO ] GID 3823, Header receive height:16614150 - 16614649
2022/06/14 09:02:13.490349 [INFO ] GID 3823, Header sync request height:16614650
2022/06/14 09:02:14.136014 [INFO ] GID 3972, Header receive height:16614150 - 16614649
2022/06/14 09:02:14.444185 [INFO ] GID 87, Ontology received exit signal: interrupt.
2022/06/14 09:02:14.444288 [INFO ] GID 87, closing ledger...
juliajang commented 2 years ago

Hi again, so I've restarted the validator several times from 0 and still get this issue 😢