DMDcoin / diamond-node

bit.diamonds node software for network version 4
GNU General Public License v3.0

Is syncing state with forks #127

Open SurfingNerd opened 4 weeks ago

SurfingNerd commented 4 weeks ago

We encountered problems with the consensus engine when a node is on a fork rather than on the longest chain, for example after a rollback. A node must not perform any validator actions while it is syncing merely because it was a validator at some earlier point in the history it is syncing.

Here is an example log output:

2024-10-15 11:11:07  Worker Client0 INFO import  Syncing  #397863 0xc2e7…80c9     0.00 blk/s    0.0 tx/s    0.0 Mgas/s      0+    0 Qed LI:#397863   23/27 peers   65 MiB chain 0 bytes queue  RPC:  0 conn,    0 req/s,    0 µs
2024-10-15 11:11:12  Worker Client2 INFO import  Syncing  #397863 0xc2e7…80c9     0.00 blk/s    0.0 tx/s    0.0 Mgas/s      0+    0 Qed LI:#397863   23/27 peers   65 MiB chain 0 bytes queue  RPC:  0 conn,    0 req/s,    0 µs
2024-10-15 11:11:13  Worker Client2 DEBUG consensus  Block creation: Batch received for epoch 397864, total 0 contributions, with 0 unique transactions.
2024-10-15 11:11:13  Worker Client2 DEBUG engine  added to additional 0 reserved peers, because they are pending validators.
2024-10-15 11:11:13  Worker Client2 DEBUG engine  skipping sending key gen transaction, because we are syncing
2024-10-15 11:11:17  Worker Client1 INFO import  Syncing  #397864 0x8ce7…b0f9     0.00 blk/s    0.0 tx/s    0.0 Mgas/s      0+    0 Qed LI:#397863   23/27 peers   65 MiB chain 0 bytes queue  RPC:  0 conn,    0 req/s,    0 µs
2024-10-15 11:11:22  Worker Client2 INFO import  Syncing  #397864 0x8ce7…b0f9     0.00 blk/s    0.0 tx/s    0.0 Mgas/s      0+    0 Qed LI:#397863   23/27 peers   65 MiB chain 0 bytes queue  RPC:  0 conn,    0 req/s,    0 µs
2024-10-15 11:11:27  Worker Client3 INFO import  Syncing  #397864 0x8ce7…b0f9     0.00 blk/s    0.0 tx/s    0.0 Mgas/s      0+    0 Qed LI:#397863   23/27 peers   65 MiB chain 0 bytes queue  RPC:  0 conn,    0 req/s,    0 µs
2024-10-15 11:11:32  Worker Client1 INFO import  Syncing  #397864 0x8ce7…b0f9     0.00 blk/s    0.0 tx/s    0.0 Mgas/s      0+    0 Qed LI:#397863   23/27 peers   65 MiB chain 0 bytes queue  RPC:  0 conn,    0 req/s,    0 µs
2024-10-15 11:11:37  Worker Client2 INFO import  Syncing  #397864 0x8ce7…b0f9     0.00 blk/s    0.0 tx/s    0.0 Mgas/s      0+    0 Qed LI:#397863   23/27 peers   65 MiB chain 0 bytes queue  RPC:  0 conn,    0 req/s,    0 µs

This node will be flagged as disabled, because it won't send its key gen transactions.
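The stall visible in the log above (the same block number reported by consecutive `Syncing` lines) can be detected mechanically. The following is a minimal sketch, not code from diamond-node: the function names `syncing_block` and `longest_stall` are hypothetical, and it assumes the import log format shown above, where the token after `Syncing` is `#` followed by a block number.

```rust
/// Extracts the block number that follows the `Syncing` token in an
/// import log line. Returns `None` for lines without that token,
/// such as the DEBUG consensus/engine lines.
pub fn syncing_block(line: &str) -> Option<u64> {
    let mut tokens = line.split_whitespace();
    // Advance the iterator past the exact token "Syncing".
    tokens.find(|t| *t == "Syncing")?;
    // The next token is expected to look like "#397864".
    tokens.next()?.strip_prefix('#')?.parse().ok()
}

/// Returns `(block_number, count)` for the longest run of `Syncing`
/// lines that report the same block number. Lines that are not
/// `Syncing` lines are ignored and do not break a run.
pub fn longest_stall<'a, I>(lines: I) -> Option<(u64, usize)>
where
    I: IntoIterator<Item = &'a str>,
{
    let mut best: Option<(u64, usize)> = None;
    let mut current: Option<(u64, usize)> = None;
    for block in lines.into_iter().filter_map(syncing_block) {
        current = match current {
            Some((b, n)) if b == block => Some((b, n + 1)),
            _ => Some((block, 1)),
        };
        if let Some((_, n)) = current {
            // Strict comparison: on a tie the earlier run is kept.
            if best.map_or(true, |(_, m)| n > m) {
                best = current;
            }
        }
    }
    best
}
```

Fed the ten log lines above, `longest_stall` would report block 397864 with a run of five lines. What run length counts as "stuck" is a policy decision that this sketch leaves open.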

IMHO it looks like the tracking of nodes that do not deliver valuable information does not work properly. Addressing this issue could fix a lot of related problems.

The overall sync status was Blocks in this case:

# HELP sync_status WaitingPeers(0), SnapshotManifest(1), SnapshotData(2), SnapshotWaiting(3), Blocks(4), Idle(5), Waiting(6), NewBlocks(7)
# TYPE sync_status gauge
sync_status 4
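
For reference, the gauge values from the HELP line can be decoded as in the sketch below. The variant names and numbers are taken from the metric description above; the type `SyncStatus`, the function `from_gauge`, and in particular the `is_major_syncing` policy are hypothetical illustrations and not the node's actual API.

```rust
/// Sync states as listed in the `sync_status` HELP line.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyncStatus {
    WaitingPeers = 0,
    SnapshotManifest = 1,
    SnapshotData = 2,
    SnapshotWaiting = 3,
    Blocks = 4,
    Idle = 5,
    Waiting = 6,
    NewBlocks = 7,
}

impl SyncStatus {
    /// Maps a raw gauge value to a state; unknown values yield `None`.
    pub fn from_gauge(value: u8) -> Option<Self> {
        use SyncStatus::*;
        Some(match value {
            0 => WaitingPeers,
            1 => SnapshotManifest,
            2 => SnapshotData,
            3 => SnapshotWaiting,
            4 => Blocks,
            5 => Idle,
            6 => Waiting,
            7 => NewBlocks,
            _ => return None,
        })
    }

    /// Assumed policy for illustration only: every state except
    /// `Idle` and `NewBlocks` counts as "still syncing".
    pub fn is_major_syncing(self) -> bool {
        !matches!(self, SyncStatus::Idle | SyncStatus::NewBlocks)
    }
}
```

Under this assumed policy, the reported value 4 (`Blocks`) counts as syncing, which matches the "because we are syncing" message in the log.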

Sync would need to be rewritten anyway for https://github.com/DMDcoin/diamond-node/issues/111

SurfingNerd commented 3 weeks ago

Also define a workflow for hard forks and node software versions (ignore outdated software versions).

SurfingNerd commented 3 weeks ago

Disabling problematic peers was removed in commit 238605e12d88d761c612be01ae5892246a0f72da. In some situations, the peers that caused a problem are the very peers we require for HBBFT communication. Since the implementation of reserved peers management, the engine knows which peers are important.

SurfingNerd commented 2 weeks ago

Todo: find out whether the "is syncing" state that never imports a block is always preceded by a Stage X verification error (usually stage 3 or stage 5).