Open dmitry-markin opened 11 months ago
It doesn't seem that this is happening due to 1. or 2. Validation slots are always deallocated once validations complete here: https://github.com/paritytech/polkadot-sdk/blob/21b32849db4ac5f284d7ca2f93927b0d35bb783a/substrate/client/network/sync/src/block_announce_validator.rs#L297 and per-peer slot deallocation is tested here: https://github.com/paritytech/polkadot-sdk/blob/21b32849db4ac5f284d7ca2f93927b0d35bb783a/substrate/client/network/sync/src/block_announce_validator.rs#L359-L383
Also, assumption 2. is unlikely, because validations are always polled until the stream is pending (the stream never terminates): https://github.com/paritytech/polkadot-sdk/blob/21b32849db4ac5f284d7ca2f93927b0d35bb783a/substrate/client/network/sync/src/engine.rs#L910-L912.
It looks like the issue might be caused by major syncing peers announcing imported blocks. By default, we always announce imported blocks:
https://github.com/paritytech/polkadot-sdk/blob/8ce16ee6ff190912995dc6c8da85b741eaada395/substrate/client/service/src/lib.rs#L184-L186
announce_imported_blocks
:
https://github.com/paritytech/polkadot-sdk/blob/8ce16ee6ff190912995dc6c8da85b741eaada395/substrate/client/cli/src/config.rs#L431-L436
There is no "fuse" in SyncingEngine
making block announcement a no-op during major sync either: https://github.com/paritytech/polkadot-sdk/blob/8ce16ee6ff190912995dc6c8da85b741eaada395/substrate/client/network/sync/src/engine.rs#L634-L673
When a peer is major syncing, the announcements might be broadcast faster then they are processed, what triggers the warning.
When we are major syncing the node internally isn't sending block import notification and thus, we also don't announce anything.
https://github.com/paritytech/polkadot-sdk/pull/2132 helped with polling block announce validations during versi scaling, but if the issue described in https://github.com/paritytech/polkadot-sdk/issues/2152 continues, we'll need to look into it again.
Adding logs for an event that might be related. I brought up a RPC node for Frequency and received a large number of block imports. Nothing special was happening on the nodes at the time. No other details / errors / warnings were indicated. @wilwade believed this issue may be related.
Thanks @senseless Here are the logs for one of the archive nodes we run that picked up on the error Ignored-Block-Logs.txt
Frequency is currently using Polkadot SDK v1.1.0 So possible that this has been fixed and we just need to update, but wanted to make sure the logs were here in case.
Also getting 1000 of them in < 1 second feels odd.
We should probably stop logging this, as this could be literally someone spamming the node.
We should probably stop logging this, as this could be literally someone spamming the node.
Addressed in https://github.com/paritytech/polkadot-sdk/pull/4480.
As reported by @altonen while running a burn-in for https://github.com/paritytech/polkadot-sdk/pull/1631, some block announcements are dropped with a warning: https://github.com/paritytech/polkadot-sdk/blob/1cf7d3aafa0e7edaa20d34581e0f1656895cb9c2/substrate/client/network/sync/src/block_announce_validator.rs#L163
This can be happening for several reasons:
MAX_CONCURRENT_BLOCK_ANNOUNCE_VALIDATIONS_PER_PEER
are currently set to4
).