paritytech / polkadot-sdk

The Parity Polkadot Blockchain SDK
https://polkadot.network/

Node sends 1k "Duplicate gossip" msgs #5379

Open ordian opened 4 weeks ago

ordian commented 4 weeks ago

We had a spike of 1k warning messages from a single peer on Kusama: https://grafana.teleport.parity.io/goto/Vs5GETjIR?orgId=1

WARN tokio-runtime-worker peerset: Report 12D3KooWJPXURGJ1nMa3FwA8aeSZJEpXnaVX36Gyr8LQBPUmHiWz: -4 to -1987730577. Reason: Duplicate gossip. Banned, disconnecting.

where the reputation was adjusted very slowly (in steps of -4) after the peer was banned. The node is https://apps.turboflakes.io/?chain=kusama#/validator/J11jfzJQAkxKJTZozq3SDRAsjZkSQjzoRCUQ5oVBxWsecJT?mode=history which has a low grading (F), but it would still be good to know why "Duplicate gossip" was triggered.

cc @paritytech/networking

dmitry-markin commented 4 weeks ago

~Thanks for reporting. If we happen to see this error triggered by an offending node under our control, we should try collecting logs from it with -l gossip=trace for the period during which it misbehaves.~

Edited. See the next message of @alexggh.

alexggh commented 4 weeks ago

The logs are a bit misleading because, once a node is banned, we print this log at every reputation update: https://github.com/paritytech/polkadot-sdk/blob/master/substrate/client/network/src/peer_store.rs#L265

So you actually have to look for the first occurrence of Banned to understand the root cause. In this case, the culprit for the ban was:

2024-08-16 00:42:54.458  WARN tokio-runtime-worker peerset: Report 12D3KooWJPXURGJ1nMa3FwA8aeSZJEpXnaVX36Gyr8LQBPUmHiWz: -536870912 to -1825819477. Reason: No requested block data. Banned, disconnecting.
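
The "look for the first occurrence of Banned" step can be sketched as a small log scan. This is only an illustration under assumptions: the line layout is inferred from the two warnings quoted in this thread, and `Report`, `parse_report` and `first_bans` are hypothetical helpers, not part of polkadot-sdk.

```rust
use std::collections::HashMap;

/// One parsed `peerset` report line (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq)]
struct Report {
    peer: String,
    change: i64,
    new_reputation: i64,
    reason: String,
    banned: bool,
}

/// Parse a line shaped like the warnings quoted in this thread:
/// `... peerset: Report <peer>: <change> to <new>. Reason: <reason>. Banned, disconnecting.`
/// Returns `None` for any line that does not match that assumed layout.
fn parse_report(line: &str) -> Option<Report> {
    let rest = line.split_once("peerset: Report ")?.1;
    let (peer, rest) = rest.split_once(": ")?;
    let (change, rest) = rest.split_once(" to ")?;
    let (new_reputation, rest) = rest.split_once(". Reason: ")?;
    // The reason runs up to the next full stop; whatever follows may carry the ban marker.
    let (reason, tail) = rest.split_once('.').unwrap_or((rest, ""));
    Some(Report {
        peer: peer.to_string(),
        change: change.trim().parse().ok()?,
        new_reputation: new_reputation.trim().parse().ok()?,
        reason: reason.trim().to_string(),
        banned: tail.contains("Banned"),
    })
}

/// For each peer, keep the first report (in log order) that carries the `Banned` marker.
fn first_bans<'a>(lines: impl IntoIterator<Item = &'a str>) -> HashMap<String, Report> {
    let mut first = HashMap::new();
    for report in lines.into_iter().filter_map(parse_report) {
        if report.banned {
            // `or_insert` leaves an existing entry untouched, so later reports are ignored.
            first.entry(report.peer.clone()).or_insert(report);
        }
    }
    first
}

fn main() {
    // The two warnings quoted in this thread, with the earlier ban placed first.
    let log = [
        "2024-08-16 00:42:54.458  WARN tokio-runtime-worker peerset: Report 12D3KooWJPXURGJ1nMa3FwA8aeSZJEpXnaVX36Gyr8LQBPUmHiWz: -536870912 to -1825819477. Reason: No requested block data. Banned, disconnecting.",
        "WARN tokio-runtime-worker peerset: Report 12D3KooWJPXURGJ1nMa3FwA8aeSZJEpXnaVX36Gyr8LQBPUmHiWz: -4 to -1987730577. Reason: Duplicate gossip. Banned, disconnecting.",
    ];
    for (peer, report) in first_bans(log) {
        // Previous reputation is derived as new - change (assumes no saturation occurred).
        let before = report.new_reputation - report.change;
        println!(
            "{peer}: first banned for '{}' ({before} -> {})",
            report.reason, report.new_reputation
        );
    }
}
```

Keeping only the first `Banned` entry per peer filters out the later "Duplicate gossip" reports, which are a consequence of the ban rather than its cause.
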

lexnv commented 4 weeks ago

We've seen this before indeed; it was also reported by an external operator.

I have this on my to-do list and will probably publish a PR on Monday to make the reputation warning for a peer under the banned threshold more explicit.
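
One way such a distinction could look, as a rough sketch only: classify each report by whether it pushes the peer across the threshold or the peer was already below it. This is not the planned PR or the actual `peer_store.rs` code. The `BANNED_THRESHOLD` value, the "banned means reputation below threshold" rule, and the names `ReportOutcome`, `classify` and `describe` are all assumptions made for illustration.

```rust
/// Assumed ban threshold for this sketch; check `peer_store.rs` for the real constant.
const BANNED_THRESHOLD: i32 = 71 * (i32::MIN / 100);

/// Hypothetical classification of a single reputation report.
#[derive(Debug, PartialEq)]
enum ReportOutcome {
    /// Reputation is still at or above the threshold after the report.
    NotBanned,
    /// This report pushed the peer below the threshold: the root cause of the ban.
    NewlyBanned,
    /// The peer was already below the threshold before this report.
    AlreadyBanned,
}

/// Apply `change` to `old` with saturating arithmetic and classify the result.
fn classify(old: i32, change: i32) -> (i32, ReportOutcome) {
    let new = old.saturating_add(change);
    let outcome = if new >= BANNED_THRESHOLD {
        ReportOutcome::NotBanned
    } else if old >= BANNED_THRESHOLD {
        ReportOutcome::NewlyBanned
    } else {
        ReportOutcome::AlreadyBanned
    };
    (new, outcome)
}

/// Build a log message whose suffix says whether this report caused the ban.
/// The "already banned" wording is invented for this sketch.
fn describe(peer: &str, old: i32, change: i32, reason: &str) -> String {
    let (new, outcome) = classify(old, change);
    let suffix = match outcome {
        ReportOutcome::NotBanned => "",
        ReportOutcome::NewlyBanned => " Banned, disconnecting.",
        ReportOutcome::AlreadyBanned => " Peer was already banned.",
    };
    format!("Report {peer}: {change:+} to {new}. Reason: {reason}.{suffix}")
}

fn main() {
    // Previous reputations are derived from the logs in this thread as new - change.
    println!("{}", describe("peer", -1_288_948_565, -536_870_912, "No requested block data"));
    println!("{}", describe("peer", -1_987_730_573, -4, "Duplicate gossip"));
}
```

Under these assumptions, the "No requested block data" report from the logs above is the one that crosses the threshold, while the later -4 "Duplicate gossip" reports hit a peer that is already banned.
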

As to the initial warning which caused the ban, I think we can still look into that.

ordian commented 4 weeks ago

> As to the initial warning which caused the ban, I think we can still look into that.

I see three places where this log is issued: two in chain sync (https://github.com/paritytech/polkadot-sdk/blob/74267881e765a01b1c7b3114c21b80dbe7686940/substrate/client/network/sync/src/strategy/chain_sync.rs#L755-L765) and one in warp sync (https://github.com/paritytech/polkadot-sdk/blob/74267881e765a01b1c7b3114c21b80dbe7686940/substrate/client/network/sync/src/strategy/warp.rs#L433-L439).

Unfortunately, we don't seem to have sync=debug logs enabled on our Kusama validators.