ChainSafe / js-libp2p-gossipsub

TypeScript implementation of Gossipsub
Apache License 2.0
145 stars 43 forks source link

Missed responded IWANT messages from Nimbus to Lodestar #411

Closed twoeths closed 1 year ago

twoeths commented 1 year ago

Description

Lodestar sent IWANT messages to Nimbus:

grep -e "99dc255dadfc877ea825eb606de7c964890534ca" -rn beacon-2023-03-03.log
16884817:Mar-03 06:42:49.533[network]         debug: 2023-03-03T06:42:49.533Z libp2p:gossipsub IHAVE: Asking from peers 16Uiu2HAmThKa44DYZKvYi52z5gKyAkdLydMrEewrYQHGn9Ez64uS message id ... 99dc255dadfc877ea825eb606de7c964890534ca ... 

Nimbus saw the message in 400ms and responded that

2023-03-03 06:42:49.944+00:00: IWANT with 909 messages

however after 12s it becomes a broken promise

debug: 2023-03-03T06:43:01.628Z libp2p:gossipsub peer %s didn't follow up in %s IWANT requests; adding penalty 16Uiu2HAmThKa44DYZKvYi52z5gKyAkdLydMrEewrYQHGn9Ez64uS 99dc255dadfc877ea825eb606de7c964890534ca

==> messages are missed from Nimbus to Lodestar somehow

twoeths commented 1 year ago

same Lodestar's log but for other messages around that time

grep -e "5b251f8a725d5cbe72b052c2ab4eeb4cd85db3ab" -rn beacon-2023-03-03.log
16891569:Mar-03 06:42:59.517[network]         debug: 2023-03-03T06:42:59.517Z libp2p:gossipsub IHAVE: Asking from peers 16Uiu2HAmThKa44DYZKvYi52z5gKyAkdLydMrEewrYQHGn9Ez64uS message id ... 5b251f8a725d5cbe72b052c2ab4eeb4cd85db3ab ... 
16899265:Mar-03 06:43:12.109[network]         debug: 2023-03-03T06:43:12.109Z libp2p:gossipsub peer %s didn't follow up in %s IWANT requests; adding penalty 16Uiu2HAmThKa44DYZKvYi52z5gKyAkdLydMrEewrYQHGn9Ez64uS 5b251f8a725d5cbe72b052c2ab4eeb4cd85db3ab

Nimbus's log (received Iwant message in 300ms)

023-03-03 06:42:59.853+00:00: IWANT with 751 messages

Another message

grep -e "adf3afdd3efaa9e41025a124a85f30c52e5eb2ab" -rn beacon-2023-03-03.log
16892399:Mar-03 06:43:01.466[network]         debug: 2023-03-03T06:43:01.466Z libp2p:gossipsub IHAVE: Asking from peers 16Uiu2HAmThKa44DYZKvYi52z5gKyAkdLydMrEewrYQHGn9Ez64uS message id ... adf3afdd3efaa9e41025a124a85f30c52e5eb2ab ...
16899922:Mar-03 06:43:13.509[network]         debug: 2023-03-03T06:43:13.509Z libp2p:gossipsub peer %s didn't follow up in %s IWANT requests; adding penalty 16Uiu2HAmThKa44DYZKvYi52z5gKyAkdLydMrEewrYQHGn9Ez64uS adf3afdd3efaa9e41025a124a85f30c52e5eb2ab

Nimbus's log (received Iwant message in 200ms)

2023-03-03 06:43:01.683+00:00: IWANT with 785 messages
twoeths commented 1 year ago

there are a lot of errors due to msgIdToStrFn at lodestar side

Mar-04 00:00:00.833[network]         ^[[34mdebug^[[39m: 2023-03-04T00:00:00.833Z libp2p:gossipsub TypeError: buf.hexSlice is not a function
    at Object.slice (node:buffer:675:37)
    at Uint8Array.toString (node:buffer:818:14)
    at Eth2Gossipsub.msgIdToStrFn (file:///usr/src/lodestar/packages/beacon-node/src/network/gossip/encoding.ts:31:36)
    at file:///usr/src/lodestar/node_modules/@chainsafe/libp2p-gossipsub/src/index.ts:1335:31
    at Array.forEach (<anonymous>)
    at file:///usr/src/lodestar/node_modules/@chainsafe/libp2p-gossipsub/src/index.ts:1334:18
    at Array.forEach (<anonymous>)
    at Eth2Gossipsub.handleIHave (file:///usr/src/lodestar/node_modules/@chainsafe/libp2p-gossipsub/src/index.ts:1327:11)
    at Eth2Gossipsub.handleControlMessage (file:///usr/src/lodestar/node_modules/@chainsafe/libp2p-gossipsub/src/index.ts:1245:43)
    at Eth2Gossipsub.handleReceivedRpc (file:///usr/src/lodestar/node_modules/@chainsafe/libp2p-gossipsub/src/index.ts:1030:18)
twoeths commented 1 year ago
Mar-13 03:14:08.880[network]         ^[[34mdebug^[[39m: 2023-03-13T03:14:08.880Z libp2p:gossipsub IHAVE: Asking for 77 out of 77 messages from 16Uiu2HAmThKa44DYZKvYi52z5gKyAkdLydMrEewrYQHGn9Ez64uS
…
12636000:Mar-13 03:14:21.087[network]         debug: 2023-03-13T03:14:21.087Z libp2p:gossipsub peer %s didn't follow up in %s IWANT requests; adding penalty 16Uiu2HAmThKa44DYZKvYi52z5gKyAkdLydMrEewrYQHGn9Ez64uS 0xa8bd252079f77ca3dcaad37191f201a9723ce0d3

Mar_14_Nimbus_only_responded_25_first_message_ids (1).txt

Mar-13 03:14:08.883[network]         ^[[34mdebug^[[39m: 2023-03-13T03:14:08.883Z libp2p:gossipsub rpc.from 16Uiu2HAmThKa44DYZKvYi52z5gKyAkdLydMrEewrYQHGn9Ez64uS subscriptions 0 messages 25 ihave 0 iwant 0 graft 0 prune 0

right now Nimbus only responds 25 first message ids for an Iwant rpc, not a lodestar issue

twoeths commented 1 year ago

There is no issue with Nimbus for a while so closing this issue