Open anacrolix opened 1 week ago
This issue can be fixed by using the p2p.sync.onlyreqtostatic flag introduced here.
I wondered where that static code came from. You'll be pleased to learn that the PR removes the need for the flag.
Yeah, the ultimate goal is the same: to find trusted nodes to sync from, either manually with p2p.sync.onlyreqtostatic, or automatically with your change :)
If blocks don't arrive via gossip, op-node requests the gap between its current head and the L2 unsafe head using the req resp (request-response) "alt sync" protocol. When network or service issues stall gossip for more than roughly a minute, incoming gossip is either rejected or does not contain the blocks needed to catch up. If most of a node's peers are also missing those blocks, the node enters a pathological cycle in which it cannot obtain the blocks it needs. This is particularly bad when the sequencer becomes unavailable to the rest of the network, because it continues to produce blocks even though no other nodes are connected to it. When connectivity resumes, all other nodes are behind.
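To make the gap concrete, here is a minimal Go sketch of how the range of missing block numbers could be computed. It is not op-node's actual code: the `blockRange` type and `missingRange` function are hypothetical names, and it assumes the block at the unsafe head has already been received, so only the blocks strictly between the two heads need to be requested.

```go
package main

import "fmt"

// blockRange is an inclusive range of L2 block numbers.
// Hypothetical type for illustration; not part of op-node.
type blockRange struct {
	Start, End uint64
}

// missingRange returns the block numbers strictly between the node's
// current head and the L2 unsafe head. The second return value is false
// when there is no gap to fill.
// Assumption: the unsafe head block itself is already known to the node.
func missingRange(currentHead, unsafeHead uint64) (blockRange, bool) {
	// No gap if the unsafe head is not ahead, or is the direct successor.
	if unsafeHead <= currentHead || unsafeHead-currentHead < 2 {
		return blockRange{}, false
	}
	return blockRange{Start: currentHead + 1, End: unsafeHead - 1}, true
}

func main() {
	if r, ok := missingRange(100, 104); ok {
		fmt.Printf("request blocks %d..%d via req resp\n", r.Start, r.End)
	}
}
```

The longer gossip is stalled, the wider this range grows, and every block in it has to come from a peer that actually has it.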
In the req resp arrangement, "client" is the requester, and "server" is the one receiving the message. The current req resp algorithm randomly requests blocks from peers, and has several undesirable properties: