dominant-strategies / go-quai

Official Go Implementation of the Quai Network
GNU General Public License v3.0

Majority of LibP2P requests fail under load #1799

Closed mibuono closed 1 month ago

mibuono commented 1 month ago

The majority of LibP2P requests don't get a response under load.

This breaks syncing, and in some cases nodes that fall out of sync never recover.

Steps to reproduce

There are two ways:

1) Run a network without transactions but drop half the requests, so that the request manager is activated and used.
2) Run a network with transactions; under a high load of 1000+ TPS you can observe this. Alternatively, try to sync against the garden test, which runs at that TPS.

You will see many requests timing out and not going through, as well as many empty messages.

gameofpointers commented 1 month ago

WARNING[05-29|16:54:30.637] Peer did not respond in time                  peerId=12D3KooWRQrLVEeJtfyKoJDYWYjryBKR8qxkDooMMzyf2ZpLaZRR requestID=3710988829
ERROR  [05-29|16:54:30.637] Error requesting the data from peer           err="peer did not respond in time" peerId=12D3KooWRQrLVEeJtfyKoJDYWYjryBKR8qxkDooMMzyf2ZpLaZRR topic=0xcc765ce0d79736950aeded81e32bbd55a03c4dcec236c1944da32609a237fadc/0/blocks

Some of the errors seen when trying to make requests:

ERROR  [05-29|16:57:04.987] unsupported quai message type                 quaiMsg=
ERROR  [05-29|16:57:04.987] unsupported quai message type                 quaiMsg=
ERROR  [05-29|16:57:05.005] unsupported quai message type                 quaiMsg=
ERROR  [05-29|16:57:05.005] unsupported quai message type                 quaiMsg=

wizeguyy commented 1 month ago

This is blocked, waiting on the following information:

1) Logs containing the request path instrumentation prints @gameofpointers @mechanikalk
2) The hash of a commit that was previously able to sync @mechanikalk

Also blocked by bootstrapping errors which prevent my node from joining the network: https://github.com/dominant-strategies/go-quai/issues/1805

wizeguyy commented 1 month ago

According to @mechanikalk, 31bd59b was able to sync

gameofpointers commented 1 month ago

@mibuono @wizeguyy I am taking this ticket over. I worked on it yesterday, and I think I have found the root cause and fixed it.