rule110-io / surge

Surge is a p2p filesharing app designed to utilize blockchain technologies to enable 100% anonymous file transfers. Surge is end-to-end encrypted, decentralized and open source.
https://getsurge.io
Apache License 2.0

Use last connected nodes as seed nodes to solve RPC timeout #55

Closed. yilunzhang closed this issue 3 years ago

yilunzhang commented 3 years ago

Some areas of the world (e.g. China) have really unstable connections to the seed nodes. More specifically, many RPC calls time out randomly, and NKN clients sometimes fail to be created because creating a client relies on one RPC call to the seed node. We've tried all major VPS providers (AWS, GCP, DO, etc.) and none of them works reliably. This accounts for most of the errors we have seen during our Surge tests in China.

The solution we came up with for nMobile is to save the last connected nodes. The next time clients are created, load those nodes and pass them as seed nodes, together with the official seed node as a backup. This is by far the most reliable and decentralized solution we have tried, and it can eliminate most RPC timeouts.

The current SDK already has all the APIs needed, so the implementation should be pretty straightforward:

Save nodes:

  1. Use https://pkg.go.dev/github.com/nknorg/nkn-sdk-go#MultiClient.GetClients to get all clients of a multiclient
  2. Use https://pkg.go.dev/github.com/nknorg/nkn-sdk-go#Client.GetNode to get the node of each client
  3. Save the RPCAddr of each node

Use nodes:

  1. Load all saved RPCAddr
  2. When creating the multiclient, pass a ClientConfig with SeedRPCServerAddr set to all saved RPCAddr (you probably need to add the http:// prefix), with an additional http://seed.nkn.org:30003 at the end as a fallback

Make sure you are using the latest SDK version (v1.3.5+) so that RPC calls try the seed nodes in order rather than at random.

MutsiMutsi commented 3 years ago

Hi @yilunzhang, I followed the instructions. My implementation: https://github.com/rule110-io/surge/commit/a850bbc21b7bc8befe95d0bbe80438f365aa9460

yilunzhang commented 3 years ago

Looks great! Now there is no single point of failure after first launch! :)