status-im / status-go

The Status module that consumes go-ethereum
https://status.im
Mozilla Public License 2.0

Handle 500+ concurrent MailServer users #950

Closed adambabik closed 5 years ago

adambabik commented 6 years ago

Problem

We need to scale to 500+ concurrent users (P1 | 2.3 | Beta is launched successfully | P2 | Cluster can handle 500 concurrent users). While it may be possible to set maxPeers to 500 per node, that is a very risky approach.

Solution

The proposed solution is to manage MailServers in status-go (have more than one mail server in rotation) and free the connection after some inactivity time.

A temporary, quick and dirty solution may be to set maxPeers for a single mail server to 500.

Acceptance Criteria

  1. 500+ concurrent users can be connected to our cluster and use mail servers.

Notes

First, confirm that it's possible to set the maxPeers limit to 500 on a single mail server and connect that many peers. If so, decide whether this is a sufficient solution for now and whether moving the logic to status-go can wait until after the release.

adambabik commented 6 years ago

There was already an attempt to move MailServers to status-go (https://github.com/status-im/status-go/pull/700) but the PR was closed.

oskarth commented 6 years ago

What about providing N mail servers in client and use sticky random based on user's pubkey/UUID or something to do the selection? This would statistically give same properties as "connect to different peer if one server runs out of slots" but without coordination.

Assuming 500+ concurrent users and, say, 25 maxPeer we can spin up 40 mailservers. Seems likely load will be way less though. What's current DAU and mailserver distribution of peers?

> connect only for the duration of the mail request

Sounds like a good idea to me too.

oskarth commented 6 years ago

Re putting logic in status-go: we can do this if it is the only way to solve the problem, but ideally we leave it in the client.

The constraint is the following:

Users providing individual mailservers/masternodes and paying/getting paid for it is a core use case straight from the white paper. The HA for individual mail servers is also an interesting one, but for a later discussion. I.e. ideally a user could pay for N nodes, and each node wouldn't have to be HA, but they would still get their messages. This would enable distribution of SNT through e.g. Desktop. But this is a different discussion, not relevant now.

For future: Perhaps there's some way we can have the best of both worlds, where status-react and status-go can communicate about which peers should be used?

adambabik commented 6 years ago

> What about providing N mail servers in client and use sticky random based on user's pubkey/UUID or something to do the selection?

Exactly, that's the idea, plus disconnecting if a mail server is not used for some time, to avoid occupying connection slots unnecessarily. This is especially important for the Desktop client, which can stay connected to a single Mail Server for hours without using it at all.

> Assuming 500+ concurrent users and, say, 25 maxPeer we can spin up 40 mailservers. Seems likely load will be way less though. What's current DAU and mailserver distribution of peers?

We can safely set maxPeers to 100 or 200 or even more. I was testing with 100 without any issue.

With regard to the constraint, it does not matter where we implement it, because the client can pass the selected MailServer to status-go and that's it. status-go can implement this logic. It's something to debate, comparing pros and cons.

adambabik commented 6 years ago

For the closed beta, the easiest solution to implement would be to run a few mail servers and use an algorithm that deterministically selects one based on the user's public key.
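A rough sketch of what such a selection could look like, assuming we hash the public key and take it modulo the number of servers. The function name `selectMailServer`, the choice of SHA-256 and the server names are illustrative assumptions, not an agreed design.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// selectMailServer maps a user's public key to one entry of servers.
// The same key and the same list always yield the same server.
// It returns "" when the list is empty.
func selectMailServer(pubKey []byte, servers []string) string {
	if len(servers) == 0 {
		return ""
	}
	// Hash the key so that users spread roughly evenly across servers.
	sum := sha256.Sum256(pubKey)
	idx := binary.BigEndian.Uint64(sum[:8]) % uint64(len(servers))
	return servers[idx]
}

func main() {
	servers := []string{"mail-01", "mail-02", "mail-03"} // placeholder names
	fmt.Println(selectMailServer([]byte("example-public-key"), servers))
}
```

One caveat with plain modulo: adding or removing a server changes the mapping for most users. That is probably fine for a closed beta with a fixed list embedded in the app.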

oskarth commented 6 years ago

Sounds great, let's do this. I will create an issue in status-react.

oskarth commented 6 years ago

https://github.com/status-im/status-react/issues/4282

Do we have an issue for disconnecting? Would this happen on status-go or status-react?

We also need to spin up a bunch of mail servers and embed them in app. 5-10 maybe?

adambabik commented 6 years ago

> Do we have an issue for disconnecting? Would this happen on status-go or status-react?

@oskarth no, but it makes sense to keep that logic in the same place, i.e. in status-react.

> We also need to spin up a bunch of mail servers and embed them in app. 5-10 maybe?

Let's make sure it works with 2 servers and it is covered with unit tests :) Then we will run a few Mail Servers. We need them to have the same properties (static private key, IP, port) as bootnodes.

adambabik commented 6 years ago

Update: we have 3 mail servers, each with 200 slots. Due to in-cluster connections, around 17% of the slots are taken, which is about 100, leaving about 500 free. If that's not enough, we can bump maxPeer to 250, change the Whisper limits for mail servers (now it's 3,5), and finally add another mail server.
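For reference, the slot arithmetic above can be checked with a few lines. This is only a sketch: `freeSlots` is a hypothetical helper, and it treats the 17% figure as a fixed share of all slots.

```go
package main

import "fmt"

// freeSlots returns the slots left for users when usedPercent of the
// cluster's total slots are already taken (e.g. by in-cluster peers).
func freeSlots(servers, slotsPerServer, usedPercent int) int {
	total := servers * slotsPerServer
	used := total * usedPercent / 100 // integer division, rounds down
	return total - used
}

func main() {
	// 3 servers x 200 slots = 600 total; 17% of 600 is 102 taken.
	fmt.Println(freeSlots(3, 200, 17)) // prints 498
}
```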

status-github-bot[bot] commented 6 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

status-github-bot[bot] commented 5 years ago

This issue has been automatically closed. Please re-open if this issue is important to you.