status-im / specs

Specifications for Status clients.
https://specs.status.im/
MIT License
14 stars 14 forks source link

Describe rate limiting in Whisper #60

Closed adambabik closed 4 years ago

adambabik commented 4 years ago

Closes #66

oskarth commented 4 years ago

Thanks for PR! I know it is just a draft, but I made some comments already

adambabik commented 4 years ago

From reading the spec, I'm assuming that rate limiting is applied only on the incoming messages (haven't checked the code). The general problem I see with this type of rate limiting is that it seems to work against the decentralized idea where anyone can start a node, due to the broadcasting nature of the Whisper protocol.

Of course, this solution is local to a cluster. The "proper" solution is PoW but we can't apply it on mobile clients.

The issue I see here is that having this distinction does not allow for anyone to spin up there own node for the network (even though there is no incentive to do this now, we should not start disconnecting them either because they are not part of the "cluster"...). Or am I misunderstanding on how this would work?

Correct. Such a new node would need to be whitelisted. It's a kind of a problem we currently have. There might be dishonest peers in the current network. The client selects only 1-2 Whisper nodes. If it selects only dishonest peers, it will receive only filtered messages. The solution we were thinking about long time ago was a smart contact where honest mailservers could be collected. They would need to stake some SNT + validators run by Status (in the simplest form) would need to exist. In the case of whitelisted peers, we would need a similar system I believe.

Generally, we talked about it in an issue and this is the most naive but working solution in the current reality. Unless there is some better idea...

oskarth commented 4 years ago

I don't understand why we need to distinguish between cluster nodes and normal nodes to apply rate limiting. Why can't it be universally applied without any kind of special treatment? Enshrining our cluster as special is a huge issue.

adambabik commented 4 years ago

@oskarth ok, so what's your proposal? How do you want to rate limit communication between server node A and server node B where the first one has, e.g. 100 peers? It means that each of 100 peers can produce 1 message per second but node A would send 100 message per second to node B. That's why Whisper implemented PoW instead of rate limiting because it's impossible to have global rate limiting allowing all scenarios.

Instead of whitelisting we can rate limit only light nodes but then it's enough for an attacker to run a node on the server and turn off light node mode and DoS all our clients.

Alternatively, we can apply rate limiting only on the Status clients (and mailservers likely). In such a case, it's still possible to spam the network but at least do not kill the clients (clients would disconnect from spamming peers). But I am not sure this fulfils ToB requirements. cc @corpetty

oskarth commented 4 years ago

@adambabik Rate limit by default for all nodes. Provide whitelisting as an option for all nodes.

Cluster whitelist their own nodes. That's an implementation details. Keep specialized trust / deployment specifics at that level, not protocol level.

oskarth commented 4 years ago

@adambabik will you update this based on decision in Monday's call? Also please add relevant issue to Waku project and replace existing placeholder.

Rate limit: Packet code feedback for rate limiting [needs more detail] - Adam

adambabik commented 4 years ago

@oskarth yes, let's confirm the decision:

  1. Rate limiting is a setting per node; each node needs to enable it and set the rate limits,
  2. Whitelisting peers and IPs is possible,
  3. A node broadcasts its rate limits (that would need to be in a handshake as well),
  4. In the case of a violation, a peer might be disconnected similarly to sending a message with invalid PoW etc.

What is specific to our server nodes, we will white list all our servers and decide on the rate limits. The client (Status app) will have different numbers and also will whitelist known peers.

adambabik commented 4 years ago

@oskarth @decanus please review again.

decanus commented 4 years ago

once this is merged we should add it to Waku, or should it already be added @oskarth

oskarth commented 4 years ago

@decanus I think we can wait until this is merged, and then add it to Waku along with implementation matrix update.

@adambabik @kdeme any more thoughs on the handshake? It seems a bit iffy now with existing layout https://github.com/vacp2p/specs/pull/52#discussion_r351096293 previous clients and now this change. Can we simplify things somehow or communicate things in a more open-ended vs positional way? How can we ensure better consistency here? Both for existing spec, this addition, and going forward (moar capabilities).

oskarth commented 4 years ago

One thing we might do is append to handshake for Whisper v6, and then clarify in spec that only unambiguous patterns are legal. Example, with a,b,c being optional fields:

required -required a

This is illegal:

Then, for waku v1, we ensure we have an extensible design that allows unambiguous optionals.

What do people think of this approach?

adambabik commented 4 years ago

This is the current situation:

EIP-627 (based on geth implementation):

(statusCode, protocolVersion[, pow][, bloom][, isLightNode])

It means that pow, bloom and isLightNode are optional. However, it's possible to skip pow and bloom and pass only isLightNode. This is odd.

Our Whisper fork:

(statusCode, protocolVersion[, pow][, bloom][, isLightNode][, confirmationsEnabled])

It means that pow, bloom, isLightNode and confirmationsEnabled are optional. However, it's possible to skip pow and bloom and pass only isLightNode or only confirmationsEnabled. Situation where both isLightNode and confirmationsEnabled are passed but not pow and bloom is also allowed.

As you can see it's very complicated and unclear. Because it's our protocol spec, I would go for:

(statusCode, protocolVersion[, pow][, bloom][, isLightNode][, confirmationsEnabled][, rateLimits])

where rateLimits is optional and can be specified independently from the rest of the optional parameters.

In Waku/1, I would go for a scheme where anything after protocolVersion is optional but if one wants to pass, for example, isLightNode then all previous arguments need to be passed as well.

kdeme commented 4 years ago

So according to EIP-627, PoW is not optional. Just bloom filter is optional. isLightNode is not in spec but is in geth / nim-eth.

For nim-eth these optional arguments will only work in one direction for now: see https://github.com/status-im/nim-eth/issues/116 . But we will need to fix this it seems (As we have confirmationsEnabled added. And now there is also rate limits and probably in the future Waku mode / topics). I'd have to further think about it but the last point of @adambabik might make sense to make it more easy. Most simple solution would be no "optionally" at all :)

kdeme commented 4 years ago

Regarding the rate limiting specification:

It looks like an improvement on the current "too low" PoW situation. Seems like it could stop a DoS from spreading over the network, or at least a cluster within the network, and still allow individual nodes.

Perhaps this is obvious, but I would like to note that it will still be possible (D)DoS a single node, as on the output node messages will get dropped unrelated to where the traffic comes from. E.g. One peer could stay under the limits with one connected node. But the node might go over the limits with multiple peers. But OK, this can get solved with whitelisting & somewhat with good configuration (or at least make it more difficult -> more IPs).

adambabik commented 4 years ago

@kdeme

So according to EIP-627, PoW is not optional. Just bloom filter is optional. isLightNode is not in spec but is in geth / nim-eth.

Where is it stated? It is optional in the implementation, just saying.

E.g. One peer could stay under the limits with one connected node. But the node might go over the limits with multiple peers.

Yeah, that's possible but rate limiting is defined per peer so if a node wants to receive max 10 messages per second and it is willing to connect to 10 peers, it may advertise the rate limiting of 1 message per second. All peers should respect that and throttle the number of sent messages. This may lead to missing some messages though.

oskarth commented 4 years ago

So according to EIP-627, PoW is not optional. Just bloom filter is optional. isLightNode is not in spec but is in geth / nim-eth.

Where is it stated? It is optional in the implementation, just saying.

From https://eips.ethereum.org/EIPS/eip-627:

This packet contains two objects: integer message code (0x00) followed by a list of values: integer version, float PoW requirement, and bloom filter, in this order. The bloom filter paramenter is optional; if it is missing or nil, the node is considered to be full node (i.e. accepts all messages). The format of PoW and bloom filter please see below (message codes 2 and 3).

It only mentions bloom filter as optional, which implies the previous fields are required.


I'm not so sure about having required optional-a optional-b where you can omit optional-a and specify optional-b, seems very fragile. What's the issue with locking down as I proposed above?

adambabik commented 4 years ago

Thanks @oskarth for that fragment. Yeah, so the implementation does not follow the spec...

I'm not so sure about having required optional-a optional-b where you can omit optional-a and specify optional-b, seems very fragile. What's the issue with locking down as I proposed above?

We either write this spec as it is or write it as it should be. The current implementation allows to omit some optional-a parameters and specify optional-b. My proposal for Waku/1 is more strict and we can lock it down this way but we would need to change the implementation and won't be compatible with EIP-627. Given that the current way is not really viable in Nim maybe we should go with the more strict one.

kdeme commented 4 years ago

I also prefer the more strict approach as suggested, it is likely to make implementation less complex.

I don't think the strict approach collides with EIP-627. The fragment @oskarth posted also mentions in this order. In my interpretation, when you omit a field you automatically lose order.

adambabik commented 4 years ago

Ok, so I think the consensus is to make the spec strict even though the current state of things in Go implementation is a bit different. In my opinion, it's a good decision without negative consequences. I will update this PR.

Thanks all for contributing to the discussion!