Closed: edsko closed this 7 months ago
@kazu-yamamoto , you asked a difficult question :sweat_smile:
If you think 4 is too small, you can increase the value.
I didn't change the default as I wasn't sure what to change it to, and I wasn't comfortable suggesting something. But since you asked, I spent some time trying to see what other people do. Unfortunately, it is surprisingly difficult to find concrete guidelines about this. Plenty of sites tell you "one way to mitigate a ping flood attack is to impose rate limits", but what those rate limits should be, nobody seems willing to say.
A summary of what I found:
It seems most applications impose rate limits in a different manner than `http2` currently does. Some examples:

1. Limit on the maximum number of control frames per connection (rather than per second). Examples: `max_http2_control_frames_per_connection`, which seems to have been adopted verbatim also by some other suppliers (for example, VMWare). Defaults to 1,000. nginx hardcodes this limit to a maximum of 10,000 (patch that introduces the limit, location in the source code).
2. Compare overall traffic to traffic on the individual streams. nginx: "as long as total traffic is many times larger than stream traffic, we consider this to be a flood" (patch).
3. Limit outbound control packets (the response to the ping, as opposed to the ping itself).
4. Impose a minimum time between control packets.
5. Most relevant for my own needs, since I am working on a gRPC library: the gRPC Keepalive guide defines a server parameter `PERMIT_KEEPALIVE_TIME`, which sets the "minimum allowed time between a server receiving successive ping frames without sending any data/header frame" and defaults to 5 minutes -- so this is some hybrid of (2) and (4).
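To make option (5) concrete, here is a minimal sketch of the check it implies. All names and the state representation here are invented for illustration; this is not `http2`'s actual API, and time is passed in explicitly to keep the function pure:

```haskell
-- Sketch of option (5): a ping is permitted if a data/header frame
-- was sent since the previous ping, or if enough time has elapsed.
-- All names are hypothetical; this is not http2's actual API.

-- | Minimum allowed time (in seconds) between pings with no
-- intervening data/header frame; gRPC's PERMIT_KEEPALIVE_TIME
-- defaults to 5 minutes.
permitKeepaliveTime :: Double
permitKeepaliveTime = 300

data PingState = PingState
  { lastPingAt  :: Double  -- ^ arrival time of the previous ping
  , framesSince :: Bool    -- ^ data/header frame sent since then?
  }

-- | Decide whether a ping arriving at time @now@ is acceptable,
-- returning the updated state.
allowPing :: Double -> PingState -> (Bool, PingState)
allowPing now st
  | framesSince st || now - lastPingAt st >= permitKeepaliveTime =
      (True, PingState { lastPingAt = now, framesSince = False })
  | otherwise = (False, st)
```

A server using a scheme like this would set `framesSince` whenever it sends a data or headers frame, and treat a rejected ping as a flood (for example, by closing the connection).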
For actual rate limiting, I only found two suggestions. Not HTTP/2 specific, but Hillstone imposes a rate limit on all ICMP packets, defaulting to a rather liberal 1500/second.

I suspect that the reason I am seeing more pings than the 4/second that `http2` currently allows is somehow related to the rate of requests (for example -- speculating -- perhaps some logic is a bit broken, and new requests can cause a new keepalive ping to be sent irrespective of when the previous one was sent). If true, that would mean that any specific value of pings/second is potentially problematic. I'd have to start digging through the golang source of the server to know for sure, but it at least feels plausible: a client "broken" in this way would still work with the recommendation from the guide on rate limiting (option (5), above), even if it doesn't quite explain everything, at least not when compared to the guidelines (but that is no surprise in the gRPC ecosystem, unfortunately); it certainly doesn't seem to match the documented behaviour of the golang client :shrug:
I'm not sure what to suggest. Quite frankly, I made the ping rate limit a parameter in my library too, defaulting to 10/second for now, but of course that is basically passing the buck further down the line to the application :grin: I agree with you that sensible defaults would be better.
Option (5) from the gRPC guidelines -- that is, imposing a minimum time between successive ping frames without sending data/header frames -- does seem quite sensible, not just for gRPC but also for other applications. If you want, I can try to implement that and see if it works for me, although of course that would be a bigger change to `http2` than the current PR.
Alternatively, I could imagine some kind of pluggable "rate limiting API" which could be instantiated to any of the 5 options above; I'd have to spend some time thinking about what such an API should look like.
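For what it's worth, such a pluggable API might be as small as a record of hooks that the server invokes; none of these names exist in `http2`, this is purely a hypothetical sketch, shown here with an instantiation of option (1):

```haskell
import Data.IORef

-- Hypothetical pluggable rate-limiting API: the server calls these
-- hooks and treats a False result from 'onControlFrame' as a flood.
-- Each of the five strategies above would become one instantiation.
data RateLimiter = RateLimiter
  { onControlFrame :: IO Bool  -- ^ ping/settings received; False = flood
  , onDataFrame    :: IO ()    -- ^ data/header frame sent (for hybrid schemes)
  }

-- | Option (1): cap the total number of control frames per connection.
maxControlFramesPerConnection :: Int -> IO RateLimiter
maxControlFramesPerConnection limit = do
  ref <- newIORef (0 :: Int)
  pure RateLimiter
    { onControlFrame = atomicModifyIORef' ref (\n -> (n + 1, n + 1 <= limit))
    , onDataFrame    = pure ()
    }
```

The other options would plug in the same way: option (5), for instance, would use `onDataFrame` to record that a data/header frame was sent since the last ping.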
Let me know what you'd like me to do.
@edsko Thank you for this research!
Perhaps it is easy for many programmers to measure the time between packets and to count occurrences of a specific frame, but it is hard to implement rate limiting (number of packets per second). It took me some days to hit upon an idea for how to implement this rate limiting. Yes, it looks very easy if you read the code, but it would be hard to implement without reference code.
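For readers following along: a "packets per second" limit can be approximated with a fixed one-second window, counting events and starting a new window once a second has elapsed. A simplified sketch of that idea (not the actual code in `http2`; time is passed in explicitly to keep it pure):

```haskell
-- Fixed-window sketch of "at most N pings per second": count events
-- and reset the counter once a second has elapsed. Illustrative only;
-- not the actual implementation in http2.
data Rate = Rate
  { count       :: Int     -- ^ events seen in the current window
  , windowStart :: Double  -- ^ when the current window began (seconds)
  }

-- | Record one event at time @now@, returning the number of events in
-- the current one-second window together with the updated state.
bump :: Double -> Rate -> (Int, Rate)
bump now (Rate n start)
  | now - start >= 1 = (1, Rate 1 now)
  | otherwise        = (n + 1, Rate (n + 1) start)
```

A server would compare the returned count against the configured limit (10/second by default after this PR) and treat anything above it as a flood.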
I'm not sure it's worth exploring other approaches. I would suggest simply increasing the value from 4 to 10, keeping the configurable scheme.
Ok, I've discussed this with my client and we're okay with this approach. I've force-pushed a commit that changes the default to 10 (and still keeps it configurable).
I also changed the test suite to make the ping protection test a bit more aggressive, so that it still exceeds the limit.
Merged. Thank you for your contributions! A new version has been released.
Thanks @kazu-yamamoto !
I'm dealing with some gRPC servers (written in golang) that routinely exceed the limit of 4 pings/sec (increasing the limit to 10 seems to do the trick). This PR makes the ping rate limit configurable.