status-im / nim-chronos

Chronos - An efficient library for asynchronous programming
https://status-im.github.io/nim-chronos/
Apache License 2.0

Rate limiter #263

Closed - Menduist closed 2 years ago

Menduist commented 2 years ago

Adds a generic rate limiter:

var counter = RateCounter.init(cap=1.seconds)
echo counter.tryConsume(10.operationsPerSecond) # succeeds
await counter.consume(operationsPerSecond(0.1)) # Will block for ~9 seconds
mratsim commented 2 years ago

In Nimbus, for example, we'll likely want to switch to having two of these buckets: a smaller bursty one and a global, more "even" one - when other clients ask for blocks, we want each peer to get an allocation, but also to put a global limit on how much bandwidth we assign to providing blocks.
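A minimal sketch of that layered setup, reusing the RateCounter / tokensPerSecond API proposed in this PR (the rates, the per-peer bucket handling and serveBlocks are made up for illustration):

# hypothetical two-bucket setup on top of this PR's proposed API
let globalBucket = RateCounter.init(cap = 1.seconds)  # smooth, global budget

proc serveBlocks(peerBucket: RateCounter, bytes: int) {.async.} =
  # each peer gets its own small, bursty allocation (one bucket per peer)...
  await peerBucket.consume(tokensPerSecond(bytes, 100_000))     # 100 KB/s per peer
  # ...but the request must also fit within the global bandwidth budget
  await globalBucket.consume(tokensPerSecond(bytes, 1_000_000)) # 1 MB/s total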

That said, what the "tokens" are is arbitrary - it's usually bytes, but sometimes packets or requests - there are arguments for each, depending on what's costly.

In terms of modules, I think for now we can focus on discovery/libp2p/Nimbus. If we start adding bandwidth limits, we need metrics on bandwidth and messages sent by each module so that it's easier to understand the steady state vs an under-attack scenario.

arnetheduck commented 2 years ago

> If we start adding bandwidth limits, we need metrics on bandwidth and messages sent by each module so that it's easier to understand the steady state vs an under-attack scenario.

sure - a bandwidth limit though is nothing other than using tokens == bytes - I'm merely outlining the future use cases here so that the chronos support works for as many as possible.

arnetheduck commented 2 years ago

> await bucket.consume(tokensPerSecond(2000, 1000))

Well, this reads a bit wrong: you're not "consuming tokens per second" - you're consuming tokens once - the per-second part makes the API unintuitive to read.

> await bucket.consume(operationsPerSecond(10))

This one is even more problematic: operations can't be "consumed" really - it's the tokens that get consumed in order to let the operations proceed.

> share the bucket between multiple applications using different systems

Why would you want to do that? These systems would either need to know about each other so as to use the same "scale" for their token consumption (which is arbitrary), in which case customizing the scale per consumer is not needed - or, if they don't know about each other, their scales will be different, and then they can't meaningfully share a limiter, afaict?

Menduist commented 2 years ago

> if they don't know about each other, their scales will be different, and then they can't meaningfully share a limiter, afaict?

I think they can, because you can always express a cost as a percentage of a budget, so you can scale different costs similarly. My current wording is maybe not clear enough; maybe it's clearer this way:

let gossipSubCost = operationsPerSecond(10) # Allow 10 OPS for gossipsub
let pingCost = operationsPerSecond(30) # Allow 30 OPS for ping
proc syncCost(bytesUsed: int): auto = tokensPerSecond(bytesUsed, 1_000_000) # Allow 1_000_000 bytes / second

You can now share the same bucket:

let bucket = RateCounter.init(cap=1.seconds)

# we received a GS connection
await bucket.consume(gossipSubCost)

# we received a ping
await bucket.consume(pingCost)

# we received a sync request
await bucket.consume(syncCost(response.len))
arnetheduck commented 2 years ago

> cost as a percent of a budget

"percent" is a scale too :) the rate limiter could prescribe that rate budgets have to be expressed in percentages, but that doesn't make it any more meaningful or "shareable": a bytes-per-second rate and a cpu-ticks rate or an i/o disk rate have nothing in common, and expressing them as "fractions" only complicates usage of the rate limiter - I think this is generally why the classic literature on the topic expresses rate limiting as generic tokens being "produced" at a given rate, with the bucket being able to hold up to a certain amount and consumers being able to take tokens out of it according to those parameters - it's a useful analogy that intuitively covers both bursts and overall average rate making it easy to reason about, and it is this analogy that I think is valuable to preserve in the wording of the API usage.

The "burstiness" in particular is important to tune for something like eth2 - we want consumers to generally be able to grab a few epochs of data without constraints, but once they hit that limit we want to go back to fairness overall.

Also, I think it's common to run multiple overlapping rate limiters for a single resource: you typically want a per-connection limit as well as a group and/or global limit. For bandwidth, for example, we want to allow a total of 100 blocks/sec to be uploaded to peers (to limit the total bandwidth consumed by the req/resp system), but we also want to limit individual peers so that no single one monopolises that pool.

I can't think of many use cases where I would want to share rate limiters, or why this should be a primary use case driving the API design - what would be common examples of such sharing?

The discussion is a little bit analogous to peer scoring: there doesn't exist a globally meaningful measure of "score" for peers - rather, the score encodes domain-specific knowledge and while it would be possible to constrain it to a range, doing so would make it less intuitive / expressive and harder to reason about while not providing much benefit.

Menduist commented 2 years ago

Closed in favor of #279 (a regular token bucket)
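A short usage sketch of the regular token bucket that landed instead, assuming the TokenBucket API from chronos/ratelimit (TokenBucket.new, tryConsume and consume - check the module for the current signatures):

import chronos, chronos/ratelimit

proc demo() {.async.} =
  # replenishes 1000 tokens per second; bursts up to the full 1000
  let bucket = TokenBucket.new(1000, 1.seconds)
  if bucket.tryConsume(100):  # non-blocking: false if the budget is exhausted
    echo "served immediately"
  await bucket.consume(500)   # async: waits until enough tokens have accumulated

waitFor demo()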