Closed jamesmunns closed 3 weeks ago
You may also be interested in looking into Introduction to Traffic Shaping Using HAProxy, which has similar features to nginx.
Traffic shaping is available since https://www.haproxy.com/blog/announcing-haproxy-2-7
@git001 thanks for the pointer, I'll check it out!
An additional detail that surfaced during discussion is that there are two main "orientations" for rate limiting: limiting based on the downstream source of a request, and limiting based on the upstream destination it is routed to.
This could maybe be implemented in a way that is transparent: if we use the proposed formatting key for the matching, each rule could be matched against independently. For example, if we have three "rules":

- `src` -> the source IP of the downstream requestor
- `uri` -> the request URI path
- `dst` -> the selected upstream path

In this case, I think that all connections would need to obtain a token from ALL THREE to proceed.
So let's say that downstream `10.0.0.1` requests `/api/whatever`, and `192.0.0.1` is selected as the upstream.
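As a sketch of what key generation for those three rules might look like (all type and field names here are hypothetical, not settled API):

```rust
use std::net::Ipv4Addr;

// Hypothetical key type mirroring the enum sketched later in this thread.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Key {
    Ipv4(Ipv4Addr),
    Format(String),
}

// Hypothetical per-request context; field names are assumptions.
struct RequestContext {
    src_ip: Ipv4Addr,
    uri_path: String,
    upstream_ip: Ipv4Addr,
}

// One key per rule: src, uri, and dst. A token must be obtained from the
// limiter behind each of these keys for the request to proceed.
fn generate_keys(ctx: &RequestContext) -> Vec<Key> {
    vec![
        Key::Ipv4(ctx.src_ip),             // src rule
        Key::Format(ctx.uri_path.clone()), // uri rule
        Key::Ipv4(ctx.upstream_ip),        // dst rule
    ]
}
```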
For each of these keys, we'd need to look up (or create) the corresponding limiter, then attempt to obtain a token from it.
My initial thought for this is to use something like the `leaky_bucket` crate, and place the rate limiting table in something like:
```rust
// Things that can be rate limiting keys
enum Key {
    Ipv4(Ipv4Addr),
    Format(String),
    // ...
}

struct LimiterPayload {
    // Some kind of record of last use? For culling?
    last_used: AtomicU64,
    // The actual rate limiter
    limiter: RateLimiter,
}

struct Context {
    limiter_map: RwLock<BTreeMap<Key, Arc<LimiterPayload>>>,
}
```
The behavior would have to be something like:
```rust
let mut limiters = vec![];
for rule in self.rules.iter() {
    let key = rule.generate_key(&request_context);
    // This probably could optimistically get a read lock, then upgrade
    // to a write lock if the key does not exist
    let limiter = service_context.get_or_create(key);
    limiters.push(limiter);
}

let mut pending_tokens = vec![];
for limiter in limiters {
    match limiter.get_token() {
        Ok(None) => {} // immediately allowed
        Ok(Some(fut)) => {
            // Not immediately allowed
            pending_tokens.push(fut);
        }
        Err(_e) => {
            // too full, immediately return error
            return Err(400);
        }
    }
}

// futures unordered or something
pending_tokens.join_all().await;
Ok(...)
```
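The "optimistically get a read lock, then upgrade" step could be sketched roughly like this (the payload type is a placeholder here, and the key is simplified to a `String` for illustration):

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

// Placeholder for the LimiterPayload sketched above.
#[derive(Default)]
struct LimiterPayload;

#[derive(Default)]
struct Context {
    limiter_map: RwLock<BTreeMap<String, Arc<LimiterPayload>>>,
}

impl Context {
    // Take a cheap read lock first; fall back to the write lock only
    // when the key is missing.
    fn get_or_create(&self, key: &str) -> Arc<LimiterPayload> {
        if let Some(existing) = self.limiter_map.read().unwrap().get(key) {
            return existing.clone();
        }
        let mut map = self.limiter_map.write().unwrap();
        // Another task may have inserted the key between dropping the
        // read lock and taking the write lock, so use entry() here.
        map.entry(key.to_string())
            .or_insert_with(|| Arc::new(LimiterPayload::default()))
            .clone()
    }
}
```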
But this is immediately problematic:

- The `leaky_bucket` crate has no concept of "immediate fail" - it always enqueues all requests. This could be okay, but it's not what I wrote above.
- It also feels like a lot of locking and allocation. This is probably unavoidable, but it makes me a little nervous about being a DoS vector.
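For reference, the "immediate fail" semantics I have in mind would look something like this hand-rolled token bucket (this is NOT the `leaky_bucket` API; the names and the refill policy here are assumptions for illustration):

```rust
// A bucket that fails fast instead of queueing the caller.
struct FailFastBucket {
    tokens: u64,
    capacity: u64,
}

impl FailFastBucket {
    fn new(capacity: u64) -> Self {
        Self { tokens: capacity, capacity }
    }

    // Ok(()) if a token was available, Err(()) otherwise - the caller
    // can map Err to an immediate error response instead of queueing.
    fn try_acquire(&mut self) -> Result<(), ()> {
        if self.tokens > 0 {
            self.tokens -= 1;
            Ok(())
        } else {
            Err(())
        }
    }

    // Called periodically (e.g. from a timer task) to refill up to capacity.
    fn refill(&mut self, n: u64) {
        self.tokens = (self.tokens + n).min(self.capacity);
    }
}
```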
Closing this as completed in #67
As part of the current milestone, we'd like to add basic rate limiting options.
https://blog.nginx.org/blog/rate-limiting-nginx is a good basic read on how NGINX does this.
There's a couple of concepts we need to define:
- A key, e.g. `$src_ip/$uri` or something
- The `rate`, `burst`, and `delay` terms from nginx!

Unlike NGINX, I don't currently think we need to consider the "zone" or "scope" of the rules - I currently intend for rate limiting to be per-service, which means that the "zone" is essentially the tuple of `(service, key)`.
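A rough sketch of what keying a shared table by that `(service, key)` tuple could look like (names are hypothetical, and a plain counter stands in where a real limiter would live):

```rust
use std::collections::BTreeMap;

type ServiceId = String;
type ZoneKey = String; // stands in for a richer key type

#[derive(Default)]
struct Zones {
    // Placeholder u64 counter in place of a real limiter. Keying by the
    // (service, key) tuple keeps two services' limits for the same key
    // fully independent.
    map: BTreeMap<(ServiceId, ZoneKey), u64>,
}

impl Zones {
    fn hit(&mut self, service: &str, key: &str) -> u64 {
        let count = self
            .map
            .entry((service.to_string(), key.to_string()))
            .or_insert(0);
        *count += 1;
        *count
    }
}
```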
I have checked with @eaufavor, and the correct way to handle delayed requests in pingora is to have the task yield (for a timer, or activation queue, etc).