memorysafety / river

This repository is the home of the River reverse proxy application, based on the pingora library from Cloudflare.
https://www.memorysafety.org/initiative/reverse-proxy/
Apache License 2.0

Notes for design: Rate Limiting #59

Closed jamesmunns closed 3 weeks ago

jamesmunns commented 1 month ago

As part of the current milestone, we'd like to add basic rate limiting options.

https://blog.nginx.org/blog/rate-limiting-nginx is a good basic read on how NGINX does this.

There are a couple of concepts we need to define:

Unlike NGINX, I don't currently think we need to consider the "zone" or "scope" of the rules. I currently intend for rate limiting to be per-service, which means that the "zone" is essentially the tuple of (service, key).

I have checked with @eaufavor, and the correct way to handle delayed requests in pingora is to have the task yield (for a timer, or activation queue, etc).
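To make "have the task yield" concrete, here is a minimal std-only sketch (hypothetical names, not the pingora or leaky_bucket API) of a bucket that either grants a token immediately or reports how long the task should yield on a timer before retrying:

```rust
use std::time::Duration;

// Hypothetical token bucket: `tokens` are available now, refilled at a
// rate of one token per `refill_interval`. Illustrative only.
struct Bucket {
    tokens: u64,
    refill_interval: Duration,
}

// What the proxy task should do with the current request.
enum Decision {
    Allow,           // a token was available; proceed immediately
    Delay(Duration), // yield (e.g. on a timer) for this long, then retry
}

impl Bucket {
    fn try_acquire(&mut self) -> Decision {
        if self.tokens > 0 {
            self.tokens -= 1;
            Decision::Allow
        } else {
            // No token left: the task should yield rather than block.
            Decision::Delay(self.refill_interval)
        }
    }
}

fn main() {
    let mut b = Bucket { tokens: 1, refill_interval: Duration::from_millis(100) };
    assert!(matches!(b.try_acquire(), Decision::Allow));
    // Bucket drained: the caller is told to yield until the next refill.
    assert!(matches!(b.try_acquire(), Decision::Delay(d) if d == Duration::from_millis(100)));
    println!("ok");
}
```

The key point is that the delayed path returns control to the runtime instead of blocking a worker thread.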

git001 commented 1 month ago

Maybe you are also interested in looking into Introduction to Traffic Shaping Using HAProxy, which covers features similar to NGINX's.

Traffic shaping has been available since HAProxy 2.7: https://www.haproxy.com/blog/announcing-haproxy-2-7

jamesmunns commented 1 month ago

@git001 thanks for the pointer, I'll check it out!

jamesmunns commented 1 month ago

An additional detail that was surfaced during discussion was that there are two main "orientations" for rate limiting:

This could maybe be implemented in a way that is transparent: if we use the proposed formatting key for the matching, each rule could be matched against independently. For example, if we have three "rules":

In this case, I think that all connections would need to obtain a token from ALL THREE to proceed.

So let's say that downstream 10.0.0.1 requests /api/whatever, and upstream 192.0.0.1 is selected as the upstream.

For each of these keys, we'd need to:
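As a hedged illustration (the actual formatting-key syntax is not pinned down in this thread), the three keys for that example request might be derived like this, assuming the three rules are keyed on the downstream address, the request path, and the upstream address:

```rust
// Hypothetical key derivation for the three example rules. The real
// formatting-key syntax is still an open design question.
fn keys_for_request(downstream: &str, path: &str, upstream: &str) -> Vec<String> {
    vec![
        format!("downstream:{downstream}"),
        format!("path:{path}"),
        format!("upstream:{upstream}"),
    ]
}

fn main() {
    let keys = keys_for_request("10.0.0.1", "/api/whatever", "192.0.0.1");
    assert_eq!(keys.len(), 3);
    assert_eq!(keys[0], "downstream:10.0.0.1");
    assert_eq!(keys[2], "upstream:192.0.0.1");
    // The request must obtain a token from the limiter behind each key.
    println!("{keys:?}");
}
```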

My initial thought for this is to use something like the leaky_bucket crate, and place the rate limiting table in something like:

// things that can be rate limiting keys
enum Key {
    Ipv4(Ipv4Addr),
    Format(String),
    ...
}

struct LimiterPayload {
    // Some kind of record of last use? For culling?
    last_used: AtomicU64,
    // The actual rate limiter
    limiter: RateLimiter,
}

struct Context {
    limiter_map: RwLock<BTreeMap<Key, Arc<LimiterPayload>>>,
}
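A hedged sketch of a get_or_create over that map, using only std (names follow the sketch above, with a string key for simplicity). std's RwLock cannot upgrade a read lock to a write lock, so on a miss we drop the read guard and re-check the entry under the write lock:

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

// Stand-ins for the types sketched above; the real payload would hold
// the rate limiter itself.
#[derive(PartialEq, Eq, PartialOrd, Ord)]
enum Key {
    Format(String),
}

#[derive(Default)]
struct LimiterPayload;

struct Context {
    limiter_map: RwLock<BTreeMap<Key, Arc<LimiterPayload>>>,
}

impl Context {
    fn new() -> Self {
        Context { limiter_map: RwLock::new(BTreeMap::new()) }
    }

    // Optimistic read lock first; only take the write lock on a miss.
    // The entry is re-checked under the write lock because another task
    // may have inserted it between the two lock acquisitions.
    fn get_or_create(&self, key: Key) -> Arc<LimiterPayload> {
        if let Some(existing) = self.limiter_map.read().unwrap().get(&key) {
            return existing.clone();
        }
        let mut map = self.limiter_map.write().unwrap();
        map.entry(key)
            .or_insert_with(|| Arc::new(LimiterPayload::default()))
            .clone()
    }
}

fn main() {
    let ctx = Context::new();
    let a = ctx.get_or_create(Key::Format("10.0.0.1".into()));
    let b = ctx.get_or_create(Key::Format("10.0.0.1".into()));
    // Both lookups must return the same shared limiter instance.
    assert!(Arc::ptr_eq(&a, &b));
    assert_eq!(ctx.limiter_map.read().unwrap().len(), 1);
    println!("ok");
}
```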

The behavior would have to be something like:

let mut limiters = vec![];
for rule in self.rules.iter() {
    // The key depends on the rule's format as well as the request
    let key = rule.generate_key(&request_context);
    // This probably could optimistically get a read lock, then upgrade to a write lock
    // if the key does not exist
    let limiter = service_context.get_or_create(key);
    limiters.push(limiter);
}

let mut pending_tokens = vec![];
for limiter in limiters {
    match limiter.get_token() {
        Ok(None) => {}, // immediately allowed
        Ok(Some(fut)) => {
            // Not immediately allowed
            pending_tokens.push(fut);
        }
        Err(e) => {
            // too full, immediately return error
            return Err(400);
        }
    }
}

// futures unordered or something
futures::future::join_all(pending_tokens).await;

Ok(...)

But this is immediately problematic:

It also feels like a lot of locking and allocation. This is probably unavoidable, but it makes me a little nervous about this being a DoS vector.
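On the culling question raised in the sketch above (the last_used field in LimiterPayload): a periodic sweep under the write lock could bound the table size, at the cost of briefly pausing readers. A std-only sketch with hypothetical names and a string key:

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Mirrors the LimiterPayload sketch above; `last_used` holds seconds
// since some fixed epoch, updated on every token grant.
struct LimiterPayload {
    last_used: AtomicU64,
}

// Periodic sweep: drop any limiter idle for more than `max_idle_secs`.
// In River this would run while holding the RwLock's write guard.
fn cull(
    map: &mut BTreeMap<String, Arc<LimiterPayload>>,
    now_secs: u64,
    max_idle_secs: u64,
) {
    map.retain(|_key, payload| {
        now_secs.saturating_sub(payload.last_used.load(Ordering::Relaxed)) <= max_idle_secs
    });
}

fn main() {
    let mut map = BTreeMap::new();
    map.insert("fresh".to_string(), Arc::new(LimiterPayload { last_used: AtomicU64::new(95) }));
    map.insert("stale".to_string(), Arc::new(LimiterPayload { last_used: AtomicU64::new(10) }));
    cull(&mut map, 100, 30); // at t=100s, "stale" is 90s idle, "fresh" only 5s
    assert!(map.contains_key("fresh"));
    assert!(!map.contains_key("stale"));
    println!("ok");
}
```

A sweep like this does not remove the allocation churn concern, but it keeps the map from growing without bound under a keyspace-exhaustion attack.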

jamesmunns commented 3 weeks ago

Closing this as completed in #67