envoyproxy / ratelimit

Go/gRPC service designed to enable generic rate limit scenarios from different types of applications.
Apache License 2.0

Post-request Reporting of `hits_addend` #636

Open austince opened 1 week ago

austince commented 1 week ago

When implementing general usage metering and quota limiting for I/O applications (like databases), the cost of each request is not known ahead of time. Adding the ability to report a request's usage (`hits_addend`) after the request has been processed would allow more flexible rate-limiting use cases such as these.

I don't know if it would make sense to modify the RLS v3 API, but we could essentially extend it in this service with:

service RateLimitService {
  // ... other methods

  // Report Usage
  rpc ReportUsage(RateLimitRequest) returns (RateLimitResponse) {
  }
}

Request -> Server
Server -> Rate Limit Service (ShouldRateLimit, hits_addend = 0)
Server -> DB (calculate cost)
Server -> Rate Limit Service (ReportUsage, hits_addend = cost)
Server -> Response based on RateLimitResponse

Using the same Request and Response objects would also let the caller choose between enforcing the limit consistently (waiting for the ReportUsage result before responding) and trading consistency for performance by reporting the usage asynchronously, after the response has been sent.

If something like a standard header for specifying hits (https://github.com/envoyproxy/envoy/pull/34184) gets added to Envoy, we could also consider upstreaming this to the RLS API and supporting the same header on responses for reporting cost.

austince commented 1 week ago

Actually, I don't think we need another method for this besides ShouldRateLimit(..), as long as it can be called after the request is processed. That would have to be an Envoy proposal, though. I'll take it there if this idea makes sense, pending feedback.

Request -> Envoy
Envoy -> Rate Limit Service (ShouldRateLimit, hits_addend = 0)
Envoy -> Backend (reports hits_addend in a response header)
Envoy -> Rate Limit Service (ShouldRateLimit, hits_addend = cost) [optionally sent after the response]
Envoy -> Response

If this is all done in Envoy, there may be some kind of "local acceptable margin of error/overage" we could encode as well 🤔
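One possible reading of that "local margin" idea, sketched in Go purely as an assumption: accumulate the reported cost locally and only forward it to the rate limit service once the unreported amount reaches a configured margin. The `localBudget` type and its fields are hypothetical, and a real implementation inside Envoy could look quite different.

```go
package main

import "fmt"

// localBudget accumulates usage locally and forwards it only once the
// unreported amount reaches margin. It is not safe for concurrent use;
// a real filter would need synchronization.
type localBudget struct {
	margin  uint32
	pending uint32
	report  func(hitsAddend uint32) // e.g. a ShouldRateLimit call with hits_addend set
}

// add records cost locally and flushes when the margin is reached.
func (b *localBudget) add(cost uint32) {
	b.pending += cost
	if b.pending >= b.margin {
		b.report(b.pending)
		b.pending = 0
	}
}

func main() {
	var sent []uint32
	b := &localBudget{margin: 10, report: func(n uint32) { sent = append(sent, n) }}
	for _, c := range []uint32{3, 4, 5, 2} {
		b.add(c)
	}
	fmt.Println(sent, b.pending) // [12] 2
}
```

The margin bounds how much usage a single instance can hold back, so the worst-case overage grows with the number of instances reporting against the same limit.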