Post-request Reporting of `hits_addend`

When implementing general quota usage metering and limiting for I/O applications (like databases), the cost of each request is not known ahead of time. The ability to report a request's usage (`hits_addend`) after the request has been processed would enable more flexible rate-limiting use cases such as this one.

I don't know if it would make sense to modify the RLS v3 API, but we could essentially extend it in this service with:

```proto
service RateLimitService {
  // ... other methods

  // Report usage after the request has been processed.
  rpc ReportUsage(RateLimitRequest) returns (RateLimitResponse) {
  }
}
```

1. Request -> Server
2. Server -> Rate Limit Service (`ShouldRateLimit`, `hits_addend = 0`)
3. Server -> DB (calculate cost)
4. Server -> Rate Limit Service (`ReportUsage`, `hits_addend = cost`)
5. Server -> Response based on `RateLimitResponse`
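To make the flow above concrete, here is a minimal in-memory sketch. Everything in it is hypothetical and for illustration only: `InMemoryRateLimitService`, `handle_request`, and the `run_query` callback are not part of the ratelimit service or the RLS API, and it assumes a service that treats `hits_addend = 0` as a pure check that consumes no quota.

```python
from dataclasses import dataclass, field

OK = "OK"
OVER_LIMIT = "OVER_LIMIT"


@dataclass
class InMemoryRateLimitService:
    """Hypothetical stand-in for the rate limit service: one fixed quota per key."""

    limit: int
    used: dict = field(default_factory=dict)

    def _apply(self, key, hits_addend):
        # Record the hits, then report whether the key is now past its quota.
        total = self.used.get(key, 0) + hits_addend
        self.used[key] = total
        return OVER_LIMIT if total > self.limit else OK

    def should_rate_limit(self, key, hits_addend=1):
        return self._apply(key, hits_addend)

    def report_usage(self, key, hits_addend):
        # Proposed method: same request/response shape as should_rate_limit.
        return self._apply(key, hits_addend)


def handle_request(limiter, key, run_query):
    """Pre-check, do the work, report the measured cost, then respond."""
    # Pre-check with hits_addend = 0, because the cost is not known yet.
    if limiter.should_rate_limit(key, hits_addend=0) == OVER_LIMIT:
        return 429, None
    # Run the query; run_query is assumed to return (result, cost).
    result, cost = run_query()
    # Report the real cost and base the response on the limiter's answer.
    if limiter.report_usage(key, hits_addend=cost) == OVER_LIMIT:
        return 429, None
    return 200, result
```

With `limit=10` and queries that each cost 6, the first request succeeds, the second is rejected once its cost is reported, and later requests are rejected at the pre-check without touching the database.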

Using the same request and response objects would also let the caller choose whether to enforce the limit consistently (wait for the `ReportUsage` result before responding) or to trade consistency for performance and report the usage asynchronously, after the response has been sent.
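The two options could look roughly like the sketch below. The names `CountingLimiter`, `respond_consistent`, and `respond_async` are made up for this example, and a thread pool stands in for whatever async mechanism the caller actually uses.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CountingLimiter:
    """Hypothetical limiter with a single shared quota."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0
        self._lock = threading.Lock()

    def report_usage(self, hits_addend):
        # Add the reported cost and say whether the quota is now exceeded.
        with self._lock:
            self.used += hits_addend
            return "OVER_LIMIT" if self.used > self.limit else "OK"


def respond_consistent(limiter, cost, body):
    # Wait for the limiter's answer, so this very request can be rejected.
    if limiter.report_usage(cost) == "OVER_LIMIT":
        return 429, None
    return 200, body


def respond_async(limiter, cost, body, executor):
    # Respond immediately and report in the background; an overage only
    # affects later requests, not this one.
    future = executor.submit(limiter.report_usage, cost)
    return (200, body), future
```

The consistent variant adds one round trip to the rate limit service before every response. The async variant hides that latency but lets a single expensive request overshoot the limit.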

If something like a standard header for specifying hits (https://github.com/envoyproxy/envoy/pull/34184) gets added to Envoy, we could also consider upstreaming this to the RLS API and supporting the same header on responses for reporting cost.

envoyproxy/ratelimit, issue #636