RateLimitPolicy v2 - Githubissues

maleck13 commented 1 year ago

@alexsnaps @guicassolato one concern I have is that (assuming we want to stick with policy attachment) adding that to the API, even if we went only with default for now would be another new version of the API. Are we content to accept that as where we are?

guicassolato commented 1 year ago

From the design it is assumed (maybe I am wrong) that the rate limiting service (Limitador) will always be called. I miss some way to specify when I do not want to call Limitador. It is usually based on HTTP method, HTTP path and domain name.

That would be the union of all the selectors. For every selector in the policy, we add a "configuration rule" (how it was previously called). IOW, if you have a selector based on a value (e.g. the HTTP method) and that value is present in the proxy before it triggers the RL filter (always the case for the HTTP method), then the proxy will set a descriptor with the value. This is a way to avoid setting rules to trigger the RL filter based on values that are never actually used in the policy, at the same time it validates that what's used in the policy makes sense in the context.

guicassolato commented 1 year ago

Envoy RLS proto allows hits_addend field to specify the "weight" (I would call it hits_addend) https://www.envoyproxy.io/docs/envoy/latest/api-v3/service/ratelimit/v3/rls.proto.html. That can be implemented in the wasm-shim 👍

I did not know about that. Thanks for pointing out!

maleck13 commented 1 year ago

I prefer C with second choice A

guicassolato commented 1 year ago

I miss some way to specify when I do not want to call Limitador. It is usually based on HTTP method, HTTP path and domain name.

At least in my head that would be a separate HTTPRoute without a RLP attached.

guicassolato commented 1 year ago

How would be the spec for the following use case? I want rate limiting of X request per minute (or whatever duration) if the request has the header "X-TO-BE-RL", no matter what is the value of the header?. Or the opposite, no rate limit if the request has the header "X-NOT-TO-BE-RL", no matter what is the value of the header.

It looks like another operator of a condition selector to me, @eguzki.


conditions:
  - selector: context.request.http.headers.x-to-br-rl
    operator: exists

alexsnaps commented 1 year ago

There is something about that whole identity thing that… puzzles me, i.e. HTTPRouteRule _of a HTTPRoute is I think really what we're targeting with a limit I think, but this leverages HTTPRouteMatch…

spec.limits.increment! Wouldn't that break with the hashing (i.e. identifying the limit) based of the match, rather than the rule? You wouldn't be able to implement it so the delta for GET is 1 while for POST it'd be 5 on the same limit and as such counters would you?

guicassolato commented 1 year ago

spec.limits.increment! Wouldn't that break with the hashing (i.e. identifying the limit) based of the match, rather than the rule? You wouldn't be able to implement it so the delta for GET is 1 while for POST it'd be 5 on the same limit and as such counters would you?

@eguzki asked me the same question today.

I think it's a different use case, one that I did not think of. My idea for increment was a way to control the rate without affecting the counter. The increment does not (and should not!) be part of the limit's identity.

guicassolato commented 1 year ago

The hashing of the HTTPRouteMatch needs to account for the RLP itself… otherwise there might be cases where two RLPs are the "same" limit as far as limitador is concerned.

The hashing is not an identifier of the limit; it's an identifier of the set of HTTPRouteMatches that should trigger that limit.

alexsnaps commented 1 year ago

@guicassolato see this here and more specifically this bit of code wrt "identity" of Limits in limitador.

guicassolato commented 1 year ago

This has been discussed in the Kuadrant tech call yesterday, but I'll leave here a comment for the record. It's about targeting individual HTTPRouteMatches vs targeting a whole HTTPRouteRule.

I think only in a world of 1:1 relation between matching rules (HTTPRouteMatch) and backends (backendRef) we'd be able enforce targeting atomic HTTPRouteRules. Unfortunately I don't think that's how people naturally write routing rules. (Maybe it's me?) I think people tend to group matching rules by backend.

Maybe it's a matter of educating people, or establishing a new "best practice"? By declaring highly granular HTTPRouteRules, to the point of 1 HTTPRouteRule == 1 HTTPRouteMatch, it might be more verbose, but it gives more flexibility, less coupling.

Coupling is also a very important aspect at play here. People should not have to write HTTPRoutes based (only or ever) on how they will then define their RLPs. Otherwise one thing is coupled to the other. Plus, add KAP (and other types of policies) here and an organisation of an HTTPRoute once thought for RL is now broken.

Besides, the RLP API must allow one to write limits that target multiple HTTPRouteRules as a way to share a counter across different backends. (Or perhaps there are other ways?)

Add there two requirements together, i.e. how people write HTTPRoutes (translated to the necessity of occasionally targeting partial rules) and possibility to target more than one rule/backend, and targeting sets of HTTPRouteMatches hopefully becomes more acceptable and something we can live with for now. Or if people do move towards HTTPRouteRule == 1 HTTPRouteMatch, then it won't matter anyway.

eguzki commented 1 year ago

IMO a RLP should target a HTTPRouteRule. It makes easier for me to reason about rate limiting and auth. It is about colouring routes.

If the RLP targeted a match, the scenario would be much more complicated, kind of "multicolour" routes using the color analogy.

guicassolato commented 1 year ago

IMO a RLP should target a HTTPRouteRule. It makes easier for me to reason about rate limiting and auth. It is about colouring routes.

If the RLP targeted a match, the scenario would be much more complicated, kind of "multicolour" routes using the color analogy.

@eguzki, unless users write 1 HTTPRouteMatch per HTTPRouteRule, what you suggest would lead to coupling. And if users write 1 HTTPRouteMatch per HTTPRouteRule, then targeting one kind or the other semantically means the exact same thing. Therefore, enforcing atomic targeting of HTTPRouteRules means forcing users into adopting the 1 HTTPRouteMatch == 1 HTTPRouteRule pattern or they will be coupling HTTPRoutes to RLPs.

eguzki commented 1 year ago

will be coupling HTTPRoutes to RLPs.

I do not see this. I see rate limiting assigned (colouring) a given route. Just like Envoy's API for rate limiting. The rate limit config is at the route level, not at the matcher level.

guicassolato commented 1 year ago

Coupling HTTPRoutes to RLPs or 1 HTTPRouteRule == 1 HTTPRouteMatch pattern is a logical conclusion.

I guess what you're saying is that Envoy's RL API does not allow us to implement the proposal due to not offering RL at the level of the match, but one level higher. This is a good point.

In Example 2, those two independent gateway actions are only possible because of the wasm shim, but using Envoy's RL API, it wouldn't exist one rule for GET /toys* and another for POST /toys* for us to inject the two separate sets of descriptor action configurations, simply because in the HTTPRoute it's defined as one single HTTPRouteRule.

I can agree with that. Then the only way out (without going "all-in" with our wasm shim to solve this) is by forcing users to move toward 1 HTTPRouteRule == 1 HTTPRouteMatch.

We change this proposal to atomic targeting of HTTPRouteRules, users will be forced to occasionally redefine their HTTPRouteRules within their HTTPRoutes to be able to attach limits to them, then again for other types of policies, until they realise it's better breaking HTTPRouteRules to no more than one HTTPRouteMatch each so policies can be more fine-grained.

eguzki commented 1 year ago

lise it's better breaking HTTPRouteRules to no more than one HTTPRouteMatch each so policies can b

I am struggling to understand this. Just to clarify: By no means I want to force GwAPI users to 1 HTTPRouteRule == 1 HTTPRouteMatch.

What I am proposing is that rates, counters and when apply to entire route.. not for every route where there is a given match.

Maybe matches should be renamed to route_matches.

guicassolato commented 1 year ago

I am struggling to understand this. Just to clarify: By no means I want to force GwAPI users to 1 HTTPRouteRule == 1 HTTPRouteMatch.

What I am proposing is that rates, counters and when apply to entire route.. not for every route where there is a given match.

Let's run a didactic example. By induction, we start with the simplest HTTPRoute with no matches. (Non-canonical representation to save a few lines of code here.)

kind: HTTPRoute
spec:
  rules:
  - backendRefs: [toystore] # /

Now we want a RLP of 100rps on method POST. First, we edit the HTTPRoute:

kind: HTTPRoute
spec:
  rules:
  - backendRefs: [toystore] # /
  - backendRefs: [toystore] # POST /
    matches:
    - method: POST

Then we can write the RLP:

kind: RateLimitPolicy
spec:
  limits:
  - rules:
    - matches: # or "first HTTPRouteRule whose entire set of matches is identical to this"
      - method: POST
    rates:
    - limit: 100
      duration: second

Now we want 500rps for method GET. Here we go again with another HTTPRouteRule in the HTTPRoute:

kind: HTTPRoute
spec:
  rules:
  - backendRefs: [toystore] # /
  - backendRefs: [toystore] # POST /
    matches:
    - method: POST
  - backendRefs: [toystore] # GET /
    matches:
    - method: GET

And edited RLP:

kind: RateLimitPolicy
spec:
  limits:
  - rules:
    - matches:
      - method: POST
    rates:
    - limit: 100
      duration: second
  - rules:
    - matches:
      - method: GET
    rates:
    - limit: 500
      duration: second

Let's keep going... 50rps for path /expensive, only methods GET, POST, PUT.

HTTPRoute:

kind: HTTPRoute
spec:
  rules:
  - backendRefs: [toystore] # /
  - backendRefs: [toystore] # POST /
    matches:
    - method: POST
  - backendRefs: [toystore] # GET /
    matches:
    - method: GET
  - backendRefs: [toystore] # GET|POST|PUT /expensive
    matches:
    - path:
        type: PathPrefix
        value: /expensive
      method: GET
    - path:
        type: PathPrefix
        value: /expensive
      method: POST
    - path:
        type: PathPrefix
        value: /expensive
      method: PUT

RLP:

kind: RateLimitPolicy
spec:
  limits:
  - rules:
    - matches:
      - method: POST
    rates:
    - limit: 100
      duration: second
  - rules:
    - matches:
      - method: GET
    rates:
    - limit: 500
      duration: second
  - rules:
    - matches:
      - path:
          type: PathPrefix
          value: /expensive
        method: GET
      - path:
          type: PathPrefix
          value: /expensive
        method: POST
      - path:
          type: PathPrefix
          value: /expensive
        method: PUT
    rates:
    - limit: 50
      duration: second

Hopefully it's getting clear at this point how the HTTPRoute is being shaped according to the needs of the RLP.

Now we want auth (KAP) for /expensive, methods POST, PUT and DELETE.

We cannot add DELETE to the GET|POST|PUT /expensive HTTPRouteRule because that would require extending the 50rps limit beyond desired (i.e. only methods GET, POST and PUT).

We cannot target the GET|POST|PUT /expensive HTTPRouteRule in the KAP because that would extend the auth rule to method the GET method.

Solution: we break the GET|POST|PUT /expensive HTTPRouteRule into two, one only for GET and another for POST|PUT, and we add a new HTTPRouteRule for DELETE. We make the RLP target 2 rules (GET /expensive and POST|PUT /expensive) and we make the KAP target also 2 rules (POST|PUT /expensive and DELETE /expensive).

HTTPRoute:

kind: HTTPRoute
spec:
  rules:
  - backendRefs: [toystore] # /
  - backendRefs: [toystore] # POST /
    matches:
    - method: POST
  - backendRefs: [toystore] # GET /
    matches:
    - method: GET
  - backendRefs: [toystore] # GET /expensive
    matches:
    - path:
        type: PathPrefix
        value: /expensive
      method: GET
  - backendRefs: [toystore] # POST|PUT /expensive
    matches:
    - path:
        type: PathPrefix
        value: /expensive
      method: POST
    - path:
        type: PathPrefix
        value: /expensive
      method: PUT
  - backendRefs: [toystore] # DELETE /expensive
    matches:
    - path:
        type: PathPrefix
        value: /expensive
      method: DELETE

RLP:

kind: RateLimitPolicy
spec:
  limits:
  - rules:
    - matches:
      - method: POST
    rates:
    - limit: 100
      duration: second
  - rules:
    - matches:
      - method: GET
    rates:
    - limit: 500
      duration: second
  - rules:
    - matches:
      - path:
          type: PathPrefix
          value: /expensive
        method: GET
    - matches:
      - path:
          type: PathPrefix
          value: /expensive
        method: POST
      - path:
          type: PathPrefix
          value: /expensive
        method: PUT
    rates:
    - limit: 50
      duration: second

KAP:

kind: AuthPolicy
spec:
  auth:
  - rules:
    - matches:
      - path:
          type: PathPrefix
          value: /expensive
        method: POST
      - path:
          type: PathPrefix
          value: /expensive
        method: PUT
    - matches:
      - path:
          type: PathPrefix
          value: /expensive
        method: DELETE
    rates:
    - limit: 50
      duration: second

What if there's another policy for POST|DELETE /expensive?

eguzki commented 1 year ago

I think I see what you mean. Thanks for the more elaborated example @guicassolato

it's getting clear at this point how the HTTPRoute is being shaped according to the needs of the RLP.

I would like to avoid that by design. Routing is one thing and another is rate limiting.

HAving a single route and multiple limit objects can be accomplished with specific when fields. Right?

In the example

apiVersion: kuadrant.io/v2beta1
kind: RateLimitPolicy
metadata:
  name: toystore-per-endpoint-per-user
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: toystore
  limits:
    - name: readers
      matches: 
        - path:
            type: PathPrefix
            value: "/toys"
          method: GET
        - path:
            type: PathPrefix
            value: "/toys"
          method: POST
      rates:
        - limit: 50
          duration: 1
          unit: minute
      counters:
        - auth.identity.username
      when:
        - selector:  context.request.http.method
          operator: eq
          value: GET
       - selector:  context.request.http.path
          operator: eq
          value: /toys
        - selector: auth.identity.group
          operator: neq
          value: admin
    - name: writers
      matches:
        - path:
            type: PathPrefix
            value: "/toys"
          method: GET
        - path:
            type: PathPrefix
            value: "/toys"
          method: POST
      rates:
        - limit: 5
          duration: 1
          unit: minute
        - limit: 100
          duration: 12
          unit: hour
      counters:
        - auth.identity.username
      when:
        - selector:  context.request.http.method
          operator: eq
          value: POST
       - selector:  context.request.http.path
          operator: eq
          value: /toys
        - selector: auth.identity.group
          operator: neq
          value: admin

Is that good enough? not too much DRY?

alexsnaps commented 1 year ago

We are yet again reopening the same can of worms, but... tl;dr:

I'm arguing for having routing route reflect the policies' topology, yes;
We can decouple them, but we end up with the set intersection problem - which makes it almost impossible to reason about defaults & overrides;
We can partially decouple them, which I thought was the current agreement, so that a Policy can partially apply to a single HTTPRouteMatch within an HTTPRouteRule, so that you would "only" have to have the matcher (which are OR'ed within an HTTPRouteRule) for e.g. the methods in other to match individual methods with distinct policies within a single HTTPRouteRule - but that way, with ordering as precedence, it'd be clear with part of the policy (e.g. a limit) applies to say GETs.

So yes, this is verbose. Requires tweaking HTTPRouteRules to accommodate for particular use-cases, even as they arise (e.g. the DDoS), which means possible "talking" between roles (e.g. cluster admin and application devs)... This is far from perfect... but, and let me reiterate - I am not sarcastic, I really mean it - we can absolutely shelve this again while someone works on an alternate proposal. While this remains the "best" first step I could come up with, it does by no mean mean that there isn't a better approach staring us in the face. I'd be more than happy if that'd be the case.

So with the 3rd bullet point, the routing becomes: ```yaml kind: HTTPRoute spec: rules: - backendRefs: [toystore] matches: - method: POST - method: GET - backendRefs: [toystore] matches: - path: type: PathPrefix value: /expensive method: GET - path: type: PathPrefix value: /expensive method: POST - path: type: PathPrefix value: /expensive method: PUT - path: type: PathPrefix value: /expensive method: DELETE ``` You can now model the different policies using these _individual_ matchers and have different RLP for each of the 4 methods on `/expensive`, while using yet different ones for `/`. But _not_ for `/foobar`.

guicassolato commented 1 year ago

Routing is one thing and another is rate limiting.

I think the reasons why we attach RL (or auth) to routes and route rules are:

Because it's easier to reason to the user about what's going on if we can point to specific routes/route rules – e.g. "hey, you requested this limit at this route of yours, it's applied"; and the more we can say "it's applied only to this route and nothing more, nothing less", the better. (It's not 100% possible, but, to an extent, it is.)
- All this information needs to go somehow in the status block of the RLP; it's better if can do that with the fewer level of extra details as possible. E.g. "it's route X but only when it matches condition Y"... And then there's another limit also to "route X" but subcondition "Z". Of course we can say, "limit 1 for route X when condition Y" and "limit 2 for route X when condition Z", but we hardly say something about "route X but not condition Y nor Z"... And these are only 2 limits. Arguably, in a complex RLP, to report all cases to the user, it's easier to tell the user to go read read the policy and figure out by themselves what's being applied.
- We ourselves will need to be able to work those matches out later to solve for defaults and overrides, which I know is a different topic, but might have its roots in what we decide here.
Because we want to reduce the overhead of having different languages to express effectively the same thing: "when X matches, apply Y". Reducing this overhead means less work to the user when writing the policy, when reading the policy, when keeping consistency between targeted network resource and the policy.
Because, wasm shim excluded, the routes are what it's given to us to configure triggers for the RL service and the ext-authz service. It's only reasonable that we make sure those route rules exist.

HAving a single route and multiple limit objects can be accomplished with specific when fields. Right?

Now you're driving us in the opposite direction than 1 HTTPRouteRule == 1 HTTPRouteMatch. Taken to the limit, we could have just the simplest HTTPRoute with no matches at all and solve everything with when conditions.

In fact, that's exactly what would happen – again exemplifying by induction – if you add to your example above a KAP for POST|DELETE /toys with a strict constraint by the user saying "I will not break my HTTPRoute into finer HTTPRouteRules any further". Well, then, except the HTTPRouteRules needed for different backends, the user is probably in a better place with no HTTPRouteRules at all in the first place! Without them, we let all requests go to Limitador and to Authorino and let these ones decide based on "soft" conditions. Users can figure out by themselves what's being applied, and we work out defaults and overrides some other way to figure out if it's supposed to be the same route/same target or not.

To sum up, I think neither you're proposing only using when conditions for filtering, nor I'm saying we should get rid of them. I'm just saying that targeting HTTPRouteRules means that, as a user, you will often be faced with the tradeoff between quality status reporting, less language translation overhead, reasonable defaults and overrides on one hand and whatever benefit you can get from the immutability of your HTTPRoutes on the other. The more you lean to the former, the more you drive in the direction of 1 HTTPRouteRule, 1 HTTPRouteMatch.

Kuadrant / architecture

RateLimitPolicy v2 #8