Kuadrant / limitador-operator


Limits at Limitador CR #21

Closed: eguzki closed this issue 2 years ago

eguzki commented 2 years ago

Context

After a redefinition of the authority over the limits (https://github.com/Kuadrant/limitador/issues/74), the limits configuration will live as a local file in the pod. This issue is about Limitador's control plane and how the limits make their way into that local file.

Source of the config map

The Limitador operator will reconcile a ConfigMap to be mounted as a local file in all of the Limitador replica pods. Where do those limits come from? Currently, the Limitador operator reads RateLimit CRs and reconciles them with Limitador over the HTTP endpoint. The association of a RateLimit CR with a Limitador instance is currently hardcoded in the operator: the RateLimit CRs need to be created in the same namespace as the Limitador pod, and the service name and port are hardcoded.
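For reference, a standalone RateLimit CR in the current model looks roughly like this (the field names are assumed to mirror the limit entries Limitador accepts; the CR must live in the Limitador pod's namespace for the hardcoded association to work):

```yaml
# Assumed shape of today's per-limit CR; the association with a
# Limitador instance is implicit (same namespace, hardcoded service).
apiVersion: limitador.kuadrant.io/v1alpha1
kind: RateLimit
metadata:
  name: toystore-get-toy
  namespace: default   # must match the Limitador pod's namespace
spec:
  conditions: ["get-toy == yes"]
  max_value: 2
  namespace: toystore-app
  seconds: 30
  variables: []
```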

To make the limits configuration flexible, with a clear association between limits and the Limitador instances they apply to, the proposal is to declare the limits in the Limitador CR itself. For example:

```yaml
---
apiVersion: limitador.kuadrant.io/v1alpha1
kind: Limitador
metadata:
  name: limitador
spec:
  replicas: 1
  version: "0.4.0"
  limits:
  - conditions: ["get-toy == yes"]
    max_value: 2
    namespace: toystore-app
    seconds: 30
    variables: []
  - conditions:
    - "admin == yes"
    max_value: 2
    namespace: toystore-app
    seconds: 30
    variables: []
  - conditions: ["vhaction == yes"]
    max_value: 6
    namespace: toystore-app
    seconds: 30
    variables: []
```

The Limitador operator would then be responsible for reconciling the content of spec.limits with the content of the ConfigMap mounted in the Limitador pod.

Kuadrant context

Kuadrant users define their limits in the kuadrant API: RateLimitPolicy.

A Kuadrant installation owns a Limitador deployment with at least one pod running. This deployment is managed via a Limitador CR, whose namespace and name are known by the kuadrant-controller (the Kuadrant control plane).

Thus, when a user creates a RateLimitPolicy and adds some limits to it, the following happens behind the scenes:

a) The kuadrant-controller reads the RLP and reconciles its limits with the list in the owned Limitador CR, following Kuadrant's rules. As an example of those rules, the namespace is set by Kuadrant and not exposed in the RLP. When a limit is added/updated/removed in the RLP, that limit is added/updated/removed in the Limitador CR.

b) The Limitador operator reconciles the limits in the Limitador CR with a ConfigMap that is mounted in the deployment as a local file for the limitador process. The operator gets notified when the Limitador CR changes: when a limit is added/updated/removed in the Limitador CR, that limit is added/updated/removed in the ConfigMap, which effectively changes the content of the local file.

cc @alexsnaps @didierofrivia @rahulanand16nov

guicassolato commented 2 years ago

My 2¢ here, following a line of thought started by @didierofrivia offline and adding more questions (as opposed to proposing any different solution to the problem):

Do we still need Limitador's RateLimit CRs after this?

I still see a benefit in those for fine-grained authorization in K8s RBAC, and for storing metadata about each individual limit (credit for the latter to @thomasmaas). I don't know whether there is a use case for that level of control over individual limits, or whether controlling all the limits of a particular Limitador instance together would suffice. It's also true that Authorino could be used to deliver that level of fine-grained control (i.e. individual authz and metadata for the limits), though that option would add a dependency of Limitador on Authorino, which might not be desired.

Elaborating further: what is the source of truth when/if users declare part of the limits within a Limitador CR (which, BTW, naturally points to a specific Limitador deployment) but also in other loose RateLimit CRs (which, I'm guessing, would somehow also have a way to point to a specific Limitador deployment)?

Should the operator read the rules in a Limitador CR and distribute the actual reconciliation into the Limitador instance by generating individual RateLimit CRs out of it? Those generated CRs would add up with any RateLimit CRs occasionally defined by the user, creating a single pool of rate limit rules for a Limitador instance: what used to be two sources of truth becomes one consolidated source. Is that the point at which injecting the limits into the Limitador instance actually begins?

For the issue of targeting a particular Limitador instance when defining a RateLimit CR, could Kubernetes labels and selectors be a solution? We relied on that to assign Authorino AuthConfigs across multiple competing Authorino instances. (See Sharding in the Authorino docs for details; an example is available here.) That is also how API key Secrets (and soon mTLS root cert Secrets as well) "target" specific Authorino AuthConfigs. Of course, there's the difference that the relationship between AuthConfigs and Authorino instances (or between API key Secrets and AuthConfigs) can be N x M, and that the Authorino instances watch all those AuthConfig CRs and Secrets themselves, instead of depending on the operator to reconcile the instance's state via some API or other mechanism to inject the config.
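A label-selector association, mirrored from Authorino's sharding model, could look roughly like this (the `rateLimitSelector` field and the label key are hypothetical, not an implemented API):

```yaml
# Hypothetical sketch: a RateLimit CR targeted at a Limitador instance
# via labels, mirroring Authorino's sharding model.
apiVersion: limitador.kuadrant.io/v1alpha1
kind: RateLimit
metadata:
  name: toystore-limit
  labels:
    limitador.kuadrant.io/instance: limitador   # selected by the instance below
spec:
  conditions: ["get-toy == yes"]
  max_value: 2
  namespace: toystore-app
  seconds: 30
---
apiVersion: limitador.kuadrant.io/v1alpha1
kind: Limitador
metadata:
  name: limitador
spec:
  rateLimitSelector:                            # assumed field name
    matchLabels:
      limitador.kuadrant.io/instance: limitador
```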

didierofrivia commented 2 years ago

Besides what @guicassolato mentioned regarding RBAC authz and metadata, having the RateLimit CRD brings along every single k8s API feature, which might be a blessing for any cluster admin. However, it's true that limits are settings descriptors within the scope of Limitador's main concern; one would have expected them to be created that way from the beginning and then, if the necessity arises, split into their own CRD (if we ever find ourselves with a constraint regarding the size of the CRD, specific desired k8s API features, etc.).

eguzki commented 2 years ago

Regarding RBAC: is there a use case with two personas, one responsible for deploying the Limitador CR and another responsible for configuring that Limitador deployment with limits? In that case it would make sense to have one CRD for defining limits and another CRD for the Limitador deployment.

If we need two CRDs, then we can discuss how the linking is done. Direct references or label-selector-based association are indeed the common approaches.
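For comparison, the direct-reference approach could look roughly like this (the `limitadorRef` field is hypothetical):

```yaml
# Hypothetical sketch: a limits CR naming the Limitador deployment it
# applies to, instead of relying on labels and selectors.
apiVersion: limitador.kuadrant.io/v1alpha1
kind: RateLimit
metadata:
  name: toystore-limit
spec:
  limitadorRef:        # assumed field name
    name: limitador
    namespace: default
  conditions: ["get-toy == yes"]
  max_value: 2
  namespace: toystore-app
  seconds: 30
```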

didierofrivia commented 2 years ago

As mentioned offline, the ease of managing a single CR with both the Limitador service settings and its configuration is a big plus... especially when the alternative would be reconciling n resources, each needing potentially n references to Limitadors, and those references might be labels...