Kuadrant / limitador-operator

Production-ready: Configure Observability #77

Open slopezz opened 1 year ago

slopezz commented 1 year ago

In 3scale SaaS we have been successfully using limitador together with Redis for a couple of years to protect all our public endpoints. However:

We would like to update how we manage the limitador application and move to the recommended setup based on limitador-operator, at a production-ready grade.

Current limitador-operator (at least version 0.4.0, which we use):

Desired features:

3scale SaaS specific example

Example of the PodMonitor used in 3scale SaaS production to handle between 3,500 and 5,500 requests/second with 3 limitador pods (the selector labels need to match the labels currently managed by limitador-operator):

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: limitador
spec:
  podMetricsEndpoints:
    - interval: 30s
      path: /metrics
      port: http
      scheme: http
  selector:
    matchLabels:
      app.kubernetes.io/name: limitador

Possible CR config

Both the PodMonitor and the GrafanaDashboard should be customizable via the CR, but use sane default values when they are enabled, so you don't need to provide all the config if you prefer to rely on the defaults.
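
As a rough sketch of what that CR config could look like (the observability section and its field names below are hypothetical, they are not part of the current Limitador CRD; only the limitador.kuadrant.io/v1alpha1 Limitador kind is existing):

apiVersion: limitador.kuadrant.io/v1alpha1
kind: Limitador
metadata:
  name: limitador
spec:
  # ... existing limitador spec (replicas, storage, limits, ...) ...
  observability:            # hypothetical section, only for illustration
    podMonitor:
      enabled: true
      interval: 30s         # optional override, sane default otherwise
    grafanaDashboard:
      enabled: true
      labels:               # labels applied to the generated GrafanaDashboard
        discovery: enabled  # so grafana-operator picks it up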

The initial dashboard would be provided by us (3scale SRE) and can be embedded into the operator as an asset, as done with 3scale-operator.

Screenshots of the current dashboard include limitador metrics split by limitador_namespace (the app being limited), as well as pod CPU/memory/network resource metrics.

PrometheusRules (aka prometheus alerts)

Regarding PrometheusRules (Prometheus alerts), my advice is not to embed them into the operator, but to provide in the repo a YAML with an example of possible alerts that can be deployed, tuned, etc. by the app administrator if needed.

Example:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: limitador
spec:
  groups:
    - name: limitador.rules
      rules:
        - alert: LimitadorJobDown
          annotations:
            message: Prometheus Job {{ $labels.job }} on {{ $labels.namespace }} is DOWN
          expr: up{job=~".*limitador.*"} == 0
          for: 5m
          labels:
            severity: critical

        - alert: LimitadorPodDown
          annotations:
            message: Limitador pod {{ $labels.pod }} on {{ $labels.namespace }} is DOWN
          expr: limitador_up == 0
          for: 5m
          labels:
            severity: critical
Boomatang commented 1 year ago

I have a few points for decision which will affect how these changes are done. The limitador-operator allows multiple limitador CRs in the same namespace and/or in multiple namespaces.

  1. Do we expect there to be a separate PodMonitor for every limitador CR? This seems wasteful, as we could monitor many pods and namespaces with a single PodMonitor (a rough sketch of such a shared PodMonitor is included after this list), but that brings its own issues.
  2. If the user does not configure the podMonitor section of one limitador CR instance, but other instances are configured for pod monitors, should the non-configured instances also have a podMonitor attached? I believe podMonitors should not be added to the non-configured instances.
  3. I would expect that if the podMonitor configuration in two different limitador CRs states different label selectors, limitador-operator would create two different podMonitor configurations. This would mean that before creating any podMonitors the limitador-operator would first need to find any existing podMonitor CRs to update. The question then is who is responsible for removing the podMonitors during an uninstall? I would assume the last limitador CR to be removed. I am assuming we do not configure the podMonitors to check all namespaces, but only the namespace we specify.
  4. If there is one podMonitor for a number of limitador CR instances, it would be reasonable to expect that there should be one dashboard to cover that selector label. Can the current 3scale dashboard handle multiple namespaces and instances of limitador?
  5. If a user is adding different label selectors for grafanaDashboards, is it possible there can be multiple Grafana instances on the cluster? I am not sure how this would affect the deployment of the dashboards or pod monitors, but it is something to look into.
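
For reference on point 1, a single shared PodMonitor covering limitador pods in any namespace could look roughly like this (only a sketch; the selector label is taken from the example above and may not match what limitador-operator actually sets):

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: limitador-all
spec:
  namespaceSelector:
    any: true               # scrape matching pods in every namespace
  podMetricsEndpoints:
    - interval: 30s
      path: /metrics
      port: http
      scheme: http
  selector:
    matchLabels:
      app.kubernetes.io/name: limitador
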
slopezz commented 1 year ago

Hi @Boomatang ,

PodMonitor

For simplicity, I would treat the PodMonitor for a given limitador CR as any other usual resource attached to the limitador CR, like the Service or the Deployment.

That means each Limitador CR will have its own PodMonitor, with its own labelSelectors taken from the limitador CR, the same way it has its own Deployment or Service, keeping things simple.
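
As a sketch of that per-CR approach (the per-instance label key below is an assumption; the actual labels set by limitador-operator may differ), a Limitador CR named limitador-foo would get something like:

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: limitador-foo
spec:
  podMetricsEndpoints:
    - interval: 30s
      path: /metrics
      port: http
      scheme: http
  selector:
    matchLabels:
      app.kubernetes.io/name: limitador
      # hypothetical per-instance label so each PodMonitor only
      # selects the pods of its own Limitador CR
      limitador-resource: limitador-foo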

GrafanaDashboard

Regarding the dashboard, I didn't know limitador-operator could manage limitador CR instances in different namespaces, or even multiple instances in the same namespace.

In our 3scale SaaS use case, we have a single limitador instance managing rate limits for any given namespace (since the k8s Service name is what Envoy uses to reach it), and I guess that having a single instance would be the most usual case.

It is a bit tricky here, since with a single GrafanaDashboard you can view the metrics from any possible limitador instance (you would just need to use the limitador instance name as a dashboard selector, aside from the namespace maybe).

If you have multiple limitador instances in different namespaces, by default grafana-operator will create a GrafanaDashboard in every namespace, since the namespace name is used as the dashboard folder name.
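
For reference, the GrafanaDashboard CR created in each namespace could look roughly like this, assuming the grafana-operator v4 API (integreatly.org/v1alpha1); the label and the embedded JSON below are only placeholders:

apiVersion: integreatly.org/v1alpha1
kind: GrafanaDashboard
metadata:
  name: limitador
  labels:
    discovery: enabled      # must match the dashboardLabelSelector of the Grafana instance
spec:
  # the real dashboard JSON would be embedded as an operator asset;
  # this is only a minimal placeholder
  json: |
    {
      "title": "Limitador",
      "panels": []
    }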

However, what about multiple instances in the same namespace? TBH I don't know the best way to handle this situation, since any object created by the operator will add its own ownerReference with the CR name, and if you have multiple CRs creating the same dashboard name (so the same resource) in the same namespace, it will only have the ownerReferences from one of them (which is not super bad, but maybe not ideal).

In our saas-operator case, we have multiple CRDs that can only be installed once per namespace, but there is a single case where the CRD can have multiple instances in the same namespace. There we ended up creating the same dashboard for every CR, using the CR name as a suffix for the dashboard name, so we have multiple dashboards showing the same info, which is actually not an ideal solution...

We have another scenario with prometheus-exporter-operator, where we permit multiple CRs per namespace but only a single dashboard per CR, so we end up with a single dashboard per namespace used to watch metrics from any CR; however, the ownerReferences of the dashboard resource point to a single CR. If the CR associated with the dashboard is deleted, the dashboard resource will be deleted thanks to ownerReferences, but the operator will detect there is a missing dashboard resource for the other instances and will create the dashboard again, with a different dashboard ID (which matters if you want to keep the same dashboard URL).

So I'm not sure what the best solution would be here.

Do you know how other cluster operators manage the GrafanaDashboard when you can have multiple instances even in the same namespace?