cert-manager / cert-manager

Automatically provision and manage TLS certificates in Kubernetes
https://cert-manager.io
Apache License 2.0

Job label in Helm chart of cert-manager shouldn't be templated from chart name #7088

Open atillaqb opened 1 month ago

atillaqb commented 1 month ago

Is your feature request related to a problem? Please describe.
The Helm chart templates the ServiceMonitor's job label from the Helm release name. When you deploy the chart with Argo CD, Argo CD inherits the Helm release name from the Argo Application's name.

For example: I have 3 clusters named test/dev/prod. With an Argo CD ApplicationSet I created an application that deploys the cert-manager Helm chart to all 3 clusters. An ApplicationSet is a form of templating that reduces repeated code (https://argo-cd.readthedocs.io/en/stable/user-guide/application-set/). In the ApplicationSet template I combine the cluster name (test/dev/prod) with the application name (cert-manager), which creates Argo Applications named dev-cert-manager, test-cert-manager, and prod-cert-manager. Since Argo CD uses the Application name as the Helm release name, the release names are inherited too (dev-cert-manager, test-cert-manager, prod-cert-manager). And since the ServiceMonitor's job label is templated from the release name, the final job name is inherited as well.

As a result, the metrics scraped through this ServiceMonitor carry a Prometheus job label of dev-cert-manager, test-cert-manager, or prod-cert-manager. But best practice is that the job name should be the same no matter which environment the application runs in: the job label represents the application itself (like cert-manager, loki, tempo, mimir, etc.). In the end, when you try to create a dashboard for cert-manager, you have to point it at 3 different job names instead of distinguishing environments with another common label (like cluster).
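The naming chain described above can be sketched with a minimal ApplicationSet. Everything here (the cluster generator, chart version, and namespaces) is illustrative, not taken from the reporter's actual setup:

```yaml
# Hypothetical ApplicationSet showing how the release name is derived.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: cert-manager
spec:
  generators:
    - clusters: {}                 # one Application per registered cluster
  template:
    metadata:
      # e.g. dev-cert-manager, test-cert-manager, prod-cert-manager
      name: '{{name}}-cert-manager'
    spec:
      project: default
      source:
        repoURL: https://charts.jetstack.io
        chart: cert-manager
        targetRevision: v1.15.0    # illustrative version
      destination:
        server: '{{server}}'
        namespace: cert-manager
```

Argo CD uses the Application name as the Helm release name unless `spec.source.helm.releaseName` is set explicitly, which is one possible workaround on the Argo CD side.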

Describe the solution you'd like
Instead of using:

spec:
  jobLabel: {{ template "cert-manager.fullname" . }}

use a constant job label, cert-manager:

spec:
  jobLabel: cert-manager

Additional context: Argo CD ApplicationSets.

Environment details (remove if not applicable):

/kind feature

inteon commented 1 month ago

Could you share the docs that describe this best-practice?

atillaqb commented 1 month ago

Could you share the docs that describe this best-practice?

Hello! I've noticed that the job name in a monitoring system (Prometheus, Loki) represents, most of the time (in 99 percent of all dashboards, examples, and docs), the name of the application/service being scraped (job_name=cert-manager). If you run more than one copy of the application/service, you add another label to the time series, for example instance (job_name=cert-manager, instance=10.9.8.1). If you have different environments, like 3 clusters all running cert-manager, you add one more label, cluster (job_name=cert-manager, instance=10.9.8.1, cluster=dev). In this example:

  1. job_name=cert-manager - a static label that never changes across environments.
  2. instance - can change, because the pod IP address changes every time the pod is restarted/recreated.
  3. cluster - can change, because it represents the name of the environment.
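One common way to get the stable cluster label from point 3 is Prometheus's external_labels. A minimal sketch, assuming one Prometheus instance per cluster (the value "dev" is illustrative):

```yaml
# Per-cluster Prometheus config sketch (assumed, not from the issue).
# Each cluster's Prometheus stamps its own "cluster" label on every
# series it ships, so dashboards can filter by cluster while the job
# label stays constant across environments.
global:
  external_labels:
    cluster: dev        # set to test/prod in the other clusters
```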

But because the job name is formed from the Helm release name, and because Argo CD generates the release name from the Argo Application name, this job label is not static. Why this matters: with a dynamic job label, whenever I create a dashboard for cert-manager metrics I need to choose some label that represents the application across all environments; otherwise I end up with 3 copies of one dashboard showing the same metrics under different labels. You could say I can use variables in Grafana, and you would be right, but then which label should I choose as the anchor that represents the application?

https://grafana.com/docs/loki/latest/get-started/labels/bp-labels/ - best practices for labels in Loki; the same logic applies to Prometheus labels as well, because the two share the same label architecture.

erikgb commented 1 week ago

/priority backlog /area deploy

inteon commented 5 days ago

FYI @wallrj

wallrj commented 5 days ago

@atillaqb What you wrote in https://github.com/cert-manager/cert-manager/issues/7088#issuecomment-2167346999 makes sense, but the documentation for PodMonitor and ServiceMonitor doesn't say anything about this best practice. Nor does it suggest using a static value. In fact the documentation says that the value should be the name of a label on the target Pod or Service.

jobLabel

The label to use to retrieve the job name from. jobLabel selects the label from the associated Kubernetes Pod object which will be used as the job label for all metrics.

For example if jobLabel is set to foo and the Kubernetes Pod object is labeled with foo: bar, then Prometheus adds the job="bar" label to all ingested metrics.

If the value of this field is empty, the job label of the metrics defaults to the namespace and name of the PodMonitor object (e.g. `<namespace>/<name>`).

So should it be something like jobLabel: app.kubernetes.io/name?
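A sketch of that suggestion, assuming the chart sets the standard app.kubernetes.io/name label on the Service (the port name and label values below are illustrative, not copied from the chart):

```yaml
# ServiceMonitor pointing jobLabel at a label key on the target Service.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cert-manager
spec:
  jobLabel: app.kubernetes.io/name   # read the job name from this Service label
  selector:
    matchLabels:
      app.kubernetes.io/name: cert-manager
  endpoints:
    - port: metrics
      interval: 60s
---
# Matching Service: Prometheus would then ingest metrics with
# job="cert-manager", taken from the app.kubernetes.io/name value below,
# regardless of the Helm release name.
apiVersion: v1
kind: Service
metadata:
  name: cert-manager
  labels:
    app.kubernetes.io/name: cert-manager
spec:
  ports:
    - name: metrics
      port: 9402
```

This would keep the job label stable across release names, provided the app.kubernetes.io/name label itself is not templated from the release name.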