Open atillaqb opened 5 months ago
Could you share the docs that describe this best-practice?
Hello! I've noticed that the job name in a monitoring system (Prometheus, Loki) almost always (99 percent of dashboards, examples, and docs) represents the name of the application/service being scraped (`job_name=cert-manager`). If you have more than one copy of the application/service, you add another label to the time series, for example `instance` (`job_name=cert-manager, instance=10.9.8.1`). If you have different environments, such as 3 different clusters running cert-manager, you can add an additional label, `cluster` (`job_name=cert-manager, instance=10.9.8.1, cluster=dev`). In this example:
`cluster` can change because it represents the name of the environment.
But because the job name is formed from the Helm chart release name, and because Argo CD generates the release name from the Argo application name, this job name label is not static. Why this matters: with a dynamic job name label, whenever I create a dashboard for cert-manager metrics I need to choose some label that represents the application across all environments. Otherwise I end up creating 3 copies of one dashboard with the same metrics but different labels. You could say I can use variables in Grafana, and you would be right, but which label should I choose as the anchor that represents the application?
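To make the problem concrete, here is a sketch of the queries a single dashboard could use in each case (the metric name is one that cert-manager exposes; the cluster/job values are illustrative assumptions):

```promql
# With a static job label, one query covers all environments; the
# environment is selected via the cluster label or a Grafana variable:
certmanager_certificate_expiration_timestamp_seconds{job="cert-manager", cluster="dev"}

# With a release-derived job label, the query has to enumerate every
# per-environment release name with a regex matcher:
certmanager_certificate_expiration_timestamp_seconds{job=~"(dev|test|prod)-cert-manager"}
```

The second form breaks silently whenever a new environment (and thus a new release name) is added.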
https://grafana.com/docs/loki/latest/get-started/labels/bp-labels/ - best practices for labels in Loki, but the same logic applies to Prometheus labels as well (since they share the same label-based architecture).
/priority backlog /area deploy
FYI @wallrj
@atillaqb What you wrote in https://github.com/cert-manager/cert-manager/issues/7088#issuecomment-2167346999 makes sense, but the documentation for PodMonitor and ServiceMonitor doesn't say anything about this best practice. Nor does it suggest using a static value. In fact the documentation says that the value should be the name of a label on the target Pod or Service.
jobLabel
The label to use to retrieve the job name from. jobLabel selects the label from the associated Kubernetes Pod object which will be used as the job label for all metrics.
For example if jobLabel is set to foo and the Kubernetes Pod object is labeled with foo: bar, then Prometheus adds the job="bar" label to all ingested metrics.
If the value of this field is empty, the job label of the metrics defaults to the namespace and name of the PodMonitor object (e.g. `<namespace>/<name>`).
So should it be something like `jobLabel: app.kubernetes.io/name`?
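As a sketch of that suggestion (field values are illustrative assumptions, not necessarily the chart's actual output), the quoted `jobLabel` behaviour would work like this: point `jobLabel` at a Service label whose value is already static across environments.

```yaml
# Hypothetical ServiceMonitor. jobLabel names a label on the target
# Service; if the cert-manager metrics Service carries
# app.kubernetes.io/name: cert-manager, Prometheus ingests all of its
# metrics with job="cert-manager" regardless of the Helm release name.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cert-manager
  namespace: cert-manager
spec:
  jobLabel: app.kubernetes.io/name
  selector:
    matchLabels:
      app.kubernetes.io/name: cert-manager
  endpoints:
    - port: http-metrics   # assumed metrics port name
      interval: 60s
```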
Issues go stale after 90d of inactivity.
Mark the issue as fresh with `/remove-lifecycle stale`.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with `/close`.
/lifecycle stale
/reopen
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with `/remove-lifecycle rotten`.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with `/close`.
/lifecycle rotten
/remove-lifecycle stale
/remove-lifecycle rotten
**Is your feature request related to a problem? Please describe.** The Helm chart templates the ServiceMonitor's job name from the Helm release name. When you deploy the chart with Argo CD, Argo CD inherits the Helm release name from the Argo Application name.

For example: I have 3 different clusters named test/dev/prod. With an Argo CD ApplicationSet I've created an application that deploys the cert-manager Helm chart to all 3 clusters. An ApplicationSet is a form of templating that reduces repeated code (https://argo-cd.readthedocs.io/en/stable/user-guide/application-set/). In the ApplicationSet template I use a combination of the cluster name (test/dev/prod) and the application name (cert-manager), which creates Argo Applications named dev-cert-manager, test-cert-manager, and prod-cert-manager. Since Argo CD uses the Application name as the Helm release name, the release names inherit these values (dev-cert-manager, test-cert-manager, prod-cert-manager). And since the ServiceMonitor job name is templated from the release name, the final job name is inherited too.

As a result, the metrics scraped through this ServiceMonitor arrive in Prometheus with job labels dev-cert-manager, test-cert-manager, and prod-cert-manager. But best practice is that the job name should be the same no matter which environment the application runs in: the job label represents the application itself (like cert-manager, loki, tempo, mimir, etc.). In the end, when you try to create a dashboard for cert-manager, you have to point it at 3 different job names instead of filtering on a common label (like cluster).
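A minimal sketch of the ApplicationSet pattern described above (cluster names, chart version, and repo URL are placeholders, not the reporter's actual config):

```yaml
# Hypothetical ApplicationSet: the cluster generator stamps out one
# Application per registered cluster, and the Application name (which
# Argo CD reuses as the Helm release name) is built from the cluster
# name, e.g. dev-cert-manager, test-cert-manager, prod-cert-manager.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: cert-manager
spec:
  generators:
    - clusters: {}
  template:
    metadata:
      name: '{{name}}-cert-manager'
    spec:
      project: default
      source:
        repoURL: https://charts.jetstack.io   # cert-manager Helm repo
        chart: cert-manager
        targetRevision: v1.15.0               # placeholder version
      destination:
        server: '{{server}}'
        namespace: cert-manager
      syncPolicy:
        syncOptions:
          - CreateNamespace=true
```

This is exactly where the environment prefix leaks into the release name, and from there into the job label.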
**Describe the solution you'd like** Instead of templating the job label from the Helm release name, create a constant job label name: cert-manager.
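The change might look like the following sketch of the chart's ServiceMonitor template (the template helper name and field values are assumptions, not necessarily the chart's exact source):

```yaml
# Before (assumed): jobLabel follows the Helm release name, so an
# Argo CD-derived release such as dev-cert-manager leaks into the
# job label of every ingested metric.
spec:
  jobLabel: '{{ include "cert-manager.fullname" . }}'

# After: jobLabel points at a Service label whose value is the same
# in every environment, yielding job="cert-manager" everywhere;
# cluster/instance labels distinguish the copies.
spec:
  jobLabel: app.kubernetes.io/name
```

Making the value configurable (with a static default) would also cover users who deliberately want per-release job names.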
**Additional context** Argo CD ApplicationSets.
/kind feature