deliveryhero / helm-charts

Helm Charts ⛵ @ Delivery Hero ⭐
Apache License 2.0
487 stars 286 forks source link

[node-local-dns] duplicated metrics #600

Closed msvticket closed 1 month ago

msvticket commented 2 months ago

By default the coredns metrics get scraped twice, since both the service and the pods have the annotations

prometheus.io/port: "9253" prometheus.io/scrape: "true"

Since the service has ClusterIP: None each pod will be scraped. So if you issue a PromQL query like sum(rate(coredns_dns_requests_total{k8s_app=~"node-local-dns"}[5m])) by (proto) the results will be double the "actual" value. Sure, this can be fixed with adding for example job = "kubernetes-service-endpoints" to the query, but that complicates the query unnecessarily and you need to understand the problem to figure it out.

I suggest removing the prometheus annotations from the pod. Removing them from the service would also work, but since you could add them back yourself to the pod using podAnnotations I think it's better to remove them from there. In the service you would still be able to control them using prometheusScraping.enabled.

brennoo commented 2 months ago

hello @msvticket,

it would make more sense to keep the pods and remove the annotation from the service as calling the service we would not control which pod is replying. the pod "dimension" might be useful to detect some issues with pod/nodes - wdyt?

would you like to raise a PR addressing this issue?

msvticket commented 1 month ago

hello @msvticket,

it would make more sense to keep the pods and remove the annotation from the service as calling the service we would not control which pod is replying. the pod "dimension" might be useful to detect some issues with pod/nodes - wdyt?

You can identify the separate pods using the instance dimension even when it is the service that is scraped. But you wouldn't have direct identification of the pod, no.

The reason I suggested removing the scraping of the pods is that if somebody rely on the scraping of the pods they can add it again with podAnnotations, while we wouldn't have support for adding the annotation to the service if we removed those.

What we could do is to remove the scraping annotations from the service and at the same time introduce a new value, say service.annotations.

I can make a PR.

github-actions[bot] commented 1 month ago

This issue is stale because it has been open 14 days with no activity. Remove stale label or comment or this will be closed in 5 days.