micrometer-metrics / micrometer

An application observability facade for the most popular observability tools. Think SLF4J, but for observability.
https://micrometer.io
Apache License 2.0

Metrics in "/actuator/prometheus" are not consistent in a multi-node environment (Kubernetes) #5293

Closed ZakariaAitErrami closed 3 months ago

ZakariaAitErrami commented 3 months ago

Description

My application is built with Spring Boot and deployed with multiple replicas in a Kubernetes cluster. I use the /actuator/prometheus endpoint to expose metrics. However, the metrics returned by this endpoint differ depending on which server instance handles the request: when I refresh the /actuator/prometheus page, the values change based on the instance I hit.

Is there a way to make the metric values consistent across all nodes?

jonatan-ivanov commented 3 months ago

I'm not sure I get this. Values must be different, since not all of your nodes will receive the same amount of traffic, consume the same amount of resources, etc.

If I misunderstood your issue, please let us know and provide concrete examples (formatted Prometheus output snippets), and we can reopen this issue.

shakuzen commented 3 months ago

It sounds like @ZakariaAitErrami may be accessing the Prometheus scrape endpoint via a load balancer, so the scrape results they are getting may be from a different instance each time. As Jonatan mentioned, metrics are specific to each instance as they are accumulated in memory in each instance. The Prometheus model of metrics collection generally expects that the Prometheus server can individually scrape instances and that the metrics scraped from each instance are uniquely labelled (tagged). Thus, accessing the scrape endpoint via a load balancer is not expected.
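To illustrate the effect described above, here is a minimal, self-contained sketch in plain Java (no Micrometer or Spring involved). Everything in it is hypothetical: the metric name `demo_requests_total`, the pod names, the counter values, and the round-robin balancer are made up for illustration. Prometheus normally attaches the `instance` label itself at scrape time; it is printed here only to make the source of each response visible.

```java
import java.util.ArrayList;
import java.util.List;

public class LoadBalancedScrapeDemo {

    /** Stand-in for one application replica; its counter exists only in this object's memory. */
    static final class Instance {
        private final String name;
        private final long requestCount;

        Instance(String name, long requestCount) {
            this.name = name;
            this.requestCount = requestCount;
        }

        /** Returns one sample line in the Prometheus text format (hypothetical metric name). */
        String scrape() {
            return "demo_requests_total{instance=\"" + name + "\"} " + requestCount;
        }
    }

    /** Hypothetical round-robin load balancer sitting in front of the replicas. */
    static final class RoundRobinBalancer {
        private final List<Instance> targets;
        private int next = 0;

        RoundRobinBalancer(List<Instance> targets) {
            this.targets = targets;
        }

        /** Forwards the scrape to whichever replica is next in line. */
        String scrape() {
            Instance target = targets.get(next);
            next = (next + 1) % targets.size();
            return target.scrape();
        }
    }

    /** Scrapes through the balancer the given number of times and collects the responses. */
    static List<String> scrapeViaLoadBalancer(int times) {
        RoundRobinBalancer balancer = new RoundRobinBalancer(List.of(
                new Instance("pod-a", 120),
                new Instance("pod-b", 85),
                new Instance("pod-c", 40)));
        List<String> responses = new ArrayList<>();
        for (int i = 0; i < times; i++) {
            responses.add(balancer.scrape());
        }
        return responses;
    }

    public static void main(String[] args) {
        // Each refresh lands on a different replica, so the reported value changes.
        scrapeViaLoadBalancer(4).forEach(System.out::println);
    }
}
```

Each replica reports only what it has counted itself, so the value seen through the balancer depends entirely on which replica answered.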

jonatan-ivanov commented 3 months ago

In that light (using a load balancer), I think the answer to this:

"Is there a way to make the metric values consistent across all nodes?"

could be: use a Prometheus server that scrapes each of your instances individually, then aggregate and query the metrics in Prometheus. Prometheus is pretty standard on Kubernetes, and you can find many guides for setting it up with Spring Boot apps.
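As a rough sketch of what that aggregation amounts to, the plain-Java example below sums one sample per instance into a single cluster-wide value, which is conceptually what a PromQL query such as `sum(demo_requests_total)` would return. The metric name, pod names, and values are hypothetical, and the `scrapeAll` helper merely stands in for a Prometheus server that has scraped every instance directly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PerInstanceAggregationDemo {

    /** Stand-in for scraping every replica directly: one sample per instance (hypothetical values). */
    static Map<String, Long> scrapeAll() {
        Map<String, Long> samplesByInstance = new LinkedHashMap<>();
        samplesByInstance.put("pod-a", 120L);
        samplesByInstance.put("pod-b", 85L);
        samplesByInstance.put("pod-c", 40L);
        return samplesByInstance;
    }

    /** Adds up the per-instance samples into one cluster-wide total. */
    static long sumAcrossInstances(Map<String, Long> samplesByInstance) {
        long total = 0;
        for (long value : samplesByInstance.values()) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> samples = scrapeAll();
        samples.forEach((instance, value) ->
                System.out.println("demo_requests_total{instance=\"" + instance + "\"} " + value));
        System.out.println("sum across instances: " + sumAcrossInstances(samples));
    }
}
```

Because every instance is scraped and kept as its own labelled series, the total is the same no matter which instance served a given request.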