metrics-rs / metrics

A metrics ecosystem for Rust.

Steadily growing memory consumption when recording metrics #465

Closed · IvanUkhov closed 4 months ago

IvanUkhov commented 4 months ago

Hello,

I have a service whose memory consumption is slightly but steadily growing during runtime. After profiling, I came to the conclusion that it originates from metrics. More specifically, it seems to be due to AtomicBucket::push in metrics-util creating Owned from crossbeam-epoch:

https://github.com/metrics-rs/metrics/blob/a143ef60b5354770fb292d29acc1f7bdef2c4aed/metrics-util/src/bucket.rs#L224

They are allocated but seem to hang around in memory indefinitely. Is this the expected behavior, or is there something I am missing?

[Screenshot 2024-03-18 at 14:27:58]
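For readers unfamiliar with crossbeam-epoch, here is a rough sketch of a Treiber-style push built on top of it. The Node type and push function are purely illustrative (the real bucket.rs stores values in blocks rather than one node per value), but they show that the push path only allocates and never frees; the allocations can only be reclaimed after the structure is drained and the epoch advances.

use std::sync::atomic::Ordering;

use crossbeam_epoch::{self as epoch, Atomic, Owned};

// Illustrative node type; not the actual layout used by AtomicBucket.
struct Node {
    value: f64,
    next: Atomic<Node>,
}

// Push a value onto a lock-free list. Nothing here ever frees memory: each
// call allocates an Owned node, and that allocation lingers until the list
// is drained and the epoch advances.
fn push(head: &Atomic<Node>, value: f64) {
    let guard = epoch::pin();
    let mut node = Owned::new(Node { value, next: Atomic::null() });
    loop {
        let current = head.load(Ordering::Acquire, &guard);
        node.next.store(current, Ordering::Relaxed);
        match head.compare_exchange(current, node, Ordering::AcqRel, Ordering::Acquire, &guard) {
            Ok(_) => break,
            // On contention, take the node back and retry.
            Err(err) => node = err.new,
        }
    }
}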
tobz commented 4 months ago

They shouldn't be hanging around in memory indefinitely. What's more likely is that the histograms aren't being drained often enough.

Are you running the Prometheus exporter in the default scrape mode? If so, how often are you scraping the endpoint? Doing a scrape drives the logic that moves histogram values from their AtomicBucket in the Registry, into the condensed form in the exporter state.

This is something we've addressed in the latest (0.14.0) release where they'll be drained every five seconds by default.
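To make the accumulate-until-drained behavior concrete, here is a minimal sketch assuming metrics-util exposes AtomicBucket with new, push, and clear_with at the crate root; the method names and closure shape are best-effort assumptions, not an excerpt of the exporter's actual drain path.

use metrics_util::AtomicBucket;

fn main() {
    // Sketch: assumes AtomicBucket::new/push/clear_with are available as shown.
    let bucket: AtomicBucket<f64> = AtomicBucket::new();

    // Every histogram observation made between two scrapes lands in the
    // bucket and stays allocated there.
    for i in 0..10_000 {
        bucket.push(i as f64);
    }

    // A scrape (or, in 0.14.0, the periodic upkeep task) drains the bucket,
    // which is what finally allows the underlying storage to be reclaimed.
    // The exporter would fold these raw values into its condensed form here.
    bucket.clear_with(|values| {
        let _count = values.len();
    });
}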

IvanUkhov commented 4 months ago

Are you running the Prometheus exporter in the default scrape mode? If so, how often are you scraping the endpoint?

I am on Kubernetes Engine with Autopilot, and I have not done much customization. I simply make it listen on a dedicated port via PrometheusBuilder

PrometheusBuilder::new().with_http_listener(address).install()

and have the endpoint scraped every 30 seconds:

---
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: monitoring
spec:
  selector:
    matchLabels:
      instrumentation: service
  endpoints:
    - port: instrumentation
      interval: 30s

So the scraping is supposed to happen every 30 seconds, but the memory-consumption curves keep on growing for hours and days until the service happens to be redeployed. And metrics do show up in Cloud Monitoring; it is not that they are not collected.

Doing a scrape drives the logic that moves histogram values from their AtomicBucket in the Registry, into the condensed form in the exporter state.

Yeah, I naively expected there to be no raw-value accumulation inside, just direct bucketization. Now I know that is not the case. The example I gave is then invalid, as I was running it locally without any scraping; I thought that would not matter, observed the same behavior as in the cluster, and concluded that was it.

I will try to do some local scraping to confirm.

tobz commented 4 months ago

So, if scraping is happening in your real workload and memory is still increasing, then based on your code snippet, it sounds like maybe you're just accumulating idle metrics over time?

This is common with metrics that have unbounded cardinality, such as tags based on request IDs or user IDs or any sort of value that has a high cardinality in general.
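As a hedged illustration of the difference, assuming the metrics 0.22-style macro API where histogram! returns a handle (the metric and label names below are made up):

use metrics::histogram;

// Bounded cardinality: a static name plus a small, fixed set of label values
// yields a handful of series, so exporter state stays flat.
fn record_request(duration_seconds: f64, endpoint: &'static str) {
    histogram!("request_duration_seconds", "endpoint" => endpoint).record(duration_seconds);
}

// Unbounded cardinality: a per-request label value creates a brand-new series
// on every call, and without an idle timeout the exporter keeps all of them.
fn record_request_by_id(duration_seconds: f64, request_id: String) {
    histogram!("request_duration_seconds", "request_id" => request_id).record(duration_seconds);
}

fn main() {
    // Without a global recorder installed these calls are no-ops, which is
    // enough to show the API shape.
    record_request(0.042, "/health");
    record_request_by_id(0.042, String::from("req-123")); // hypothetical request ID
}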

If those metrics are emitted sparsely, the exporter is going to hold on to them effectively forever, so the exporter state will just keep growing over time. This can be controlled by configuring the idle timeout (PrometheusBuilder::idle_timeout), which controls how long a given metric can go without being updated before it is removed from the registry.
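A minimal sketch of wiring that in, assuming idle_timeout takes a MetricKindMask from metrics-util and an Option<Duration>; the ten-minute value and the address are just examples:

use std::net::SocketAddr;
use std::time::Duration;

use metrics_exporter_prometheus::PrometheusBuilder;
use metrics_util::MetricKindMask;

fn main() {
    // Hypothetical listen address; a real service would take it from configuration.
    let address: SocketAddr = "0.0.0.0:9000".parse().expect("valid socket address");

    PrometheusBuilder::new()
        .with_http_listener(address)
        // Assumed signature: idle_timeout(MetricKindMask, Option<Duration>).
        // Expire any metric that has not been updated for ten minutes, so
        // sparsely emitted series do not pile up in the exporter state.
        .idle_timeout(MetricKindMask::ALL, Some(Duration::from_secs(600)))
        .install()
        .expect("failed to install Prometheus recorder/exporter");
}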

IvanUkhov commented 4 months ago

it sounds like maybe you're just accumulating idle metrics over time? This is common with metrics that have unbounded cardinality, such as tags based on request IDs or user IDs or any sort of value that has a high cardinality in general.

If you mean that, for instance, the name of the metric includes something that varies from request to request and makes each one unique, then no. It is just around five or so metrics with static names.

If those metrics are emitted sparsely, the exporter is going to hold on to them effectively forever, so the exporter state will just keep growing over time. This can be controlled by configuring the idle timeout (PrometheusBuilder::idle_timeout), which controls how long a given metric can go without being updated before it is removed from the registry.

In my setup, it is those five metrics that are being recorded upon each request. So in my mind, before you mentioned the need for scraping, it was just five histograms under the hood (that is, a fixed memory requirement).

tobz commented 4 months ago

Well, having only five unique metrics certainly squashes the theory of it being unbounded cardinality.

Does the memory growth subside if you don't install a global recorder?
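One way to run that experiment without touching the call sites could be to gate the installation behind an environment variable; this is a sketch, and the ENABLE_METRICS name is made up. With no recorder installed, the metrics macros are no-ops, so any remaining growth has to come from elsewhere.

use std::net::SocketAddr;

use metrics_exporter_prometheus::PrometheusBuilder;

// Hypothetical toggle: only install the exporter (and with it the global
// recorder) when ENABLE_METRICS is set, leaving everything else untouched.
fn maybe_install_exporter(address: SocketAddr) {
    if std::env::var_os("ENABLE_METRICS").is_some() {
        PrometheusBuilder::new()
            .with_http_listener(address)
            .install()
            .expect("failed to install Prometheus exporter");
    }
}

fn main() {
    let address: SocketAddr = "0.0.0.0:9000".parse().expect("valid socket address");
    maybe_install_exporter(address);
}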

IvanUkhov commented 4 months ago

Now I am also running a local Prometheus process to scrape the endpoint. The memory usage fluctuates all the time, and it is hard to tell, but there no longer seems to be any upward trend. I guess I just stumbled upon the accumulation you mentioned, which is addressed in 0.14, and misinterpreted it as a reproduction of the behavior I see in the cloud.

Does the memory growth subside if you don't install a global recorder?

Then the memory usage is as flat as it can be. I will try doing it in production to rule out metrics.

About the scraping, is it only some of the pending values that get moved out of the registry upon each scrape request, or all of them? In other words, does one have to match the frequency of the scraping with the frequency of other requests to prevent atomic blocks from accumulating?

IvanUkhov commented 4 months ago

Removed the global recorder in the cluster. The readings disappeared as expected, but the memory consumption still keeps growing. So it is something else, not the recorder at least. And I suppose that, without a recorder, the metrics macros are no-ops, so it must be some other code.

I apologize for the hassle and thank you for the help!

tobz commented 3 months ago

No worries at all, and good luck tracking down your leak! 🔍