open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
2.88k stars, 2.26k forks

Cardinality or memory limit for prometheus exporters #33540

Open LeoQuote opened 2 months ago

LeoQuote commented 2 months ago

Component(s)

exporter/prometheus, exporter/prometheusremotewrite

Is your feature request related to a problem? Please describe.

When the collector receives metrics, they occupy memory, and that memory is not released after the workload stops sending them.

Memory growth may lead to memory limits being exceeded or excessively frequent garbage collection (GC), resulting in efficiency issues. Additionally, an excess of useless metrics can also cause storage and memory pressure on Prometheus.

Describe the solution you'd like

Provide a way to automatically expire stale metrics at the collector level. This would relieve pressure on both the collector and Prometheus.

Describe alternatives you've considered

Setting a cardinality limit could be another approach. If the limit is exceeded, the process should either exit or clean up the metrics. Developers can then monitor process restarts to detect potential issues in real time.

Additional context

Could be related to:
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/32511
https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/33324

github-actions[bot] commented 2 months ago

Pinging code owners:

wildum commented 2 months ago

Hi, isn't the memory released via the metric_expiration parameter in the prometheusexporter?

LeoQuote commented 2 months ago

Yes, it is released, but the number of metrics can spike to a very high level within a short time, before any of them expire.