discourse / prometheus_exporter

A framework for collecting and aggregating prometheus metrics
MIT License

Sidekiq Stats Collector Memory Leak #240

Open gnomus opened 2 years ago

gnomus commented 2 years ago

Hello there,

We have investigated an issue with one of our prometheus_exporter setups, which stopped working. After a certain period of runtime, the collector started to report collector_working with a value of 0 and all our instrumentation metrics went missing. The only difference we found compared to another setup of ours, which has no issues, was the use of the SidekiqStats instrumentation, so we looked into that.

It seems that in the SidekiqStatsCollector new observations are appended to the sidekiq_metrics object, which keeps growing as a result. After a certain amount of time the collector is no longer able to generate the metric text within the configured default timeout of 2 seconds, and we start seeing the behavior described above.

By comparison, the SidekiqQueueCollector seems to have a mechanism to clean up older observations.

Our guess is that the SidekiqStatsCollector needs something similar.
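
To illustrate the kind of cleanup we have in mind, here is a minimal sketch. It is not the gem's actual code: the class name PruningStatsCollector, the MAX_AGE_SECONDS constant, the "created_at" key, and the injectable clock are all assumptions made for this example. The idea is simply to drop observations older than a fixed window each time a new one arrives, so the stored list stays bounded.

```ruby
# Hypothetical stand-in for a stats collector; names are illustrative only
# and do not mirror the real prometheus_exporter classes.
class PruningStatsCollector
  MAX_AGE_SECONDS = 60 # assumed retention window

  attr_reader :observations

  # The clock is injectable so the pruning logic can be exercised
  # without waiting for real time to pass.
  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @observations = []
  end

  # Drop entries older than MAX_AGE_SECONDS, then store the new
  # observation stamped with its arrival time.
  def collect(obj)
    now = @clock.call
    @observations.delete_if { |o| o["created_at"] + MAX_AGE_SECONDS < now }
    @observations << obj.merge("created_at" => now)
  end
end
```

With pruning on every collect call, the list holds at most the observations received in the last MAX_AGE_SECONDS, so the time needed to render the metrics should no longer grow with uptime.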

NickLarsenNZ commented 1 year ago

Does #256 close this issue?