discourse / prometheus_exporter

A framework for collecting and aggregating prometheus metrics
MIT License

avoid global mutex for better concurrency #306

Open taylorchu opened 10 months ago

taylorchu commented 10 months ago

When the rate of /send_metrics requests is high, both /metrics and /ping time out. Because /ping is used for the liveness check, the pod gets killed and we end up dropping metrics. Initially, I thought this was related to WEBrick https://github.com/discourse/prometheus_exporter/issues/146, but the more likely cause is this global mutex: https://github.com/discourse/prometheus_exporter/blob/239e2c60f93ecbb67e5701e3abb670f1a2783e5f/lib/prometheus_exporter/server/collector.rb#L10

We have about 800 metrics, and remote /send_metrics requests arrive at about 1000/s.
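
To make the suspected contention concrete, here is a minimal sketch of the pattern, not the actual prometheus_exporter collector. Both class names and their methods are hypothetical, and the counters stand in for real metric state. The first class puts every write and read behind one lock; the second assumes the work can be split by metric type and gives each type its own lock.

```ruby
# Hypothetical collector: every operation shares a single lock, so a flood
# of writes makes any reader wait on the same mutex.
class GlobalLockCollector
  def initialize
    @mutex = Mutex.new
    @counts = Hash.new(0)
  end

  # Stand-in for handling one /send_metrics payload.
  def process(type)
    @mutex.synchronize { @counts[type] += 1 }
  end

  # Stand-in for rendering /metrics.
  def snapshot
    @mutex.synchronize { @counts.dup }
  end
end

# Hypothetical alternative: one lock per metric type. Writers of different
# types do not block each other, and a reader holds each lock only briefly.
class PerTypeLockCollector
  def initialize(types)
    @shards = types.to_h { |t| [t, { mutex: Mutex.new, count: 0 }] }.freeze
  end

  def process(type)
    shard = @shards.fetch(type)
    shard[:mutex].synchronize { shard[:count] += 1 }
  end

  def snapshot
    @shards.to_h { |t, s| [t, s[:mutex].synchronize { s[:count] }] }
  end
end
```

Whether per-type locking is enough here depends on how long each lock is held and on how the web server schedules requests, so this only illustrates the direction the title suggests.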