akoutmos / prom_ex

An Elixir Prometheus metrics collection library built on top of Telemetry with accompanying Grafana dashboards
MIT License

[BUG] metrics are very slow due to number of subdomains #227

Open feld opened 6 months ago

feld commented 6 months ago

This bug is related to #183

As it turns out, I have an application that is served from a wildcard subdomain. Our application is rather unusual: it has a nearly unlimited number of possible subdomains, each of which can produce unique content. Once the application has been running for a couple of hours, the metrics endpoint starts to time out for our Prometheus scraper.

On one of our nodes I'm currently measuring ~2900 unique domains known to PromEx, which causes the metrics endpoint to take between 5 and 10 seconds to respond. The resulting output is 218873 lines long. That's a lot of metrics!
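For a rough sense of scale, the two figures above can be turned into a per-host estimate. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and it assumes that host-tagged series account for most of the output, which I have not verified (some lines will be HELP/TYPE comments or metrics that carry no host tag).

```python
# Figures quoted above; the host count is approximate.
UNIQUE_HOSTS = 2900
OUTPUT_LINES = 218_873


def lines_per_host(total_lines: int, hosts: int) -> float:
    """Average number of exposition lines attributable to each host.

    Assumption: nearly every line carries a host tag, so the output
    grows roughly linearly with the number of distinct hosts.
    """
    if hosts <= 0:
        raise ValueError("hosts must be positive")
    return total_lines / hosts


if __name__ == "__main__":
    per_host = lines_per_host(OUTPUT_LINES, UNIQUE_HOSTS)
    print(f"~{per_host:.1f} lines per host")
```

Under that assumption each new subdomain adds on the order of 75 lines to every scrape, so the payload keeps growing for as long as new hosts keep showing up.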

Would it be possible to disable the unique host tag that is added to all the metrics? We don't need them broken down per host like this.