grafana / mimir

Grafana Mimir provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus.
https://grafana.com/oss/mimir/
GNU Affero General Public License v3.0

Deprecate thanos_ metrics prefix #4241

Open lamida opened 1 year ago

lamida commented 1 year ago

Is your feature request related to a problem? Please describe.

Mimir exposes some metrics with a thanos_ prefix. This happens because we use some packages from Thanos, specifically the cache package.

Example:

thanos_cache_dns_failures_total
thanos_cache_dns_lookups_total
thanos_cache_dns_provider_results
thanos_cache_getmulti_gate_duration_seconds_bucket
thanos_cache_getmulti_gate_duration_seconds_count
thanos_cache_getmulti_gate_duration_seconds_sum
thanos_cache_getmulti_gate_queries_concurrent_max
thanos_cache_getmulti_gate_queries_in_flight
thanos_cache_hits_total
...

Describe the solution you'd like

Deprecate the metrics with the thanos_ prefix and add new metrics with the cortex_ prefix instead. We can use TeeRegisterer to achieve this.
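
To make the tee idea concrete, here is a minimal, language-agnostic sketch of how a single registration could be exposed under both the legacy and the new prefix during a deprecation window. It is written in Python purely for illustration (Mimir itself is written in Go), and every class in it (Registry, PrefixRenamingRegistry, TeeRegistry) is a hypothetical stand-in, not the TeeRegisterer mentioned above or any Prometheus client API. The resulting cortex_ name is likewise only an example.

```python
# Hypothetical sketch: none of these classes exist in Mimir or in a
# Prometheus client library. They only illustrate the "tee" pattern.


class Registry:
    """Minimal stand-in for a metrics registry: metric name -> collector."""

    def __init__(self):
        self.collectors = {}

    def register(self, name, collector):
        if name in self.collectors:
            raise ValueError(f"duplicate metric: {name}")
        self.collectors[name] = collector


class PrefixRenamingRegistry:
    """Wraps a registry and replaces a leading old_prefix with new_prefix."""

    def __init__(self, inner, old_prefix, new_prefix):
        self.inner = inner
        self.old_prefix = old_prefix
        self.new_prefix = new_prefix

    def register(self, name, collector):
        if name.startswith(self.old_prefix):
            name = self.new_prefix + name[len(self.old_prefix):]
        self.inner.register(name, collector)


class TeeRegistry:
    """Forwards every registration to all wrapped registries."""

    def __init__(self, *registries):
        self.registries = registries

    def register(self, name, collector):
        for registry in self.registries:
            registry.register(name, collector)


# One registration, two exposed names: the legacy one (to be deprecated)
# and the renamed one.
base = Registry()
tee = TeeRegistry(base, PrefixRenamingRegistry(base, "thanos_", "cortex_"))

hits = object()  # placeholder for a real counter
tee.register("thanos_cache_hits_total", hits)

print(sorted(base.collectors))
# ['cortex_cache_hits_total', 'thanos_cache_hits_total']
```

With this shape, the code that creates the metrics stays unchanged. Removing the legacy names later would only mean replacing the tee with the renaming wrapper.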

Describe alternatives you've considered

Additional context

See this (internal link).

pracucci commented 1 year ago

We also expose Mimir metrics with the cortex_ prefix instead of mimir_, for the same legacy reasons as the thanos_ ones (nowadays Mimir doesn't vendor Thanos anymore). Unfortunately, a massive metrics renaming is a pain, so we have never done it (yet).

nabadger commented 1 year ago

This would be very useful and would also be consistent with a similar change done in Tempo.

In particular, it's a bit painful to manage migrations from Cortex to Mimir because the two can trigger each other's alerts (we could mitigate this in https://github.com/grafana/mimir/issues/5260 though).