grafana / mimir

Grafana Mimir provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus.
https://grafana.com/oss/mimir/
GNU Affero General Public License v3.0

Allow dropping additional labels in distributor #9711

Open tiithansen opened 1 week ago

tiithansen commented 1 week ago

Describe the feature request

We have a tiered Prometheus setup where each tier has its own responsibility. Because of this we track HA labels differently. We have three labels in total: cluster, which is used in queries; __prometheus_type__, which indicates the tier a Prometheus belongs to; and __replica__, which indicates the replica number within the tier. Because Mimir only drops the __replica__ label, we are left with the __prometheus_type__ label, which we would also like to get rid of.

The reason for this setup is that if one tier becomes unstable, the others are unaffected.

For example:

{cluster="prod-1", __prometheus_type__="business-shard-1", __replica__="1"}
{cluster="prod-1", __prometheus_type__="business-shard-0", __replica__="0"}
{cluster="prod-1", __prometheus_type__="system-shard-0", __replica__="1"}

Describe the solution you'd like

Allow specifying in the configuration which additional labels the distributor should drop from received time series.

The configured labels could easily be dropped here.
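
For illustration, something along these lines (the option name additional_drop_labels is purely hypothetical, just to sketch the request):

limits:
  # hypothetical per-tenant option, not part of Mimir today:
  # labels the distributor would drop after the HA tracker has run
  additional_drop_labels:
    - __prometheus_type__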

Alternatives

I have tried drop_labels, but it seems to run before the HA tracker and it breaks ingestion.
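
What I tried was roughly this (assuming the YAML form of the per-tenant drop_labels limit; because it is applied before the HA tracker, __prometheus_type__ is already gone when deduplication runs):

limits:
  # drops the label too early, before the HA tracker sees it
  drop_labels:
    - __prometheus_type__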

narqo commented 1 week ago

Answering the specific request:

Mimir supports metric_relabel_configs, which the distributor applies after the HA tracker. Going by the history, it was originally implemented for https://github.com/cortexproject/cortex/issues/1507, but it has remained a niche, experimental feature since then. There are some details on how to use it in https://github.com/grafana/mimir/issues/1809.
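
For example, dropping the extra label per tenant would look roughly like this (the relabel rule uses the standard Prometheus relabel_config syntax; tenant-1 is a placeholder tenant ID, and treat the exact placement under the per-tenant overrides as a sketch):

overrides:
  tenant-1:
    metric_relabel_configs:
      # drop the __prometheus_type__ label after the HA tracker has run
      - action: labeldrop
        regex: __prometheus_type__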

Note that the config flag comes with a warning:

in most situations, it is more effective to use metrics relabeling directly in the Prometheus server, e.g. remote_write.write_relabel_configs.
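
For completeness, the Prometheus-side variant the warning refers to would look roughly like this (the URL is a placeholder); note that in your setup this strips the label before Mimir's HA tracker ever sees it, so it likely would not help here:

remote_write:
  - url: https://mimir.example.com/api/v1/push
    write_relabel_configs:
      # drop the label on the Prometheus side, before remote write
      - action: labeldrop
        regex: __prometheus_type__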


We have three labels in total: cluster, which is used in queries; __prometheus_type__, which indicates the tier a Prometheus belongs to; and __replica__, which indicates the replica number within the tier.

I cannot say I fully understand this setup. Do the different "prometheus_type" Prometheuses scrape the same set of metrics or not? If yes, then wouldn't removing the __prometheus_type__ label break things, no matter whether this happens before or after the HA tracker? It seems the distributor would end up ingesting a set of duplicate metrics within one cluster label (provided __prometheus_type__ and __replica__ were removed as per your HA tracking rule).

tiithansen commented 1 week ago

One thing I forgot to mention is that the cluster label in the HA tracker is configured to __prometheus_type__, but we also add a regular cluster label when we remote write to Mimir.
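
On the Mimir side that is roughly the following per-tenant configuration (field names as I recall them from the limits block; please double-check against the docs):

limits:
  accept_ha_samples: true
  ha_cluster_label: __prometheus_type__
  ha_replica_label: __replica__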

Prometheuses with a different __prometheus_type__ scrape different metrics from different services. For example, __prometheus_type__="system" scrapes only metrics from Kubernetes components, node exporters and so on, and __prometheus_type__="business" scrapes only metrics from applications developed by our developers.

This way, if some business app explodes with cardinality, we will still receive all system metrics as well as metrics from the other shards of business Prometheuses.