prometheus / alertmanager

Prometheus Alertmanager
https://prometheus.io
Apache License 2.0

feature request - templated route matcher expressions #4032

Open grozan opened 4 days ago

grozan commented 4 days ago

Hi,

I see that I can define matcher rules like this:

route:
  routes:
    - matchers:
        - alertname =~ ".*foo.*"
      receiver: my_receiver

What I'd like is the ability to use the alert's labels and the available template functions to construct that regex dynamically, with something like:

route:
  routes:
    - matchers:
        - alertname =~ ".*{{ .CommonLabels.some_label  }}.*"
      receiver: my_receiver

or even:

route:
  routes:
    - matchers:
        - alertname =~ ".*{{ .CommonLabels.some_label | reReplaceAll 'foo' 'bar'  }}.*"
      receiver: my_receiver

That would let admins define a small but dynamic set of rules, which in turn would let users tweak the behavior of the alerting system simply by choosing the labels and values of their alerts.

dwilliams782 commented 4 days ago

Being able to set things like group_by, group_interval, repeat_interval dynamically in the routes based on labels would be extremely useful.

grobinson-grafana commented 4 days ago

I don't understand how this would work. For example, CommonLabels does not exist until after an alert has matched a route, so it cannot be used in the route definition.

grozan commented 3 days ago

@grobinson-grafana ah OK, sorry about that, I was not aware of the CommonLabels limitation. So it would need a different syntax, but hopefully the point is still clear enough.

Matcher rules compare a "property" of an alert (its alertname, severity, other_label_the_alert_has, etc.) with an expression. The ask here is to allow dynamically computed expression strings (the right-hand side) instead of hard-coded ones like the static .*foo.*, so that the expression string can be derived from, for example, the value of other_label_the_alert_has.

The terminology is certainly wrong, but hopefully it still makes sense.
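
To make the request concrete, here is a minimal Go sketch of what evaluating such a matcher could look like for a single alert. It is not Alertmanager code: renderAndMatch is a hypothetical helper, the .Labels template field is an assumed name for the alert's own labels, and the anchoring is written to mirror the way Alertmanager anchors regex matchers.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
	"text/template"
)

// renderAndMatch is a hypothetical helper, not part of Alertmanager.
// It renders patternTmpl against the alert's labels, compiles the result
// as a regular expression, and tests it against the label named by key.
func renderAndMatch(patternTmpl, key string, labels map[string]string) (bool, error) {
	tmpl, err := template.New("matcher").Parse(patternTmpl)
	if err != nil {
		return false, err
	}

	// Expose the labels under an assumed ".Labels" field.
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, map[string]any{"Labels": labels}); err != nil {
		return false, err
	}

	// The rendered pattern differs per alert, so it has to be compiled here,
	// at match time, instead of once when the configuration is loaded.
	re, err := regexp.Compile("^(?:" + buf.String() + ")$")
	if err != nil {
		return false, err
	}
	return re.MatchString(labels[key]), nil
}

func main() {
	labels := map[string]string{"alertname": "disk_full_db", "team": "db"}
	ok, err := renderAndMatch(`.*{{ .Labels.team }}.*`, "alertname", labels)
	fmt.Println(ok, err)
}
```

Note that the regex compilation moves from configuration load time to match time, because the pattern is only known once the alert's labels are available.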

grobinson-grafana commented 3 days ago

If I understand correctly, you want to be able to do something like this:

route:
  routes:
    - matchers:
        - alertname =~ "foo{{ .Labels.bar  }}.*"

But I still don't quite understand what advantage this has for Alertmanager users.

However, the bigger problem is that I think this would be almost impossible to implement, for one simple reason: these regexes need to be compiled and cached in memory. Right now, the number of regexes equals the number of matchers across all routes. If we had to recompile the regexes for every alert, we could end up compiling millions of them (suppose 10,000 alerts × 10 routes × 10 matchers each = 1,000,000 regular expressions). This would cause enormous CPU spikes, and I just don't see how we could optimize it.
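
The arithmetic above can be illustrated with a small Go simulation. It is only a sketch under stated assumptions: the regexCache type and the simulate function are hypothetical, the pattern strings are made up, and the templated case assumes the worst case in which every alert renders a distinct pattern, so that a cache keyed by the rendered pattern never gets a hit.

```go
package main

import (
	"fmt"
	"regexp"
)

// regexCache is a hypothetical cache that memoizes compiled regexes by
// pattern string and counts how many compilations actually happened.
type regexCache struct {
	compiled map[string]*regexp.Regexp
	compiles int
}

func (c *regexCache) get(pattern string) (*regexp.Regexp, error) {
	if re, ok := c.compiled[pattern]; ok {
		return re, nil
	}
	re, err := regexp.Compile("^(?:" + pattern + ")$")
	if err != nil {
		return nil, err
	}
	c.compiled[pattern] = re
	c.compiles++
	return re, nil
}

// simulate evaluates every matcher of every route for every alert and
// returns the number of regex compilations that were needed.
func simulate(alerts, routes, matchers int, templated bool) (int, error) {
	c := &regexCache{compiled: map[string]*regexp.Regexp{}}
	for a := 0; a < alerts; a++ {
		for r := 0; r < routes; r++ {
			for m := 0; m < matchers; m++ {
				// Static matcher: the pattern depends only on the route config.
				pattern := fmt.Sprintf("route%d_matcher%d_.*", r, m)
				if templated {
					// Assumed worst case: the rendered pattern is unique per alert.
					pattern = fmt.Sprintf("route%d_matcher%d_alert%d_.*", r, m, a)
				}
				if _, err := c.get(pattern); err != nil {
					return 0, err
				}
			}
		}
	}
	return c.compiles, nil
}

func main() {
	// Scaled down from 10,000 alerts to 100 so the sketch runs quickly.
	static, _ := simulate(100, 10, 10, false)
	templated, _ := simulate(100, 10, 10, true)
	fmt.Println("static compiles:", static)
	fmt.Println("templated compiles:", templated)
}
```

With static matchers the count stays at routes × matchers regardless of the number of alerts, while in the assumed worst case the templated count grows linearly with the number of alerts.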