Closed calind closed 3 years ago
+1
+1. It would be nice to be able to add alerting on servers with a short lifetime (AWS auto scaling). Auto-registering a server in Grafana is easy with templating, but it's sad not to be able to put alerts on them.
@bergquist it's impractical to use All when, for example, you have more than a dozen hosts.
If, for example, only a few of them are failing (let's say 5), it is very useful to receive an email for each failing alert. This also makes it much easier to integrate with other tools, which in general expect one alert per metric.
The current approach (using all) is pretty neat though when there are fewer instances or when you are alerting at service level (eg. # of jobs in queue).
What @calind said: I've got multiple $host variables which work fine with InfluxDB but not with the alerts.
+1 as well.
Just a thought, since you are able to query with a template variable, wouldn't you just be able to do the same query with the alerting metrics and maybe iterate through the results to see which meet the alert criteria?
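A minimal sketch of the idea above, in Python: expand the templated query once per variable value, evaluate each result against the threshold, and report only the values that fail. The `expand_query` and `failing_values` helpers, the `cpu.$host` query, and the `fake_fetch` datasource stand-in are all hypothetical and are not part of Grafana.

```python
from typing import Callable, Dict, List


def expand_query(template: str, var_name: str, values: List[str]) -> Dict[str, str]:
    """Return one concrete query per template value, keyed by that value."""
    return {v: template.replace(f"${var_name}", v) for v in values}


def failing_values(
    queries: Dict[str, str],
    fetch: Callable[[str], float],
    threshold: float,
) -> List[str]:
    """Run every expanded query and keep the values whose result exceeds the threshold."""
    return [value for value, query in queries.items() if fetch(query) > threshold]


# Hypothetical stand-in for a real datasource call; returns canned samples.
def fake_fetch(query: str) -> float:
    samples = {"cpu.web1": 35.0, "cpu.web2": 97.5, "cpu.db1": 91.0}
    return samples[query]


queries = expand_query("cpu.$host", "host", ["web1", "web2", "db1"])
alerts = failing_values(queries, fake_fetch, threshold=90.0)
print(alerts)  # ['web2', 'db1']
```

Each entry in `alerts` could then become its own notification, which is the per-host behaviour asked for earlier in the thread.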
@NotSoCleverLogin It would be possible. But would you want the behavior of an alert rule to change based on which template values are selected?
Using the all option for the template is the only way that makes sense for me.
+1
I have a setup of X environments with the same components in each environment. We are currently using prometheus to alert on e.g cpu usage/disk usage etc. There we specify an alert for a query, and when the alert is triggered it will just state which environment the alert was triggered from.
If we would do this with the All variable, that would work to some extent. But, using @calind's example, the screenshot would be filled with the trend of all cpus from all of my environments, and not just the environment where I would want to be informed about said problem. The graph will (or can) be obscured with information from other environments. In some scenarios it could be interesting to compare cpu in other environments, but there are no guarantees that what is happening in a test environment is happening in our production environment, etc.
We are also looking into creating dashboards that can be used by operations, showing annotations for alerts in the "standard" overview dashboard. Given that we use 'env' template variables for these kinds of dashboards, it's not really possible for us to do that with how it is implemented right now. I would have to manually (at least to some extent) generate a "shadow" dashboard where the alerts are triggered (which makes me lose the annotations in the overview dashboard).
Another thing I think template variables can help you do is to route the alerts (should you choose to implement such a feature) to different sources (some to operations if in production, to qa/developers if in test environments etc).
+1 for supporting alerts on templated queries.
@bergquist, some dashboards don't have an All option. For example, system metrics by collectd (https://grafana.net/dashboards/24). Having an All option would certainly not be practical for, let's say, 10 or more servers. Hence the need to iterate through template variables.
Allowing use of All is a good and welcomed start.
In Prometheus, queries need to be written in a different way to allow All:
some.metric{hostname=~"$Hostname"}
Notice the extra tilde there, allowing for regular expression searching (and the wildcard in All).
I have not benchmarked the possible performance impact of going from a straight query to a regex search query but at least for now it would apparently solve our problems.
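To illustrate why the `=~` matcher is needed, here is a rough Python sketch of how a multi-value selection could be turned into a single regex alternation before the query is sent. The `to_regex_value` and `interpolate` helpers are hypothetical and only approximate what a dashboard tool might do; in particular, this sketch does not handle the extra backslash escaping a PromQL string literal would need for values containing regex metacharacters.

```python
import re
from typing import List


def to_regex_value(values: List[str]) -> str:
    """Join the selected values into one regex; a single value stays as-is (escaped)."""
    escaped = [re.escape(v) for v in values]
    if len(escaped) == 1:
        return escaped[0]
    return "(" + "|".join(escaped) + ")"


def interpolate(query: str, var_name: str, values: List[str]) -> str:
    """Substitute $var_name in the query with the regex built from the selection."""
    return query.replace(f"${var_name}", to_regex_value(values))


query = 'some.metric{hostname=~"$Hostname"}'
print(interpolate(query, "Hostname", ["web1"]))
# some.metric{hostname=~"web1"}
print(interpolate(query, "Hostname", ["web1", "web2"]))
# some.metric{hostname=~"(web1|web2)"}
```

With a plain `=` matcher the second query would look for a host literally named `(web1|web2)`, which is why the tilde matters once more than one value is selected.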
+1
+1
not sure how it should be implemented, just know it's needed..
+1 We use Prometheus as the datasource to monitor our Kubernetes infrastructure for both our on-prem K8S clusters and our AWS K8S clusters. All of our dashboards use templated variables for the datasource ($Environment), $Instance/Node, $Namespace, and $Pod. Because of the way Prometheus queries are structured, all of our queries contain templated variables, which prevents the alert rules from being saved. I would love to see templated variable queries added to alerting.
+1
+1 We use templated dashboards for a multi-server environment, which is the logical way (and what many people do), so we can't use alerting with Grafana right now. The only options are to maintain a separate non-templated dashboard or to set up alerting in Prometheus itself, which is not easy.
Perhaps if there was an option or simple way to save/export a dashboard with the template variables baked/pre-rendered into all the fields... this would perhaps be a good halfway point until another solution is found.
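As a rough sketch of such an export step (not an existing Grafana feature), the Python below walks a dashboard-like JSON structure and replaces every `$name` reference with a fixed value. The `bake` helper and the simplified dashboard shape are assumptions made for illustration; real dashboard JSON has more fields, and other variable syntaxes such as `[[name]]` are not handled here.

```python
import json
from typing import Any, Dict


def bake(node: Any, variables: Dict[str, str]) -> Any:
    """Return a copy of a JSON-like structure with $name replaced in every string."""
    if isinstance(node, str):
        # Replace longer names first so $host does not clobber $hostname.
        for name in sorted(variables, key=len, reverse=True):
            node = node.replace(f"${name}", variables[name])
        return node
    if isinstance(node, list):
        return [bake(item, variables) for item in node]
    if isinstance(node, dict):
        return {key: bake(value, variables) for key, value in node.items()}
    return node  # numbers, booleans and None pass through unchanged


# Simplified, hypothetical dashboard shape used only for this example.
dashboard = {
    "title": "CPU for $host",
    "panels": [
        {"targets": [{"expr": 'cpu_usage{instance="$host", env="$env"}'}]}
    ],
}

baked = bake(dashboard, {"host": "web1", "env": "prod"})
print(json.dumps(baked, indent=2))
```

The original `dashboard` is left untouched, so the templated version can stay in use for browsing while the baked copy carries the alert rules.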
+1 for supporting alerts on templated queries. We currently use templating on all our dashboards so can't take advantage of this really cool feature.
+1, we have a lot of templated dashboards, and we can't use alerting for now. We have to duplicate dashboards to get alerts, and so we lose the power of templating.
+1, Almost all of our dashboards use template variables (and nested template variables).
We would like to be able to set alerts on repeat panels to get individual alerts per template-variable group if needed. Plus this means that the alerting is dynamic and not super manual as it is now.
DANGER: Variables in theory will be good to have, but we need to keep in mind that if some guy goes into your dashboard and changes the value and saves, the resulting alerting will be affected. Don't know if that's ok behaviour or not, will be complicated.
+1
When working with grafana it feels like templating is encouraged everywhere and it feels wrong to create an extra set of graphs not using variables just to use the alerting feature...
+1 for supporting alerts on templated queries. also, we found that when we use Chinese ruleName or Chinese title, we received abnormal email with rule triggered. For example, we expected “个股分时线接口请求时间(getTimeTrend) alert” but received "个è¡åæ¶çº¿æ¥å£è¯·æ±æ¶é´(getTimeTrend) alert", maybe the charset is not correct.
+1 to implement templated vars in alerts
+1
+1 would be a great addition
+1
+1 to implement templated vars in alerts
+1
+1 looking forward to it
+1
+1
+1
+1
Please stop writing +1!
Everybody that has subscribed to this issue will get an email...
There is a GitHub feature just to get rid of those +1 comments:
https://github.com/blog/2119-add-reactions-to-pull-requests-issues-and-comments
@thetechnick There is a link in the e-mail where you can mute the thread and not receive any e-mails. But I understand that you might want to just get notified when the feature is complete, but I also like to get the issue bumped so that it hopefully will get worked on sooner :)
Great progress on alerting overall. For the template variables in alerting, I am missing it as well. +1 :D
On top of that, there might be a bug in the way Grafana detects whether a metric used in a query uses template variables.
When you have a series which uses template variables indirectly, Grafana does not stop you from adding that series to an alert. The alert obviously does not work correctly.
See #K (it uses #D, which uses #A, and #A uses a template variable):
I could still select it:
Templates everywhere, which means alerting nowhere. Not sure how the alerting has been implemented, but for a simple graph the query gets "translated", with template variables substituted by their values, before a call is made to the data source, right? So why not in this case? In any case, as said before, with almost all of my queries using template variables, alerting is completely off the table for me. Please, could you implement it so that we don't have to move alerting outside Grafana? Thanks a lot!
I think we should recognise that alerting with templating is not trivial, and I think the All option is the way to go because we don't want our alerts changing when someone is using the dashboard. But Grafana would still have to create new alerts if the template query returns new results, which happens quite often as we scale our apps. This leads to more problems if you are using InfluxDB: many of us are using tags/tag values, I guess, and there is no time filter for them, so Grafana would create alerts for every service that ever existed on any host...
+1
Just allowing the datasource to be specified in alerting would be OK for me. It won't break any logic, and I can specify at least production and staging environments to watch.
ALL is an option, sure. More flexible would be a recognition of the template variables in the query and letting the user set the values up in the alert condition configuration. The best, but complicated I guess, would be to have multiple alerts (the same way there are multiple queries) so that a different alert could be set up for a different template variable values in the query. This would enable the administrator to set up different alert conditions for different hosts for example.
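A small Python sketch of the last idea, assuming a default condition plus optional per-value overrides. The `evaluate` helper, the host names, and the thresholds are all made up for illustration and do not reflect any existing Grafana configuration.

```python
from typing import Dict


def evaluate(
    readings: Dict[str, float],
    overrides: Dict[str, float],
    default_threshold: float,
) -> Dict[str, bool]:
    """Map each host to True if its reading exceeds its own (or the default) threshold."""
    return {
        host: value > overrides.get(host, default_threshold)
        for host, value in readings.items()
    }


# Hypothetical readings: both hosts report 85% CPU.
readings = {"web1": 85.0, "db1": 85.0}
# The database host gets a stricter condition than the default.
overrides = {"db1": 80.0}

print(evaluate(readings, overrides, default_threshold=90.0))
# {'web1': False, 'db1': True}
```

The same reading fires for `db1` but not for `web1`, which is the kind of per-host condition described above.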
Multiple alerting profiles would be great, but for an initial pass, just providing the same template selectors as are available on the dashboard in the alerting panel would solve a lot of problems.
I also think there should be a toggle for each variable to aggregate results for that variable into a single notification; this would probably only be enabled for template variables that have multi-select enabled. This provides a simple but effective method to control the verbosity of notifications - you may want to notify only once for multiple related metrics, but notify for each host where any metric is failing. Or, you may want to notify only once for a failing metric no matter how many hosts are affected.
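One way to picture the per-variable toggle is as a grouping key: variables marked "aggregate" are dropped from the key, so all failing results that differ only in those variables collapse into one notification. The Python below is a sketch under that assumption; `group_notifications` and the sample alerts are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Alert = Dict[str, str]  # variable name -> selected value, e.g. {"host": "web1"}
GroupKey = Tuple[Tuple[str, str], ...]


def group_notifications(alerts: List[Alert], aggregate: Set[str]) -> Dict[GroupKey, List[Alert]]:
    """Group failing results; variables listed in `aggregate` do not split notifications."""
    groups: Dict[GroupKey, List[Alert]] = defaultdict(list)
    for alert in alerts:
        key = tuple(sorted((k, v) for k, v in alert.items() if k not in aggregate))
        groups[key].append(alert)
    return dict(groups)


failing = [
    {"host": "web1", "metric": "cpu"},
    {"host": "web1", "metric": "disk"},
    {"host": "web2", "metric": "cpu"},
]

# Aggregate over metrics: one notification per host.
per_host = group_notifications(failing, aggregate={"metric"})
# Aggregate over hosts: one notification per failing metric.
per_metric = group_notifications(failing, aggregate={"host"})

print(len(per_host))    # 2
print(len(per_metric))  # 2
```

With no variables aggregated, every failing result would get its own notification; with all of them aggregated, everything collapses into a single one.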
Do we have a target milestone for this bug?
I had some issues with alerting on complicated queries and queries using template variables. I've found an easy workaround, which may not be pretty, but it works for my use case. It just extracts the query after you have built it, so there are no template variables or #ROW references left. This may be obvious to you, and it's no rocket science, but for me it was a life changer.
What I do is I prepare a query:
then extract it using the Chrome dev tools (copy target parameter value):
Put it in another row (switch to toggle edit mode first):
Set up the alerting:
Voila !
@siteshbehera This is not a bug. It's a feature request.
But no, we don't have a milestone for this currently.
artificial intelligence grafana plugin should be included in commit for this feature.
Waiting for templates in Alerts too +1
I'm also very much in favor of what calind proposed as a possible implementation in the opening post. It seems to fit neatly into how many (me included) use templated dashboards, where you have one dashboard but switch/limit some variables to manually look at specific things. I think the example of the "server" variable might be the most fitting one. There, the template variable (without an All value) would become something not unlike a "tab" in my dashboard: I can switch between them to see different sets of data. It's then easy to assume that, when setting up an alert, the alert would exist for each possible "tab" separately.
It would be pretty useful if grafana would support alerting for queries using template variables. The way I see it work it would be as follows:
The current workaround is to use an invisible wildcard metric, but the problem I see with this approach is that it loses context.