Closed justinbwood closed 1 month ago
I would also like to see this. Getting three messages per alert is really annoying... I filed an issue over at Grafana, but it seems like there's something wrong with the Amazon Managed Grafana config.
I've also been running into this issue. Opened a support ticket w/ AWS and the result was basically reflecting the doc that was linked in this comment. It seems like really bad UX to spam out alerts like this... I'd be interested to hear what workarounds others have used; I'm in the process of migrating over to managing the alerts using an external alert manager, Prometheus AlertManager, instead. Would be nice to be able to provision the alert rules in Grafana though!
As a workaround, I'm using DynamoDB and a message hash in my Lambda that parses the SNS messages, like here:
https://gist.github.com/atze234/60dbef2991e08aba93b875c73578cf41
Also, I set this in delivery_policy so that there is enough time to write to the DB.
"defaultThrottlePolicy": {
"maxReceivesPerSecond": 1
},
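The hash-based approach described above can be sketched roughly as follows. This is not the code from the linked gist; it is a minimal illustration that assumes all three copies of a notification arrive with byte-identical SNS message bodies. `SeenStore`, `message_hash`, and the 300-second TTL are hypothetical names and values chosen for the example, and `SeenStore` is only an in-memory stand-in for the DynamoDB table so that the sketch runs on its own. In a real Lambda the `claim` step would presumably be a conditional write to the table, so that only the first writer succeeds.

```python
import hashlib
import time
from typing import Dict, List, Optional


def message_hash(sns_message: str) -> str:
    """Return a stable SHA-256 hex digest of an SNS message body."""
    return hashlib.sha256(sns_message.encode("utf-8")).hexdigest()


class SeenStore:
    """Hypothetical in-memory stand-in for the DynamoDB table."""

    def __init__(self, ttl_seconds: int = 300) -> None:
        self.ttl_seconds = ttl_seconds
        self._expiry_by_hash: Dict[str, float] = {}

    def claim(self, key: str, now: Optional[float] = None) -> bool:
        """Return True only for the first caller to claim `key` within the TTL."""
        now = time.time() if now is None else now
        expiry = self._expiry_by_hash.get(key)
        if expiry is not None and expiry > now:
            return False  # already seen recently: treat as a duplicate
        self._expiry_by_hash[key] = now + self.ttl_seconds
        return True


def handler(event: dict, store: SeenStore) -> List[str]:
    """Return the SNS message bodies that should be forwarded, duplicates dropped."""
    forwarded = []
    for record in event.get("Records", []):
        body = record["Sns"]["Message"]  # standard SNS-to-Lambda event shape
        if store.claim(message_hash(body)):
            forwarded.append(body)
    return forwarded
```

With this sketch, three identical records in one event (or across invocations that share the same store) result in a single forwarded message. Throttling delivery to one message per second, as in the policy above, gives the first write time to land before the next copy is checked.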
This really is needed, since the "Classic" alerting is supposedly going away soon. It makes using Slack or PagerDuty impossible when monitoring large workloads, especially since classic alerts do not allow for template variables.
+1
Is there any ETA for this, please?
Spoke to AWS team about this today. They gave an "estimate" of Q1 2024 with possibility it might be as late as Q3 2024. According to them it's not a high priority issue for them and there are other issues they need to work on before that happens.
My biggest issue with it is that with Grafana managed service - alerting is advertised as a service feature.
I guess paying customers don't get a working feature until AWS deems it worth fixing...
We are also experiencing this issue. This is a primary feature of the service, and it is extremely disappointing that Amazon doesn't prioritize primary features of its products. We have waited for 1.5 years for Amazon to make 9.4 available in AMG so that we could use the alerting that is part of 9.4. Alerting is the only feature of 9.4 that we needed. It was/is the biggest reason to upgrade to 9.4. Now, we might further delay upgrading until as late as Q3 2024, making it more than 2.5 years.
The purpose of the above rant is to add my vote to the priority of this issue.
+1
@VermaPriyanka do we have any updates on this and when should we expect a fix? This is really important to us!
FYI @VermaPriyanka this is a showstopper for us. We considered various solutions for providing an observability service to our engineering teams and settled on Managed Grafana expecting it to Just Work. Now after a significant investment of resources to get set up and put processes in place, we've hit this bug which renders the service unfit for use. Alerting is core functionality and we cannot expect other teams to accept all of their alerts appearing 3x in Slack!
We would really appreciate a fix for this ASAP or at the very least an ETA on a fix and a standard workaround until the fix arrives.
A workaround while we're waiting: https://github.com/flashbots/prometheus-sns-lambda-slack
Thank you all for the patience and for sharing workarounds. We understand that this is an important issue to solve and are working towards the same.
+1
AWS released Grafana 10.4 yesterday, and it's still an issue.
Strangely, this was their response to the alerting in HA issue.
https://docs.aws.amazon.com/grafana/latest/userguide/v10-alerting-explore-high-availability.html
Yeah, this is the WORST bug. I am not even sure how they can release with this issue. It's been a year now, and we are still stuck on the old legacy alerts because of this. That documentation almost suggests they won't fix this and that it's working as they designed it.
Thank you for voicing this concern. We are working towards a fix for the duplicate notifications issue in version 10. The description here explains the current workings of Grafana alerting, which implies rules are evaluated per HA instance. We are working towards solving this in 2 steps - focusing on solving the duplicate notifications first and to eliminate duplicate evaluations in the long term. We understand this has been a long wait, and are working towards releasing a fix soon.
Facing the same issue. Do you have any workarounds for Slack?
How fast can we get a fix for this? We are currently setting up alerting and it's a real pain to receive all alerts 3x...
Any updates on this nasty "feature"?
We are also facing the same issue and would really appreciate knowing how and when this will be fixed by AWS. Do you have an ETA for the fix, @VermaPriyanka? When is the fix supposed to be released for Managed Grafana? I am currently on Grafana v10.4.1 and still see this issue on AWS Managed Grafana.
Hi @VermaPriyanka, Any update on this issue?
This is shipping soon on Managed Grafana v10.4 workspaces. Folks who have implemented workarounds to avoid the multiple notifications, do you see any concern as this fix is shipped - any breaking experiences or impact to your alerting flow?
Will you be patching 10.4 in place? Are you releasing a new minor patch to 10.4? The above statement is slightly confusing because 10.4 has already shipped.
Hi @VermaPriyanka, we are about to implement such a workaround (detriplication on a FIFO-SQS-SNS basis or with Prometheus). If this feature is shipping soon, it might not be worth it. So can you specify the "soon" part of your post (and also answer @kevdonde's question regarding the versioning)? Thanks in advance,
@kevdonde @ingMor It will be in place for all 10.4 workspaces - new, existing or upgraded. If you have additional/more specific questions, you can send them via mail to aws-grafana-feedback@amazon.com.
That's great to hear. We have been avoiding creating alerts on Managed Grafana and creating on Prometheus or Cloudwatch, but our idea is to centralize all on Grafana.
Looking forward to the release!
@VermaPriyanka when is this fix coming? Currently, Alerting is unusable due to spam of multiple alerts
Hi @VermaPriyanka ,
Could you provide an update on when the fix will be released? Any ETA or additional details would be appreciated.
Thanks!
I'm also standing by for the workspace fix.
Hello @VermaPriyanka, please let us know when the fix will be released. Today, we updated AMG from 9.4 to 10.4, but the notification duplication issue persists. We tried setting different group_interval and group_wait options, but no luck!
Notifications are sent 3x and there is no support for Email or MS Teams integration in contact points. The notification template does nothing. Thanks for "Amazon RUINED Grafana".
They didn't ruin anything. You can run Grafana on an EC2 instance and manage it yourself, and then you have all the options you want, but you are also responsible for maintenance. If you don't want that, you are bound to the managed Grafana. All the options you just listed are marked in the documentation as currently not supported by AMG.
To get back on topic: hopefully this will be released soon and we will be able to deduplicate the messages.
Edit: BTW, you can get email notifications through SNS, but be aware that you will get 3 emails for each alert until this issue is resolved.
+1 Real pain
@VermaPriyanka: We are using Grafana OSS v11.1.3, and after upgrading from v11.0.0 we are facing this triple-firing issue. When can we expect a solution?
@Inquisitive1a This is a public roadmap for Amazon Managed Grafana. I'm unsure if you are using self-managed Grafana or Grafana Cloud.
Thank you all for being patient for this update. Notifications have been sent to existing customers using Grafana alerts in Amazon Managed Grafana, about the release of an update that will prevent multiple notifications. Will share an update here, once this is available on all Amazon Managed Grafana v10.4 workspaces.
Thanks for the update. We are getting these notifications in many projects. Customers will be happy. Let's expect 1-2 weeks for everyone to get this update :) Also check your operational contact or default contact email; you should find the exact update date in them.
According to our AWS Health Dashboard notifications, it looks like Sept 14th is the date to expect the patch that eliminates multiple alerts. You must have a v10.4 workspace to receive the update.
Starting September 14, 2024, we will release an update that prevents multiple alert notifications sent to your alert destinations/contact points [2], from Grafana managed alert rules.
This update is only available for Amazon Managed Grafana version 10.4 workspaces. If you are running Grafana version 8.4 or 9.4, you must upgrade your workspace to Grafana version 10.4 to receive this update.
Thanks all for sharing here. Would like to clarify, the release starts on 9/14, so it may be a couple of days from then for you to see the effect in your workspace, depending on which region you are in.
That's good to know! Thanks @VermaPriyanka
I can confirm that the fix is working and I am only getting 1 notification per server now.
@Diondk I am still getting 3 notifications. Do I have to make any changes to apply the patch?
No, that's not needed, but please note that the release started on 9/14. It could be a couple of days before you see the effect in your workspace.
@Diondk Which region are you using for your workspace? My workspace is in us-east-1 and there is currently no fix for it.
We are in eu-central-1 and still no fix either.
@VermaPriyanka Can you please confirm in which regions it is deployed?
We are in eu-west-1. There was also no fix for me to apply myself; it was fixed when I came into the office on Monday.
@VermaPriyanka We are yet to receive any fix. We are on version 10.4 and running AMG in eu-west-1, but alerts are still getting triggered thrice! Any ETA on when other users will receive the fix?
Thank you all for your patience. We understand the anxiety at this time and would like to inform that the update has been released for all new Amazon Managed Grafana version 10 workspaces, and the release for existing version 10 workspaces is in progress. We expect the update to be worldwide by next week. No action is required from the customers for this update. Advance notification stating that the release starts on 9/14 was sent out to inform customers about the upcoming change in alert notifications behavior.
Per the AWS Managed Grafana docs on migrating classic alerts to Grafana alerting, multiple notifications are sent when using Grafana-managed alerts.
I would like to see Grafana's high availability alerting enabled so that notifications are properly deduplicated, as it's a bit frustrating to receive Slack notifications in triplicate when using Unified Alerting.
Thanks!