aws / amazon-managed-grafana-roadmap

Amazon Managed Grafana Roadmap

AWS managed grafana fires a notification 3 times #55

Closed trinh-tien-dat closed 1 month ago

trinh-tien-dat commented 1 year ago

Hi Team, I'm using AWS managed Grafana service.

When I use notification policies to route my alerts to different contact points and an alert rule fires, I get 3 similar notifications at the same time.

I did some searching, and it seems to be a bug in the AWS Managed Grafana service.

Has anyone managed to fix this themselves? Or can anyone from AWS in this thread tell me why it happens and when it will be fixed?

Thank you!

rpractice commented 1 year ago

Yes, same for us as well. It seems like a bug in AMG.

trinh-tien-dat commented 1 year ago

@rpractice Hi, I'm now using Prometheus Alertmanager as an external Alertmanager, instead of Grafana's default Alertmanager, to take care of the duplicates. I will let you know the result in a couple of days.

rpractice commented 1 year ago

@trinh-tien-dat, did you manage to fix this issue using Prometheus?

trinh-tien-dat commented 1 year ago

Yes, it worked. I installed prometheus-alertmanager on an EC2 instance and configured it to route incoming alerts to AWS SNS; prometheus-alertmanager groups the alerts and de-duplicates them. Then, in the AWS Grafana service, I went to contact point --> alertmanager and set the prometheus-alertmanager URL and an account for authentication.

It works well, so we can use this while we wait for them to fix it.
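For readers wondering what "groups the alerts and de-duplicates them" amounts to, here is a minimal Python sketch of the idea. It is not Alertmanager's actual implementation; the `group_alerts` helper, the `group_by` default, and the sample alert labels are all hypothetical and only illustrate why three identical alerts can end up as one notification.

```python
from collections import defaultdict


def group_alerts(alerts, group_by=("alertname",)):
    """Collapse alerts with identical label sets, then bucket them by the group_by labels.

    Hypothetical helper for illustration only; not part of Alertmanager.
    """
    groups = defaultdict(dict)
    for alert in alerts:
        labels = alert["labels"]
        # Alerts that share these label values end up in the same notification.
        group_key = tuple(labels.get(name, "") for name in group_by)
        # Alerts with exactly the same labels are treated as the same alert.
        identity = frozenset(labels.items())
        groups[group_key][identity] = alert
    return {key: list(unique.values()) for key, unique in groups.items()}


# Three copies of the same alert, as described in this thread (sample data).
incoming = [
    {"labels": {"alertname": "HighCPU", "instance": "i-1"}},
    {"labels": {"alertname": "HighCPU", "instance": "i-1"}},
    {"labels": {"alertname": "HighCPU", "instance": "i-1"}},
]

grouped = group_alerts(incoming)
print({key: len(value) for key, value in grouped.items()})
```

With the sample data above, the three incoming copies collapse into a single alert in a single group, which is the behavior the workaround relies on.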

brc commented 1 year ago

Is this the behavior that's documented in the second bullet point here? https://docs.aws.amazon.com/grafana/latest/userguide/v9-alerts.html#v9-alert-limitations

trinh-tien-dat commented 1 year ago

@invsblduck Hi, I'm not sure if it relates to my issue.

chr2che commented 1 year ago

Is there an ETA for this to be fixed?

brc commented 1 year ago

@trinh-tien-dat Apologies that I phrased it as a question for you. This behavior is documented as a limitation at the link I shared:

> Alert rules defined in Grafana, rather than in Prometheus, send multiple notifications to your contact point.

This is also a duplicate of #47

trinh-tien-dat commented 1 year ago

There is an interesting thing. On the AWS Grafana service, when you define some rules and configure only 1 contact point for them (which is also the default contact point), there are no duplicate notifications and it works perfectly (I have 1 contact point, AWS SNS). Once you have more contact points (>= 2), you have to define notification policies to route the alerts to the right contact point, and that is where the duplication issue occurs. I tried this many times on my AWS Grafana service, and I can say:

1 contact point for all rules: works.
Multiple contact points, each handling some of the rules: you will have the issue.

Thanks! Dat.
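To make the two setups concrete, here is a simplified Python model of what notification-policy routing is supposed to do: pick exactly one contact point per alert. It does not reproduce the AMG bug and is not Grafana's implementation (real policies form a tree and have more options); the `route` function, the `team` label, and the contact point names are assumptions made up for this sketch.

```python
def route(alert_labels, policies, default_contact_point):
    """Return the contact point for an alert (simplified: first matching policy wins)."""
    for matchers, contact_point in policies:
        # A policy matches when every one of its label matchers equals the alert's label.
        if all(alert_labels.get(name) == value for name, value in matchers.items()):
            return contact_point
    # No policy matched, so the alert falls through to the default contact point.
    return default_contact_point


# Hypothetical policies: route by a "team" label to two SNS-backed contact points.
policies = [
    ({"team": "db"}, "sns-db"),
    ({"team": "web"}, "sns-web"),
]

db_target = route({"alertname": "DiskFull", "team": "db"}, policies, "sns-default")
fallback_target = route({"alertname": "HighCPU"}, policies, "sns-default")
print(db_target, fallback_target)
```

In this model every alert maps to one contact point, so a correctly working setup with several contact points should still produce one notification per alert.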

lorelei-rupp-imprivata commented 1 year ago

This seems like an unfortunate issue in AMG, and AMG should solve it. While they do call it out as a limitation, to me that is just a known bug. AMG shouldn't keep you from using the new alerting feature if you want to have your alerts in Grafana rather than Prometheus, especially if you use other data sources that are not Prometheus and want Grafana alerts for them. This really limits you from using the new alerting features in Grafana, which are far more powerful than the legacy ones.

0x416e746f6e commented 9 months ago

FWIW, we ended up writing our own Lambda (for Slack alerts) that does deduplication. Alerts arriving in triplets are not just ridiculous, they also encourage people to stop paying attention to them.

https://github.com/flashbots/prometheus-sns-lambda-slack
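For anyone considering a similar approach, below is a rough Python sketch of time-window deduplication. It is not the code from the repository linked above; the `Deduplicator` class, the 60-second window, and the alert fields (`labels`, `status`) are assumptions for illustration. Note that in-memory state like this is not shared between concurrent Lambda execution environments, so a real deployment would likely need an external store.

```python
import hashlib
import json
import time


class Deduplicator:
    """Suppress alerts whose fingerprint was already forwarded within a time window.

    Hypothetical sketch; state lives in memory only.
    """

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the window can be tested without sleeping
        self._last_forwarded = {}

    @staticmethod
    def fingerprint(alert):
        # Identical labels and status produce the same fingerprint.
        payload = json.dumps(
            {"labels": alert.get("labels", {}), "status": alert.get("status", "")},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def should_forward(self, alert):
        now = self.clock()
        key = self.fingerprint(alert)
        last = self._last_forwarded.get(key)
        if last is not None and now - last < self.window_seconds:
            return False  # duplicate inside the window: drop it
        self._last_forwarded[key] = now
        return True


dedup = Deduplicator(window_seconds=60)
alert = {"labels": {"alertname": "HighCPU"}, "status": "firing"}
# Simulate the same alert arriving three times in quick succession.
results = [dedup.should_forward(alert) for _ in range(3)]
print(results)
```

Only the first of the three identical alerts is forwarded here; a resolved notification has a different status, so it gets its own fingerprint and still goes through.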

webertrlz commented 9 months ago

+1 this is PITA.

MaGaudin commented 5 months ago

The bug we are talking about is still there, even with the new Managed Grafana 10.4 version. @mhausenblas, could you kindly tell us when it will be solved?

VermaPriyanka commented 5 months ago

@MaGaudin The bug fix is in progress and will be available in Managed Grafana version 10.4. See my response here

MaGaudin commented 5 months ago

Hi @VermaPriyanka. We updated to Grafana 10.4 but the bug is still there.

VermaPriyanka commented 5 months ago

As I mentioned, the work is still in progress. I will post an update here once it's rolled out.

MaGaudin commented 5 months ago

OK, thanks for clarifying. I hope the fix comes ASAP, as this is a very nasty bug that could push users to switch products. In the meantime, do you have any advice on workarounds? What can we do to mitigate this problem?

MaGaudin commented 5 months ago

> There is an interesting thing. On the AWS Grafana service, when you define some rules and configure only 1 contact point for them (which is also the default contact point), there are no duplicate notifications and it works perfectly (I have 1 contact point, AWS SNS). Once you have more contact points (>= 2), you have to define notification policies to route the alerts to the right contact point, and that is where the duplication issue occurs. I tried this many times on my AWS Grafana service, and I can say:
>
> 1 contact point for all rules: works.
> Multiple contact points, each handling some of the rules: you will have the issue.
>
> Thanks! Dat.

Unfortunately, with Grafana 10.4, this solution does not seem to work anymore. Has anyone else noticed the same?

lorelei-rupp-imprivata commented 5 months ago

> OK, thanks for clarifying. I hope the fix comes ASAP, as this is a very nasty bug that could push users to switch products. In the meantime, do you have any advice on workarounds? What can we do to mitigate this problem?

We have been waiting just about a year for a fix... and none yet

tanjilbhuiyan commented 2 months ago

There should be an update to Grafana 10.4 after September 14, 2024, that will resolve the duplicate alert issue. This update is only available for Amazon Managed Grafana version 10.4 workspaces. If you are running Grafana version 8.4 or 9.4, you must upgrade your workspace to Grafana version 10.4 to receive this update.

trinh-tien-dat commented 2 months ago

> There should be an update to Grafana 10.4 after September 14, 2024, that will resolve the duplicate alert issue. This update is only available for Amazon Managed Grafana version 10.4 workspaces. If you are running Grafana version 8.4 or 9.4, you must upgrade your workspace to Grafana version 10.4 to receive this update.

That's great news; I'll try this as soon as possible this month.

rpractice commented 2 months ago

Has anyone checked whether it works properly in the new version?

enriquegaldu commented 1 month ago

Yes, it is working now. Since 21 September, our 10.4 installation has fired only one notification per alert. At last!

rpractice commented 1 month ago

> Yes, it is working now. Since 21 September, our 10.4 installation has fired only one notification per alert. At last!

@VermaPriyanka, can you please confirm whether the new version has a fix for the duplicate notifications, so we can update to the latest version?

trinh-tien-dat commented 1 month ago

> > Yes, it is working now. Since 21 September, our 10.4 installation has fired only one notification per alert. At last!
>
> @VermaPriyanka, can you please confirm whether the new version has a fix for the duplicate notifications, so we can update to the latest version?

Please be careful: I upgraded from 9.4 to 10.4 and got a lot of issues. Now I am creating a new 10.4 workspace and migrating things from the old one to the new one manually.

My test was: I created a new 9.4 workspace, used this tool to clone the current workspace into it: https://github.com/aws-observability/amazon-managed-grafana-migrator, and then upgraded it from 9.4 to 10.4.

VermaPriyanka commented 1 month ago

Thank you all for your patience. Happy to share that the update to prevent multiple alert notifications being sent to your contact points from Grafana managed alert rules is now available on all Amazon Managed Grafana workspaces running Grafana version 10.4, across all AWS regions where Amazon Managed Grafana is generally available.

If you are running Grafana version 8.4 or 9.4, you must upgrade your workspace to Grafana version 10.4 to receive this update. For instructions on how to update your workspace(s), see Amazon Managed Grafana service documentation. We recommend testing the newer version in a non-production environment before updating a production workspace.

We know this has been a frustration for many of you, and we deeply appreciate your understanding and patience. Your feedback and continued support are invaluable in helping us improve, and we're committed to ensuring a smoother experience going forward.

If you have any questions or continue to experience issues, please reach out to us. Alternatively, if you prefer a more private channel, feel free to reach us via email at aws-grafana-feedback@amazon.com.

Public roadmap announcement

trinh-tien-dat commented 1 month ago

Hi Everyone,

10.4 has actually fixed the problem. I upgraded from 9.4 to 10.4, and I now receive 1 notification per alert. That's great!

Thank you to AWS Grafana Team.

FYI, the steps I took to upgrade from 9.4 to 10.4:

1- Create a new 10.4 workspace.
2- Use the amazon-managed-grafana-migrator tool to migrate between the workspaces.
3- The tool does not sync alerts (rules, contact points, and notification policies). I think that makes sense; you just need to re-create the alerts on the new workspace.

All good then!

NOTE: Please think twice before upgrading a workspace directly (in place) from 9.4 to 10.4; I tried that and it failed.

Thanks! Dat.

VermaPriyanka commented 1 month ago

Closed by https://github.com/aws/amazon-managed-grafana-roadmap/discussions/85