aws / amazon-managed-grafana-roadmap

Amazon Managed Grafana Roadmap

AWS managed grafana fires a notification 3 times #55

Open trinh-tien-dat opened 12 months ago

trinh-tien-dat commented 12 months ago

Hi Team, I'm using AWS managed Grafana service.

When I use Notification policies to route my alerts to different contact points and an alert rule fires, I get 3 similar notifications at the same time.

I did some searching, and it seems to be a bug in the AWS Managed Grafana service.

Has anyone been able to fix this themselves? Or can anyone from AWS in this thread tell me why it happens and when it will be fixed?

Thank you!

rpractice commented 11 months ago

Yes, same for us as well. Seems like a bug in AMG.

trinh-tien-dat commented 11 months ago

@rpractice Hi, I'm trying Prometheus Alertmanager as an external Alertmanager, instead of Grafana's default Alertmanager, to take care of the duplication. I will let you know the result in a couple of days.

rpractice commented 11 months ago

@trinh-tien-dat , Did you manage to fix this issue using Prometheus?

trinh-tien-dat commented 11 months ago

Yes, it worked. I installed prometheus-alertmanager on an EC2 instance and configured it to route incoming alerts to AWS SNS; prometheus-alertmanager groups the alerts and de-duplicates them. Then, in the AWS Grafana service, I went to contact point --> alertmanager and set the prometheus-alertmanager URL and an account for authentication.

It works well, so we can use this while we wait for them to fix it.
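To illustrate why routing through an external Alertmanager hides the triplets, here is a minimal sketch of the grouping idea in Python. It is a simplified model, not Alertmanager's actual code: the function name `group_alerts`, the alert dictionary shape, and the example labels are all assumptions made for illustration. The sketch treats alerts with an identical label set as the same alert, then bundles the survivors into one notification per group.

```python
from collections import defaultdict


def group_alerts(alerts, group_by=("alertname",)):
    """Collapse incoming alerts into one notification per group.

    Hypothetical helper: a toy model of grouping plus de-duplication,
    not a copy of how Alertmanager is implemented.
    """
    groups = defaultdict(dict)
    for alert in alerts:
        labels = alert["labels"]
        # Alerts land in the same group when their group_by labels match.
        group_key = tuple(labels.get(name, "") for name in group_by)
        # Inside a group, an identical label set is treated as the same
        # alert, so repeated copies overwrite each other.
        identity = tuple(sorted(labels.items()))
        groups[group_key][identity] = alert
    return [
        {"group": dict(zip(group_by, key)), "alerts": list(members.values())}
        for key, members in groups.items()
    ]


# Three copies of one alert plus a second instance of the same rule.
incoming = [{"labels": {"alertname": "HighCPU", "instance": "i-1"}}] * 3
incoming.append({"labels": {"alertname": "HighCPU", "instance": "i-2"}})
notifications = group_alerts(incoming)
```

With this input, the three identical copies collapse into one alert, and both instances are delivered together in a single notification for the `HighCPU` group.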

brc commented 11 months ago

Is this the behavior that's documented in the second bullet point here? https://docs.aws.amazon.com/grafana/latest/userguide/v9-alerts.html#v9-alert-limitations

trinh-tien-dat commented 11 months ago

@invsblduck Hi, I'm not sure if it relates to my issue.

chr2che commented 11 months ago

Is there an ETA for this to be fixed?

brc commented 11 months ago

@trinh-tien-dat Apologies that I phrased it as a question for you. This behavior is documented as a limitation at the link I shared:

Alert rules defined in Grafana, rather than in Prometheus, send multiple notifications to your contact point.

This is also a duplicate of #47

trinh-tien-dat commented 11 months ago

There is an interesting thing. On the AWS Grafana service, when you define some rules and configure only 1 contact point for them (which is also the default contact point), there are no duplicate notifications and it works perfectly (I have 1 contact point, AWS SNS). Once you have more contact points (>= 2), you have to define Notification policies to route the alerts to the right contact point, and that is where the duplication issue occurs. I tried this many times on my AWS Grafana service, and I can say:

1 contact point for all rules: works.
Multiple contact points, each contact point for some of the rules: you will have the issue.

Thanks! Dat.

lorelei-rupp-imprivata commented 9 months ago

This seems like an unfortunate issue with AMG, and AMG should solve it. While they do call it out as expected, to me it is just a known bug. AMG shouldn't limit you from using the new alerting feature if you want to have your alerts in Grafana rather than Prometheus, especially since you may want Grafana alerts if you are using other data sources that are not Prometheus. This really keeps you from using the new alerting features in Grafana, which are way more powerful than the legacy ones.

0x416e746f6e commented 5 months ago

FWIW, we ended up writing our own Lambda (for Slack alerts) that does deduplication. Alerts that come in triplets are not just ridiculous, but also encourage people to stop paying attention to them.

https://github.com/flashbots/prometheus-sns-lambda-slack
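For readers who want a feel for what such a deduplicating Lambda might look like, here is a rough sketch in Python. It is not taken from the linked repository: the names `should_forward`, `fingerprint`, and `WINDOW_SECONDS`, the 60-second window, and the in-memory cache are all assumptions. The only AWS-specific detail used is the standard SNS event shape, where each record carries the text under `Records[].Sns.Message`.

```python
import hashlib
import time

# Hypothetical in-memory cache: fingerprint -> time the message was last forwarded.
# It only lives as long as one warm Lambda container, so copies handled by
# different containers would not be caught; a shared store would be needed
# for that.
_SEEN = {}
WINDOW_SECONDS = 60  # assumed suppression window


def fingerprint(message: str) -> str:
    """Identify a notification by the hash of its message body."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def should_forward(message, now=None, window=WINDOW_SECONDS):
    """Return True only for the first copy of a message seen within the window."""
    now = time.time() if now is None else now
    key = fingerprint(message)
    last = _SEEN.get(key)
    if last is not None and now - last < window:
        return False  # a copy was already forwarded recently
    _SEEN[key] = now
    return True


def handler(event, context):
    """Lambda entry point for an SNS-triggered function."""
    forwarded = []
    for record in event.get("Records", []):
        message = record["Sns"]["Message"]
        if should_forward(message):
            # A real function would post the message to Slack here.
            forwarded.append(message)
    return {"forwarded": len(forwarded)}
```

Hashing the whole message body only catches copies that are byte-for-byte identical. If the three notifications differ in a timestamp or ID, the fingerprint would have to be built from the stable fields instead.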

webertrlz commented 4 months ago

+1 this is PITA.

MaGaudin commented 3 weeks ago

The bug we are talking about is still there, even with the new Managed Grafana 10.4 version. @mhausenblas, are you able to tell us when it will be solved?

VermaPriyanka commented 3 weeks ago

@MaGaudin The bug fix is in progress and will be available in Managed Grafana version 10.4. See my response here

MaGaudin commented 3 weeks ago

Hi @VermaPriyanka. We updated to Grafana 10.4 but the bug is still there.

VermaPriyanka commented 3 weeks ago

Like I mentioned, the work is still in progress. I will post an update here once it's rolled out.

MaGaudin commented 3 weeks ago

OK, thanks for clarifying. I hope this comes ASAP, as it is a very nasty bug that could drive users to change products. In the meantime, do you have any advice on workarounds? What can we do to mitigate this problem?

MaGaudin commented 3 weeks ago

There is an interesting thing. On the AWS Grafana service, when you define some rules and configure only 1 contact point for them (which is also the default contact point), there are no duplicate notifications and it works perfectly (I have 1 contact point, AWS SNS). Once you have more contact points (>= 2), you have to define Notification policies to route the alerts to the right contact point, and that is where the duplication issue occurs. I tried this many times on my AWS Grafana service, and I can say: 1 contact point for all rules: works. Multiple contact points, each contact point for some of the rules: you will have the issue. Thanks! Dat.

Unfortunately, with Grafana 10.4, this solution does not seem to work anymore. Has anyone else noticed the same?

lorelei-rupp-imprivata commented 3 weeks ago

OK, thanks for clarifying. I hope this comes ASAP, as it is a very nasty bug that could drive users to change products. In the meantime, do you have any advice on workarounds? What can we do to mitigate this problem?

We have been waiting just about a year for a fix... and none yet