arachnys / cabot

Self-hosted, easily-deployable monitoring and alerts service - like a lightweight PagerDuty
MIT License
5.59k stars 594 forks source link

Question about the cabot-alert-email plugin #576

Open jchoksi-whitbread opened 6 years ago

jchoksi-whitbread commented 6 years ago

We configured Cabot to use AWS's SES service for sending e-mail.

AWS's SES service has a hard limit of a maximum of 50 recipients for every message you send.

We recently added a service to Cabot which had quite a few Cabot users setup to be alerted. Say around 30 people.

The single HTTP check in this service, was configured to have an importance of CRITICAL.

When this service triggered an alert, an e-mail failed to be sent out.

We saw a SMTPDataError: (554, 'Transaction failed: Recipient count exceeds 50.') error in Cabot's worker logs.

We couldn't see how the e-mail was being sent out to >=50 recipients until we came across the following code in: https://github.com/arachnys/cabot-alert-email/blob/master/cabot_alert_email/models.py#L39

emails += [u.email for u in users if u.email]

That code seems to be duplicating the list of email addresses already gathered in line 29.

Could you please confirm whether line 39 is doing something in error?

Edit: FYI. we worked around the issue by setting the HTTP check's importance to be ERROR.

dbuxton commented 6 years ago

Looks like a simple error. I have no idea what that line is meant to be doing! @frankh ?

JeanFred commented 6 years ago

This looks like it was introduced in https://github.com/cabotapp/cabot-alert-email/commit/2d499fd22381aa72bb1ec669b4bcb9f973091bf4 − it does make more sense to be adding the duty officers.

P.S. I’ll be deleting the arachys/cabot-alert-email fork as the cabotapp one should be the source of truth, and it’s confusing

JeanFred commented 6 years ago

I submitted https://github.com/cabotapp/cabot-alert-email/pull/5 which should fix it

JeanFred commented 6 years ago

Additionally, the plugin should be deduplicating redundant emails by using a set − this would have mitigated the issue @jchoksi-whitbread encountered. Will also push a fix.

whysthatso commented 5 years ago

@jchoksi-whitbread are you still experiencing this issue? i am also considering aws and would like to understand if this can be closed.