As a developer
I want to receive alerts on app metrics via PagerDuty and Slack
So that I can respond appropriately
(This card is the first step in setting up alerts: it sets up one initial alert with the two alerting systems. After this, #152 covers the other metrics we want to alert on.)
Acceptance Criteria
GIVEN an alert notifies the team about a concern
WHEN the developer receives the notification via PagerDuty
THEN they are informed enough to act on the notification.
GIVEN an alert notifies the team about a concern
THEN an alert is sent via Slack.
Checklist:
[x] DataDog Set Up
[x] Create a dashboard with the initial metric, using Terraform
[x] Configure initial alert: measure whether the health check is returning success or failure. On failure, it should send an alert to both PagerDuty and Slack
[x] PagerDuty
[x] Get access from VA - contact DOTS team
[x] Set up team and escalations of who should be notified (not setting up on-call schedule at this time)
[x] All team members add their notification preference (email/text)
~For now, we will set up alerts with CloudWatch and move to DataDog later.~ Since we want to move to DataDog eventually and have the alerts all in one place, we should set them up in DataDog now instead of putting work into CloudWatch first and migrating.
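A minimal Terraform sketch of what the initial health check alert could look like, not the team's actual configuration. It assumes the health endpoint is polled by Datadog's HTTP check and that the PagerDuty and Slack integrations are already installed in Datadog. The instance tag `app_health_check` and the handles `@pagerduty-example-service` and `@slack-example-alerts` are hypothetical placeholders.

```hcl
# Sketch only: the instance tag and notification handles are hypothetical.
terraform {
  required_providers {
    datadog = {
      source = "DataDog/datadog"
    }
  }
}

# Credentials are read from the DD_API_KEY and DD_APP_KEY environment variables.
provider "datadog" {}

resource "datadog_monitor" "health_check" {
  name = "Health check failing"
  type = "service check"

  # Assumes the health endpoint is polled by Datadog's HTTP check
  # under the hypothetical instance name "app_health_check".
  query = "\"http.can_connect\".over(\"instance:app_health_check\").by(\"instance\").last(3).count_by_status()"

  # The @-handles route the same notification to PagerDuty and Slack.
  message = <<-EOT
    {{#is_alert}}The app health check is returning failures.{{/is_alert}}
    {{#is_recovery}}The app health check is passing again.{{/is_recovery}}
    @pagerduty-example-service @slack-example-alerts
  EOT

  monitor_thresholds {
    critical = 2 # alert after 2 consecutive failed checks
    ok       = 1 # recover after 1 successful check
  }
}
```

Putting both handles in one message keeps the PagerDuty and Slack notifications in sync. The message text is also where context for the acceptance criteria can go, such as what failed and where to look first.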
Additional Info/Resources
Research from spike #151 (in a Google doc), explaining the reasoning behind each alert and how to set it up.
Assumptions
Additional Info/Resources
vanotify-team repo
Out of Scope
Open Questions