datamade / how-to

📚 Doing all sorts of things, the DataMade way
MIT License

Choose a service for basic health monitoring of sites/apps #125

Closed: hancush closed this issue 3 years ago

hancush commented 3 years ago

Background

Migrated from https://github.com/datamade/devops/issues/78.

Proposal

Pick a service that offers red/green status on site availability, and point it at our production sites.
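As a rough illustration of what a red/green availability check amounts to, here is a minimal Python sketch. It is not how any particular monitoring service works internally. The function names (`status_from_code`, `check_site`), the 10-second timeout, and the rule that 2xx/3xx responses count as green are all assumptions made for the example.

```python
import urllib.error
import urllib.request


def status_from_code(code):
    """Map an HTTP status code to a label: 2xx and 3xx count as green (assumed rule)."""
    return "green" if 200 <= code < 400 else "red"


def check_site(url, timeout=10, opener=urllib.request.urlopen):
    """Fetch url once and return "green" or "red".

    opener is injectable so the function can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return status_from_code(response.status)
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx; the status code is still on the error.
        return status_from_code(err.code)
    except (urllib.error.URLError, TimeoutError, OSError):
        # DNS failure, refused connection, or timeout: treat as unreachable.
        return "red"
```

A hosted service adds what this sketch leaves out: checks on a schedule, from infrastructure separate from the sites being monitored, with alert delivery.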

Deliverables

We've set a monitoring service up for our running apps.

Timeline

< 1 day

hancush commented 3 years ago

Heroku offers email alerts based on response time and failed requests.

[Screenshot, 2020-12-17, 3:53:50 PM]

[Screenshot, 2020-12-17, 3:55:35 PM]

Not sure we'd want our alerts so tightly coupled to our deployment infrastructure, though. (If our sites are down because Heroku is down, then presumably our alerts would be down as well.)

hancush commented 3 years ago

I looked for an uptime monitor in Sentry but didn't find one. UptimeRobot looks very straightforward: $7-8/month, depending on billing (annual vs. monthly), for 50 sites, with what looks like a nice, simple UI and a Slack integration. Panopta looks like it offers a lot more functionality than we need or would use.

hancush commented 3 years ago

I created a free account on UptimeRobot and moved the credentials into our team LastPass. I set up an HTTP monitor for app.dedupe.io and configured a Slack alert for up/down events. I'll switch that over to only down events once I can confirm the alerts are coming through as expected. If we decide we like UptimeRobot, I'll add monitors for our other sites.
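The difference between alerting on up/down events and alerting only on down events can be sketched as a small filter over state transitions. This is an illustrative Python sketch, not UptimeRobot's configuration or API. The names `should_alert` and `slack_payload` are hypothetical, and the message wording is made up. The only external detail relied on is that Slack incoming webhooks accept a JSON body with a `text` field.

```python
import json


def should_alert(previous, current, down_only=False):
    """Return True when a change from previous to current warrants an alert."""
    if previous == current:
        return False  # no transition, nothing to report
    if down_only:
        return current == "down"  # suppress recovery ("up") notifications
    return True


def slack_payload(site, current):
    """Build the JSON body for a Slack incoming-webhook message."""
    return json.dumps({"text": f"{site} is {current.upper()}"})
```

With `down_only=False`, both the outage and the recovery produce a message, which is useful while confirming that alerts arrive. Setting `down_only=True` keeps only the outage message.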

hancush commented 3 years ago

Added our dynamic sites on AWS and Heroku to UptimeRobot.