Open Venefilyn opened 1 month ago
Just one correction: we are already using UptimeRobot, and it is linked from the status page. Maybe we can investigate whether we could extend it further, because currently it only monitors whether the service and dashboard are up, so it doesn't really reflect the general state of the whole service.
It would be cool, especially if we could read the status pages of related services like Copr, Koji, and Testing Farm, and indicate that there may be disruptions in Packit too.
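A rough sketch of how that could work, assuming we already have some way of fetching each upstream status (not shown here, since each service exposes it differently). The `Status` levels and the `derive_packit_status` function are hypothetical names made up for illustration, not anything that exists in Packit today:

```python
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical status levels, ordered so a higher value is a worse state."""
    OPERATIONAL = 0
    DEGRADED = 1
    DOWN = 2


def derive_packit_status(own: Status, upstream: dict[str, Status]) -> tuple[Status, list[str]]:
    """Return the status to display plus the upstream services that have problems."""
    troubled = sorted(name for name, status in upstream.items()
                      if status != Status.OPERATIONAL)
    # An upstream problem only means disruptions are possible in Packit,
    # so it raises the displayed status to at most DEGRADED.
    displayed = max(own, Status.DEGRADED) if troubled else own
    return displayed, troubled
```

With this, a Copr outage while Packit itself is healthy would show Packit as degraded and list Copr as the reason, rather than claiming Packit is down.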
It would just be nice to have some sort of automation with the ability to create incidents or planned maintenance
Description
To get automatic monitoring and notifications for downtime instead of relying on user reports and the like, a hosted solution would be better than a static page like cState.
cState itself is great and offers a very simple way of keeping statuses up to date, but it has no monitoring. Outages are usually noticed by users first, and the status page is not updated for a while.
It might be possible for us to utilize UptimeRobot, and I noticed that IBM already uses it according to the front page, so maybe we could use their enterprise account to save money (if we need a paid account). I know that internally at Red Hat we use PagerDuty as well as Atlassian Statuspage for https://status.redhat.com.
Benefit
Faster outage reporting and live notifications for the team when downtime is detected (like PagerDuty)
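To illustrate what "detected downtime" could mean in practice, here is a minimal sketch that alerts only after several consecutive failed checks, so a single blip does not page anyone. `DowntimeDetector` and its threshold are hypothetical and not tied to UptimeRobot or PagerDuty; a hosted service would do this for us:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DowntimeDetector:
    """Hypothetical helper: turns a stream of check results into alert events."""
    threshold: int = 3      # consecutive failures needed before alerting
    failures: int = 0
    alerted: bool = False

    def record(self, check_ok: bool) -> Optional[str]:
        """Return "down" or "recovered" when the state changes, otherwise None."""
        if check_ok:
            self.failures = 0
            if self.alerted:
                self.alerted = False
                return "recovered"
            return None
        self.failures += 1
        if self.failures >= self.threshold and not self.alerted:
            self.alerted = True
            return "down"
        return None
```

Each returned event is where a notification to the team would be sent.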
Importance
Minor: the change would take a lot of time for little gain
Workaround
Participation