packit / status

https://status.packit.dev website content
https://status.packit.dev
MIT License

Investigate if we can use PagerDuty or UptimeRobot instead of cState #157

Open Venefilyn opened 1 month ago

Venefilyn commented 1 month ago

Description

To get automatic monitoring and downtime notifications instead of relying on user reports and the like, a hosted solution would be better than a static page like cState.

cState itself is great and offers a very simple way of keeping statuses up to date, but it has no monitoring: outages are usually noticed by users first, and the status page is not updated for a while.

It might be possible for us to use UptimeRobot. I noticed that IBM already uses it according to the front page, so maybe we could use their enterprise account to save money (if we need a paid account). I know that internally at Red Hat we use PagerDuty, as well as Atlassian Statuspage for https://status.redhat.com

Benefit

Faster outage reporting, live notifications for the team on detected downtime (like PagerDuty)

Importance

Minor: changing would take a lot of time for little gain

Workaround

Participation

lbarcziova commented 1 month ago

Just one correction: we are already using UptimeRobot, and it is linked from the status page. Maybe we can investigate if we could extend it further, because currently it does monitor whether the service and dashboard are up, but nothing more, so it isn't really reflecting the general state of the whole service.

Venefilyn commented 1 month ago

> Maybe we can investigate if we could extend it further, because currently it does monitor whether the service and dashboard are up, but nothing more, so it isn't really reflecting the general state of the whole service.

That would be cool, especially if we could read the status pages of related services like Copr, Koji, and Testing Farm, and indicate that there may be disruptions in Packit too
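A rough sketch of that idea, assuming each upstream status has already been fetched and normalized into one of three hypothetical levels. The service names come from this thread; the `Level` enum, the `derive_packit_notice` function, and the rule that an upstream outage counts as at most "disrupted" for Packit are all illustrative assumptions, not an existing Packit, cState, or UptimeRobot API:

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical severity levels, ordered from best to worst."""
    OK = 0
    DISRUPTED = 1
    DOWN = 2


def derive_packit_notice(upstream: dict[str, Level]) -> tuple[Level, list[str]]:
    """Return the level Packit could advertise and the services causing it.

    Assumption: an upstream problem is shown as at most DISRUPTED for Packit,
    because Packit itself may still be reachable.
    """
    # Collect every dependency that is not fully healthy, in a stable order.
    affected = sorted(name for name, level in upstream.items() if level > Level.OK)
    if not affected:
        return Level.OK, []
    return Level.DISRUPTED, affected


if __name__ == "__main__":
    statuses = {"Copr": Level.OK, "Koji": Level.DOWN, "Testing Farm": Level.OK}
    level, causes = derive_packit_notice(statuses)
    print(level.name, causes)
```

How each upstream status would actually be read (scraping, a feed, or an API) is left open here, since that depends on what each service exposes.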

It would just be nice to have some sort of automation with the ability to create incidents or planned maintenance
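A minimal sketch of what such automation could generate if we stay on cState, where an incident is a Markdown file with front matter in the repository. The front matter keys below (`title`, `date`, `resolved`, `severity`, `affected`, `section`) reflect my understanding of cState's incident format and should be checked against its documentation; `render_incident` and the component name in the example are hypothetical:

```python
from datetime import datetime
import json


def render_incident(title, started, severity, affected, body):
    """Build the text of an unresolved incident file (assumed cState layout)."""
    lines = [
        "---",
        # json.dumps yields a double-quoted string, which is also valid YAML.
        f"title: {json.dumps(title)}",
        f"date: {started:%Y-%m-%d %H:%M:%S}",
        "resolved: false",
        f"severity: {severity}",
        "affected:",
        *[f"  - {json.dumps(name)}" for name in affected],
        "section: issue",
        "---",
        "",
        body,
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_incident(
        "Koji outage may delay builds",
        datetime(2024, 1, 2, 3, 4, 5),
        "disrupted",
        ["Packit Service"],  # hypothetical component name
        "Upstream Koji reports an outage; Packit builds may be delayed.",
    ))
```

A monitor webhook or scheduled job could write this text to a new file and open a pull request, so a human still confirms the incident before it is published.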