bcgov / DITP-DevOps

Digital Identity and Trust Program Team's DevOps Documentation Repository
Apache License 2.0

Monitoring and Tooling Review #184

Open esune opened 6 months ago

esune commented 6 months ago

After the recent outages of our services (VC-AuthN), we need to re-assess whether we are monitoring everything required to proactively prevent this type of situation, and where that monitoring needs to happen.

In particular, items we are interested in keeping an eye on are:

Additionally, we want to assess whether the list of services for which we track uptime/availability is complete, or whether we need to add further endpoints to our Uptime dashboards.

Desired outcomes:

Related issues: