Open jdrew82 opened 2 years ago
Could you define "outage"? What do you understand by an outage?
I believe the thinking was that any circuit maintenance notification that's received could count as an outage. Basically, an outage is any time a circuit goes down. If we wanted to narrow it down, we could add some logic to check the notification for a reason, and if it says something like "line cut", count that as an outage. What this would entail definitely needs fleshing out.
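As a rough sketch of that narrowing logic, assuming the notification's reason is available as plain text: the function name and the keyword list below are hypothetical ("line cut" is the only phrase taken from this thread), and none of this is existing plugin code.

```python
# Hypothetical keyword list; only "line cut" comes from the discussion above.
OUTAGE_KEYWORDS = ("line cut", "fiber cut", "outage")


def looks_like_outage(reason: str, keywords=OUTAGE_KEYWORDS) -> bool:
    """Return True if the free-text reason mentions any outage keyword."""
    text = reason.lower()
    return any(keyword in text for keyword in keywords)
```

A plain substring match is crude, so the keyword list would probably need tuning per provider.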
You have in the database the full history of Circuit Maintenance, connected to each Circuit via a CircuitImpact. So, the information is available. It is also exposed via Prometheus metrics, so you could track it in a TSDB.
I think that the trend would be something we could connect via the TSDB, but we could add a Job that creates a report of this information, grouped by Circuit, Circuit Type, Provider, Site, etc.
How does it sound?
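To make the report idea concrete, here is a minimal sketch of the grouping step, assuming the impact history has already been loaded into simple records. `ImpactRecord`, its fields, and `count_impacts` are hypothetical names for illustration, not the plugin's models or an existing Job.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ImpactRecord:
    """Hypothetical flattened view of one maintenance impact on a circuit."""
    circuit: str
    provider: str
    site: str
    start: datetime


def count_impacts(records, group_by, since, until):
    """Count impacts starting in [since, until), grouped by the given field."""
    counts = Counter()
    for record in records:
        if since <= record.start < until:
            counts[getattr(record, group_by)] += 1
    return dict(counts)
```

Calling it with `group_by="provider"` or `group_by="site"` over the same records would give the per-provider or per-site counts, which are just numbers that could be fed to a graph.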
It could work if it's in the style that Golden Config and Device Lifecycle have.
That should be doable, since both simply show counts fed to a graph.
Environment
Proposed Functionality
It'd be helpful if we could have a historical record of outages for individual circuits. We could even include a trend for outages in a given time period. This would be helpful for SLA purposes.
Use Case
To know whether an SLA is being met, it's helpful to know which outages occurred in a given time period. A historical record is also helpful for troubleshooting.