nautobot / nautobot-app-circuit-maintenance

Circuit Maintenance App for Nautobot
https://docs.nautobot.com/projects/circuit-maintenance/en/latest/
Apache License 2.0

Add Ability to Track Outages #226

Open jdrew82 opened 2 years ago

jdrew82 commented 2 years ago

Environment

Proposed Functionality

It'd be helpful if we could have a historical record of outages for individual circuits. We could even include a trend for outages in a given time period. This would be helpful for SLA purposes.

Use Case

To know whether an SLA is being met, it helps to know which outages occurred in a given time period. A historical record is also useful for troubleshooting.

chadell commented 2 years ago

Could you define "outage"? What do you understand by an outage?

jdrew82 commented 2 years ago

I believe the thinking was that any circuit maintenance notification that's received could count as an outage. Basically, an outage is any time a circuit goes down. If we wanted to narrow it down, we could add some logic to check the notification for a reason and, if it says something like "line cut" or the like, count that as an outage. What this would entail definitely needs some fleshing out.
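To make the narrowing idea concrete, here is a minimal sketch of such a keyword check. It is plain Python, not the app's code: the `Notification` class, its fields, and the keyword list are all hypothetical, and only "line cut" comes from the discussion above.

```python
from dataclasses import dataclass

# Hypothetical keyword list. Only "line cut" is taken from the discussion;
# the other entries are assumptions, and real provider wording will vary.
OUTAGE_KEYWORDS = ("line cut", "fiber cut", "outage")


@dataclass
class Notification:
    """Hypothetical stand-in for a parsed maintenance notification."""

    circuit_id: str
    summary: str


def looks_like_outage(notification: Notification) -> bool:
    """Return True if the summary mentions any of the outage keywords."""
    text = notification.summary.lower()
    return any(keyword in text for keyword in OUTAGE_KEYWORDS)
```

A plain substring match like this would miss reasons phrased differently, so it is only a starting point for deciding what should count as an outage.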

chadell commented 2 years ago

Got it.

The database already holds the full history of Circuit Maintenance, connected to each Circuit via a CircuitImpact, so the information is available. It is also exposed via Prometheus metrics, so you could track it in a TSDB.

I think that the trend would be something we could connect via the TSDB, but we could add a Job that creates a report of this information, grouped by Circuit, Circuit Type, Provider, Site, etc.
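As a rough illustration of the grouping such a report might do, here is a sketch in plain Python. It does not use the Nautobot ORM or Jobs API: `ImpactRecord` and `count_by` are hypothetical names, and the records stand in for whatever the Job would read from the maintenance history.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ImpactRecord:
    """Hypothetical flattened row: one maintenance impact on one circuit."""

    circuit_id: str
    provider: str
    start: datetime


def count_by(records, field, since, until):
    """Count records whose start falls in [since, until), grouped by `field`."""
    return Counter(
        getattr(record, field)
        for record in records
        if since <= record.start < until
    )
```

Calling `count_by(records, "provider", since, until)` would group the same data by provider instead; the resulting counts per period are the kind of values that could feed a trend graph.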

How does that sound?

jdrew82 commented 2 years ago

It could work if it's in the style that Golden Config and Device Lifecycle have.

scetron commented 1 year ago

That should be doable, as both should show up simply as counts fed to a graph.