There's something wrong with how we calculate the mean time to restore for individual issues. I'm not sure why, but sometimes an issue gets skipped.
Here's a readout from my Prometheus:
min by (issue_number, service) (min_over_time(failure_resolution_timestamp{app=~".*pelorus-api.*"}[2d] @ 1710475200)) - min by (issue_number, service) (min_over_time(failure_creation_timestamp{app=~".*pelorus-api.*"}[2d] @ 1710475200))
OpenShift version
Not related to OpenShift
Problem description
These two queries should yield the same number of results, but they do not.
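One way to narrow this down, offered as a sketch rather than a confirmed diagnosis: PromQL arithmetic between two instant vectors only returns results for label sets present on both sides, so an (issue_number, service) pair missing from either side drops out of the subtraction. The query below reuses the selectors from the expression above with the `unless` operator to list pairs that have a creation timestamp but no resolution timestamp in the same window. Swapping the two operands lists the opposite case.

```promql
# Pairs with a creation timestamp but no matching resolution timestamp
# in the same 2d window (selectors copied from the expression above).
min by (issue_number, service) (
  min_over_time(failure_creation_timestamp{app=~".*pelorus-api.*"}[2d] @ 1710475200)
)
unless
min by (issue_number, service) (
  min_over_time(failure_resolution_timestamp{app=~".*pelorus-api.*"}[2d] @ 1710475200)
)
```

If this returns any series, those are the issues that the subtraction silently leaves out.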
Steps to reproduce
Current behavior
See above
Expected behavior
See above