Open dleard opened 3 years ago
Experimented with alerts in the form of:
count by(kube_namespace_name, kube_pod_name)
(changes(kube_pod_status_ready{condition="true"}[30m])) >= 5
and tried to trigger it with a pod that goes into CrashLoopBackOff by design (a container that runs exit 1
as its command).
This didn't work.
cc @dleard
We should have an alert that monitors pod restarts and fires once a few restarts happen within a set time window, to warn us of a possible back-off restart loop.
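One possible reason the experiment above stayed silent: count by (...) returns the number of series in each group, not the number of changes, so with one matching series per pod it would evaluate to 1 and never reach 5. A container that exits immediately also likely never becomes ready, so the ready condition may not change at all. A rough sketch of an alternative is to alert on the restart counter directly. This assumes kube_pod_container_status_restarts_total (from kube-state-metrics) is available in this environment and carries the same kube_namespace_name and kube_pod_name labels as the query above. The 30m window and the threshold of 5 are simply carried over from that experiment.

```promql
# Assumption: the kube-state-metrics restart counter is scraped and labelled
# with kube_namespace_name / kube_pod_name like the metric in the query above.
# sum (not count) adds up the per-container restart increases for each pod.
sum by (kube_namespace_name, kube_pod_name) (
  increase(kube_pod_container_status_restarts_total[30m])
) >= 5
```

Note that increase() extrapolates over the window, so the value can be fractional. The threshold and window would probably need tuning against a pod that is known to crash-loop.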