Closed by StevenBarre 1 year ago
PR on cerberus: https://github.com/bcgov/platform-services-sre/pull/21
CCM PR: https://github.com/bcgov-c/platform-gitops-gen/pull/717
Sysdig dashboard: can be found here
CPU and memory alerts will send notifications to rc: https://app.sysdigcloud.com/#/alerts/rules?filter=kyverno&direction=desc&sortBy=modifiedOn
Describe the issue
I discovered that the Kyverno pods in Silver were crashlooping, but I had not been notified.
What is the Value/Impact?
Awareness of when this critical service is not healthy.
What is the plan? How will this get completed?
Can we move this to openshift-bcgov-kyverno and get the free Alertmanager rules? Should we set up some other kind of monitoring?
Identify any dependencies
Collab between Jason and AdvSol Ops
Definition of done
Kyverno is properly monitored in all clusters it's deployed in.
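As a rough illustration of the "some other kind of monitoring" option raised in the plan, here is a minimal sketch of a crashloop check. It is not what the linked PRs implement. It assumes the input is the parsed JSON from `kubectl get pods -o json` for whichever namespace Kyverno runs in, and the function name `crashlooping_containers` is hypothetical. It relies only on the standard Kubernetes pod status fields (`status.containerStatuses[].state.waiting.reason` and `restartCount`).

```python
def crashlooping_containers(pod_list):
    """Return (pod, container, restart_count) tuples for containers that are
    currently waiting with reason CrashLoopBackOff.

    pod_list is assumed to be the dict parsed from `kubectl get pods -o json`
    (a Kubernetes List object with an "items" array).
    """
    found = []
    for pod in pod_list.get("items", []):
        pod_name = pod.get("metadata", {}).get("name", "<unknown>")
        # containerStatuses can be absent, e.g. for pods not yet scheduled.
        statuses = pod.get("status", {}).get("containerStatuses") or []
        for status in statuses:
            waiting = (status.get("state") or {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                found.append(
                    (pod_name, status.get("name"), status.get("restartCount", 0))
                )
    return found
```

A scheduled job could load the `kubectl` output with `json.load` and pass it to this function, then post a notification when the returned list is non-empty. Cluster-native alert rules, as proposed in the plan, would avoid running a separate check like this at all.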