CatalystCode / kubemalt

Kubernetes guidance for MALT (Monitoring, Alerting, Logging, and Tracing)

Prometheus Node Health Alerting #41

Open dtzar opened 5 years ago

dtzar commented 5 years ago

As an operator of my K8s cluster agents, I'd like to ensure that the cluster is healthy and be notified/alerted if something is broken or about to break.

This could mean that Prometheus + Grafana provide alert rules and dashboards, with configurable notifications going to the social channel of choice.

There are already a ton of Prometheus alert rules here that can be used without the Prometheus operator, and that page also links to relevant Grafana dashboards: https://github.com/coreos/prometheus-operator/tree/master/contrib/kube-prometheus
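To make "broken or about to break" concrete, here is a minimal sketch of what a standalone Prometheus rule file for node health might look like. It is not taken from kube-prometheus. It assumes Prometheus 2.x rule-file syntax, and that kube-state-metrics and node_exporter are already being scraped. The group name, alert names, thresholds, and severity labels are illustrative choices.

```yaml
# node-health.rules.yml (hypothetical file name), loaded via `rule_files` in prometheus.yml
groups:
  - name: node-health            # illustrative group name
    rules:
      # "Something is broken": a node has not reported Ready for 5 minutes.
      # Assumes kube-state-metrics exposes kube_node_status_condition.
      - alert: NodeNotReady
        expr: kube_node_status_condition{condition="Ready",status="true"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Node {{ $labels.node }} is not Ready"

      # "About to break": based on the last 6h, the filesystem is predicted
      # to run out of space within 4h. Assumes node_exporter metrics
      # (older node_exporter releases name this metric node_filesystem_avail).
      - alert: NodeDiskWillFillSoon
        expr: predict_linear(node_filesystem_avail_bytes{fstype!="tmpfs"}[6h], 4 * 3600) < 0
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Filesystem {{ $labels.mountpoint }} on {{ $labels.instance }} is predicted to fill within 4 hours"
```

Rules like these only fire alerts. Routing them to a chat channel or other destination would be configured separately in Alertmanager, typically keyed on the `severity` label.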