intel / cluster-management-toolkit

Toolkit for managing and monitoring Kubernetes clusters; includes a Curses-based console UI as well as a few command-line tools.
MIT License

Integrate health monitoring for nodes #8

Open taotriad opened 1 year ago

taotriad commented 1 year ago

The cluster overview should include a heatmap of node health (temperature, load, etc.). This data needs to be collected out-of-band, i.e. independently of Kubernetes, because when debugging it is important to know whether a node failure is caused by the node itself or by Kubernetes.
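To make the request more concrete, here is a minimal sketch of how per-node samples could be reduced to heatmap cells. Everything in it is an assumption for illustration: `NodeHealth`, `heat_level`, `render_heatmap`, the thresholds, and the glyphs are hypothetical and not part of the toolkit, and the sketch deliberately leaves open how the samples are gathered out-of-band.

```python
from dataclasses import dataclass


@dataclass
class NodeHealth:
    """Hypothetical out-of-band sample for one node (not a toolkit type)."""
    name: str
    temperature_c: float
    load_1m: float
    cpu_count: int


def heat_level(sample, temp_warn=70.0, temp_crit=85.0):
    """Return 0 (ok), 1 (warning) or 2 (critical) for one node.

    The thresholds are placeholder values chosen for illustration.
    """
    # Normalise load by CPU count so large and small nodes are comparable.
    load_ratio = sample.load_1m / max(sample.cpu_count, 1)
    if sample.temperature_c >= temp_crit or load_ratio >= 2.0:
        return 2
    if sample.temperature_c >= temp_warn or load_ratio >= 1.0:
        return 1
    return 0


# One character per node; a Curses UI would more likely use color pairs.
GLYPHS = {0: ".", 1: "o", 2: "#"}


def render_heatmap(samples, width=8):
    """Lay the nodes out as rows of at most `width` cells."""
    cells = [GLYPHS[heat_level(s)] for s in samples]
    rows = ["".join(cells[i:i + width]) for i in range(0, len(cells), width)]
    return "\n".join(rows)
```

Since the levels are computed only from the samples passed in, a node can show as unhealthy here even while Kubernetes reports it as ready, or the reverse, which is the distinction the request is after.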