eslam-gomaa / kptop

A Python tool that offers beautiful CLI monitoring based on Prometheus metrics, with Kubernetes integration through PodPortForward
https://eslam-gomaa.github.io/kptop/
GNU General Public License v3.0
277 stars, 11 forks

`--verify-prometheus` should check metrics availability #15

Closed: wendorf closed this issue 1 year ago

wendorf commented 1 year ago

In the cluster I'm trying kptop with, we are filtering which cadvisor metrics we scrape. To get kptop working correctly, I need to run kptop's commands, see what graphs populate and don't populate, check the source code to see which metrics the graphs are looking at, then update the list of scraped metrics.

I would love it if --verify-prometheus returned a list of the metrics that are missing, so I could more easily update the allowlist in my scrape config.

It was easy to determine that I needed machine_cpu_cores, since kptop nodes fails noisily:

# kptop nodes
No nodes found
Query did not return any data: machine_cpu_cores

However, for the dashboards, I need to dig through the logfile. It would be nice if all the failing queries were presented at once.
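The request above, reporting every missing metric in one pass, could look roughly like the following. This is a minimal sketch and not kptop's implementation: the function names `query_prometheus` and `find_missing_metrics` are hypothetical, and it assumes Prometheus is reachable through its standard HTTP API endpoint `/api/v1/query`. Apart from machine_cpu_cores, the metric names in the usage comment are only examples of cAdvisor metrics, not a confirmed list of what kptop queries.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterable, List


def query_prometheus(base_url: str, promql: str, timeout: float = 10.0) -> List[dict]:
    """Run an instant query via the Prometheus HTTP API and return the result list."""
    params = urllib.parse.urlencode({"query": promql})
    url = f"{base_url.rstrip('/')}/api/v1/query?{params}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    return payload["data"]["result"]


def find_missing_metrics(
    metrics: Iterable[str],
    run_query: Callable[[str], List[dict]],
) -> List[str]:
    """Return every metric whose query comes back empty, instead of stopping at the first."""
    missing = []
    for name in metrics:
        if not run_query(name):  # an empty result list means no series were scraped
            missing.append(name)
    return missing


# Example usage (assumes a Prometheus server at this hypothetical address):
#
#   from functools import partial
#   run = partial(query_prometheus, "http://localhost:9090")
#   print(find_missing_metrics(
#       ["machine_cpu_cores", "container_memory_working_set_bytes"], run))
```

Passing the query function in as a parameter keeps the check independent of how Prometheus is reached (direct URL, port-forward, and so on), and makes it easy to try against canned responses.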

eslam-gomaa commented 1 year ago

@wendorf what about doing something like this?

kptop nodes --check-metrics

kptop pods --check-metrics

It should print a table with all the metrics used for pods or nodes (including the metrics used for graphs):

Metric    Status            Comment
Metric_1  🟢 available
Metric_2  🟡 not_available  ...
Metric_3  🟢 available
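A table like the one proposed here could be rendered from a simple name-to-availability mapping. The sketch below is only an illustration of the idea, not the code that shipped in kptop: `render_metrics_table` is a hypothetical helper, and it omits the Comment column for brevity.

```python
from typing import Dict


def render_metrics_table(availability: Dict[str, bool]) -> str:
    """Format metric availability as an aligned two-column plain-text table."""
    rows = [
        (name, "🟢 available" if ok else "🟡 not_available")
        for name, ok in availability.items()
    ]
    # Pad the first column to the longest metric name (or the header, if longer).
    width = max([len("Metric")] + [len(name) for name, _ in rows])
    lines = [f"{'Metric'.ljust(width)}  Status"]
    lines += [f"{name.ljust(width)}  {status}" for name, status in rows]
    return "\n".join(lines)


# Example usage with placeholder metric names:
#   print(render_metrics_table({"Metric_1": True, "Metric_2": False}))
```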

update:

I think this is better

--verify-prometheus --check-metrics 
# In addition to the Prometheus connection verification output,
# it will print a table of all the metrics that kptop uses and show which metrics are missing.

eslam-gomaa commented 1 year ago

Done. --verify-prometheus --check-metrics will be available in the next release, "v0.0.4".

16

wendorf commented 1 year ago

Awesome! Thank you!