Closed by wendorf 1 year ago
@wendorf what about doing something like this:
kptop nodes --check-metrics
kptop pods --check-metrics
It should print a table with all the metrics used for pods or nodes (including the metrics used for graphs):
Metric | Status | Comment |
---|---|---|
Metric_1 | 🟢 available |  |
Metric_2 | 🟡 not_available |  ... |
Metric_3 | 🟢 available |  |
Update: I think this is better:
--verify-prometheus --check-metrics
# In addition to the Prometheus connection verification output,
# it will print a table of all the metrics that kptop uses and show which ones are missing.
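For illustration, the check described above could be sketched roughly as follows. This is not kptop's implementation: the function names are hypothetical, and the only external piece assumed is the Prometheus HTTP API endpoint `/api/v1/label/__name__/values`, which lists every metric name the server currently knows about. The list of required metrics passed in is up to the caller; kptop's real list lives in its source.

```python
import json
import urllib.request


def fetch_metric_names(base_url):
    """Return the set of metric names known to the Prometheus at base_url."""
    url = base_url.rstrip("/") + "/api/v1/label/__name__/values"
    with urllib.request.urlopen(url, timeout=10) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {payload}")
    return set(payload["data"])


def check_metrics(required, available):
    """Pair each required metric with True/False for availability, keeping order."""
    return [(name, name in available) for name in required]


def render_table(results):
    """Render the results in the table layout proposed in this thread."""
    lines = ["Metric | Status |", "---|---|"]
    for name, ok in results:
        status = "🟢 available" if ok else "🟡 not_available"
        lines.append(f"{name} | {status} |")
    return "\n".join(lines)
```

A caller would pass its own list of required metrics together with `fetch_metric_names("http://localhost:9090")` (the URL is an assumption) to `check_metrics`, then print the output of `render_table`.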
Done. `--verify-prometheus --check-metrics` will be available in the next release, v0.0.4.
Awesome! Thank you!
In the cluster I'm trying kptop with, we filter which cAdvisor metrics we scrape. To get kptop working correctly, I need to run kptop's commands, see which graphs populate and which don't, check the source code to see which metrics those graphs query, and then update the list of scraped metrics.
I would love it if `--verify-prometheus` returned a list of the metrics that were missing, so I could more easily update the allowlist in my scrape config.
It was easy to determine that I needed `machine_cpu_cores`, since `kptop nodes` fails noisily. However, for the dashboards, I need to dig through the logfile. It would be nice if all the failing queries were presented at once.
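As a rough sketch of the allowlist update described here: once the missing metric names are known, they can be merged into the existing list and turned into the regex for a `keep` rule under Prometheus's `metric_relabel_configs`. The helper names below are hypothetical, and the existing allowlist in the usage note is made up for illustration. Metric names only contain letters, digits, underscores, and colons, so no regex escaping is needed.

```python
def allowlist_regex(current, missing):
    """Merge the current allowlist with the missing metrics into one alternation."""
    names = sorted(set(current) | set(missing))
    return "|".join(names)


def relabel_snippet(regex):
    """Format a metric_relabel_configs 'keep' rule for the given regex."""
    return "\n".join([
        "metric_relabel_configs:",
        "  - source_labels: [__name__]",
        f"    regex: '{regex}'",
        "    action: keep",
    ])
```

For example, `relabel_snippet(allowlist_regex(["container_cpu_usage_seconds_total"], ["machine_cpu_cores"]))` yields a rule that keeps both metrics and drops everything else from that scrape job.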