Open pasztorl opened 1 year ago
Looks like this may be due to a missing 'node' label on the node_cpu_seconds_total metric.
I was running into this on EKS, Kubernetes version 1.28.1, and I was able to fix it myself by adding this to the values of the kube-prometheus-stack Helm deployment:
    prometheus-node-exporter:
      prometheus:
        monitor:
          attachMetadata:
            node: true
          relabelings:
            - sourceLabels:
                - __meta_kubernetes_endpoint_node_name
              targetLabel: node
              action: replace
              regex: (.+)
              replacement: ${1}
I believe this bug actually lies in the prometheus-operator config reloader, since this is a default that should be included when the ServiceMonitor configuration is expanded. In any case, this works like a charm, and now 'kubectl top nodes' works with the prometheus adapter (which I had to install separately from the kube-prometheus-stack, despite the README saying that it's included).
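For anyone wondering what that relabeling rule actually does, here is a rough Python model of a Prometheus `replace` step. It is a simplified sketch, not Prometheus's real implementation: the function name `relabel_replace` and the example node name are made up for illustration, and only the `${1}` / `$1` forms of the replacement syntax are handled.

```python
import re


def relabel_replace(labels, source_labels, target_label,
                    regex="(.*)", replacement="$1", separator=";"):
    """Simplified model of a Prometheus `replace` relabeling step.

    Hypothetical helper for illustration only; it is not part of any library.
    """
    # Source label values are joined with the separator before matching.
    value = separator.join(labels.get(name, "") for name in source_labels)

    # Prometheus anchors the regex, so the whole value must match.
    match = re.fullmatch(regex, value)
    if match is None:
        # No match: the target's labels are left unchanged.
        return dict(labels)

    # Translate ${1} / $1 into Python's \g<1> group reference syntax.
    template = re.sub(
        r"\$\{(\w+)\}|\$(\w+)",
        lambda m: r"\g<%s>" % (m.group(1) or m.group(2)),
        replacement,
    )
    new_value = match.expand(template)

    result = dict(labels)
    if new_value:
        result[target_label] = new_value
    else:
        result.pop(target_label, None)
    return result


# Example target as discovered with node metadata attached.
# The node name is a made-up placeholder.
discovered = {
    "__meta_kubernetes_endpoint_node_name": "ip-10-0-0-1.ec2.internal",
    "job": "node-exporter",
}
with_node = relabel_replace(
    discovered,
    ["__meta_kubernetes_endpoint_node_name"],
    "node",
    regex="(.+)",
    replacement="${1}",
)

# Without the metadata label, (.+) does not match the empty string,
# so no 'node' label is added.
without_meta = relabel_replace(
    {"job": "node-exporter"},
    ["__meta_kubernetes_endpoint_node_name"],
    "node",
    regex="(.+)",
    replacement="${1}",
)
```

In this model `with_node` gains a 'node' label copied from the discovery metadata, while `without_meta` stays untouched, which matches the behaviour the values snippet relies on.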
@ryanobjc Thanks a lot for your answer! Had the same issue with EKS as well.
Describe the bug
I installed kube-prometheus-stack, then prometheus-adapter. After that, kubectl top pods works, but kubectl top node says "metrics not available yet".
in the log:
and also:
What's your helm version?
not related
What's your kubectl version?
not related
Which chart?
prometheus-adapter
What's the chart version?
4.2.0
What happened?
No response
What you expected to happen?
No response
How to reproduce it?
No response
Enter the changed values of values.yaml?
    rules:
      resource:
        cpu:
          containerQuery: |
            sum by (<<.GroupBy>>) (
              rate(container_cpu_usage_seconds_total{container!="",<<.LabelMatchers>>}[3m])
            )
          nodeQuery: |
            sum by (<<.GroupBy>>) (
              rate(node_cpu_seconds_total{mode!="idle",mode!="iowait",mode!="steal",<<.LabelMatchers>>}[3m])
            )
          resources:
            overrides:
              node:
                resource: node
              namespace:
                resource: namespace
              pod:
                resource: pod
          containerLabel: container
        memory:
          containerQuery: |
            sum by (<<.GroupBy>>) (
              avg_over_time(container_memory_working_set_bytes{container!="",<<.LabelMatchers>>}[3m])
            )
          nodeQuery: |
            sum by (<<.GroupBy>>) (
              avg_over_time(node_memory_MemTotal_bytes{<<.LabelMatchers>>}[3m])
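To see why the node queries in these values depend on a 'node' label, it can help to expand the CPU nodeQuery template by hand. The sketch below uses plain string substitution as a stand-in for the adapter's Go templating. The helper `render_query`, the node name, and the exact matcher the adapter would generate are all assumptions made for illustration.

```python
# CPU nodeQuery from the values above, written on one line.
NODE_CPU_QUERY = (
    'sum by (<<.GroupBy>>) ('
    'rate(node_cpu_seconds_total{mode!="idle",mode!="iowait",mode!="steal",'
    '<<.LabelMatchers>>}[3m]))'
)


def render_query(template, group_by, label_matchers):
    """Hypothetical stand-in for the adapter's template expansion."""
    return (
        template
        .replace("<<.GroupBy>>", group_by)
        .replace("<<.LabelMatchers>>", label_matchers)
    )


# Assumption: with the override `node: {resource: node}`, the adapter groups
# by the 'node' label and selects series by that same label.
# The node name is a made-up placeholder.
query = render_query(
    NODE_CPU_QUERY,
    group_by="node",
    label_matchers='node="ip-10-0-0-1.ec2.internal"',
)
print(query)
```

If node_cpu_seconds_total carries no 'node' label, a query of this shape selects no series, which would be consistent with kubectl top node reporting that metrics are not available.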
Enter the command that you execute and failing/misfunctioning.
installed with ansible helm module (not related)
Anything else we need to know?
No response