Open jharriga opened 6 months ago
@jharriga can you double check if it is kepler_node_core_joules_total or kepler_node_package_joules_total?
The current Ampere xgene hwmon driver reports only CPU and I/O power (per doc here), so we cannot get DRAM power. To align with the RAPL reporting, Kepler reports only kepler_node_core_total (per code here).
This was originally reported on x86. Running v0.7.10 on x86, I do see that the metric kepler_node_core_joules_total has a value:
root# curl localhost:8888/metrics | grep kepler_node_core_joules_total
kepler_node_core_joules_total{instance="nuc7",mode="dynamic",package="0",source="intel_rapl"} 39.07
kepler_node_core_joules_total{instance="nuc7",mode="idle",package="0",source="intel_rapl"} 61360.029
As for ARM, on Ampere server running v0.7.10 I see:
kepler_node_core_joules_total{instance="perf-arm-11.perf.eng.bos2.dc.redhat.com",mode="dynamic",package="0",source="intel_rapl"} 98036.551
kepler_node_package_joules_total{instance="perf-arm-11.perf.eng.bos2.dc.redhat.com",mode="dynamic",package="0",source="intel_rapl"} 98057.08
Both the kepler_node_core_joules_total and kepler_node_package_joules_total metrics have values. This doesn't seem to align with what you expected in the previous comment.
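To compare the two metrics side by side, the scrape output can be parsed with a few lines of Python. This is a minimal sketch and not part of Kepler: `parse_samples` is a hypothetical helper, and it assumes only the simple `name{labels} value` form shown above (no escaped quotes inside label values, no timestamps).

```python
import re

# Matches lines of the form: metric_name{label="value",...} 123.4
# Assumption: simple label values only, as in the output quoted above.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return a list of (name, labels_dict, value) for each sample line."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        m = SAMPLE_RE.match(line)
        if m is None:
            continue
        labels = dict(LABEL_RE.findall(m.group("labels")))
        samples.append((m.group("name"), labels, float(m.group("value"))))
    return samples


# The two ARM lines quoted in this comment.
arm_output = """\
kepler_node_core_joules_total{instance="perf-arm-11.perf.eng.bos2.dc.redhat.com",mode="dynamic",package="0",source="intel_rapl"} 98036.551
kepler_node_package_joules_total{instance="perf-arm-11.perf.eng.bos2.dc.redhat.com",mode="dynamic",package="0",source="intel_rapl"} 98057.08
"""

for name, labels, value in parse_samples(arm_output):
    print(name, labels["mode"], labels["source"], value)
```

Printing the `source` label next to each value makes it easy to see which power source Kepler attributes the reading to on each platform.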
At any rate I think this Issue can be CLOSED since the originally reported problem on x86 appears to have been resolved.
What happened?
Downloaded and installed
https://github.com/sustainable-computing-io/kepler/releases/download/v0.7.9/kepler.rpm.tar.gz
On server running
Ran several CPU-intensive workloads and the metric remained '0'.
What did you expect to happen?
Expected the metric reading to increase and track system CPU usage.
How can we reproduce it (as minimally and precisely as possible)?
Download & install rpm
Start service
root# systemctl start container-kepler --now
root# curl localhost:8888/metrics | grep
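The check can also be scripted by scraping twice, before and after a workload, and comparing the counter. The following is only a sketch under stated assumptions: `fetch_metrics`, `total_for`, and `is_stuck` are hypothetical helpers, the metric name is taken from the discussion in the comments (the grep pattern in the steps above is missing), and the two scrape strings are synthetic placeholders rather than captured output.

```python
import urllib.request


def fetch_metrics(url="http://localhost:8888/metrics"):
    """Fetch the raw metrics text (endpoint taken from the curl command above)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def total_for(text, metric):
    """Sum the values of all sample lines that belong to the given metric."""
    total = 0.0
    for line in text.splitlines():
        # Match "metric{...} value" or "metric value", not longer metric names.
        if line.startswith(metric + "{") or line.startswith(metric + " "):
            total += float(line.rsplit(None, 1)[1])
    return total


def is_stuck(before, after, metric):
    """True if the counter did not increase between two scrapes."""
    return total_for(after, metric) <= total_for(before, metric)


# Synthetic scrape texts for illustration only (not captured output).
# Assumption: the metric of interest is kepler_node_core_joules_total.
METRIC = "kepler_node_core_joules_total"
scrape_1 = 'kepler_node_core_joules_total{mode="dynamic",package="0"} 0\n'
scrape_2 = 'kepler_node_core_joules_total{mode="dynamic",package="0"} 0\n'
print(is_stuck(scrape_1, scrape_2, METRIC))
```

On a live system, the two strings would instead come from calling `fetch_metrics()` once before and once after running a CPU-intensive workload.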
Anything else we need to know?
No response
Kepler image tag
Kubernetes version
Cloud provider or bare metal
OS version
Install tools
Kepler deployment config
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)