sustainable-computing-io / kepler

Kepler (Kubernetes-based Efficient Power Level Exporter) uses eBPF to probe performance counters and other system stats, uses ML models to estimate workload energy consumption based on these stats, and exports the estimates as Prometheus metrics
https://sustainable-computing.io
Apache License 2.0

Kepler latest produces inconsistent process level values compared to v0.7.12 #1854

Open vprashar2929 opened 1 week ago

vprashar2929 commented 1 week ago

What happened?

When Kepler v0.7.12 and Kepler latest are each deployed independently on a bare-metal (BM) machine and stress is applied to the machine, Kepler latest reports inconsistent process-level values, whereas Kepler v0.7.12 always reports consistent values.

Screenshots attached for reference.

When Kepler v0.7.12 is deployed:

[Image: Grafana dashboard screenshot]

When Kepler built from the latest code base is deployed:

[Image: Grafana dashboard screenshot]

What did you expect to happen?

Process-level values reported by Kepler latest shouldn't deviate much from those reported by the stable release (v0.7.12).

How can we reproduce it (as minimally and precisely as possible)?

  1. Check out the v0.7.12 code base and the latest code base separately
  2. Deploy Kepler from each checkout using the dev compose manifests
  3. Compare kepler_process_<pkg|core|dram|other>_joules_total between dev and latest. Note: within a single checkout, dev and latest will point to the same Kepler version
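The comparison in step 3 could be scripted roughly as follows. This is only a sketch, not part of Kepler: it assumes you have saved the Prometheus text-format output scraped from each deployment into a string, and it builds the metric names from the `kepler_process_<pkg|core|dram|other>_joules_total` pattern written above (adjust the pattern if your build exposes different names). The helper names `sum_by_metric` and `relative_deviation` are hypothetical.

```python
import re
from collections import defaultdict

# Metric names follow the pattern quoted in this issue; this is an assumption,
# so adjust the alternation if your Kepler build uses different names.
SAMPLE_RE = re.compile(
    r'^(kepler_process_(?:pkg|core|dram|other)_joules_total)'
    r'(?:\{[^}]*\})?\s+(\S+)'
)


def sum_by_metric(exposition_text):
    """Sum all series of each matching metric in Prometheus text-format output.

    Simplification: label values containing '}' are not handled.
    """
    totals = defaultdict(float)
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match:
            totals[match.group(1)] += float(match.group(2))
    return dict(totals)


def relative_deviation(baseline, candidate):
    """Return (candidate - baseline) / baseline for metrics present in both."""
    deviations = {}
    for name, base in baseline.items():
        if name in candidate and base != 0:
            deviations[name] = (candidate[name] - base) / base
    return deviations
```

Feeding the v0.7.12 scrape in as `baseline` and the latest scrape as `candidate` gives one ratio per metric. Because these are cumulative counters, the two scrapes should cover the same stress window for the ratios to be meaningful.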

Anything else we need to know?

No response

Kepler image tag

v0.7.12, latest

Kubernetes version

```console
$ kubectl version
# paste output here
```

Cloud provider or bare metal

BM

OS version

```console
# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here
```

Install tools

Kepler deployment config

For Kubernetes:

```console
$ KEPLER_NAMESPACE=kepler

# provide kepler configmap
$ kubectl get configmap kepler-cfm -n ${KEPLER_NAMESPACE}
# paste output here

# provide kepler deployment description
$ kubectl describe deployment kepler-exporter -n ${KEPLER_NAMESPACE}
```

For standalone:

# put your Kepler command argument here

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)