sustainable-computing-io / kepler

Kepler (Kubernetes-based Efficient Power Level Exporter) uses eBPF to probe performance counters and other system stats, uses ML models to estimate workload energy consumption based on these stats, and exports them as Prometheus metrics.
https://sustainable-computing.io
Apache License 2.0

idle power calculations for container metrics don't look right #1194

Open novacain1 opened 6 months ago

novacain1 commented 6 months ago

What happened?

I have a workload that runs on the realtime kernel and previously used cgroup metrics in Kepler for estimating energy consumption. With that workload back in May of this year, I was seeing around 73 W for my energy-hungry cluster and 48.6 W for my energy-efficient cluster. The lower wattage was because I was using energy-saving features like C-states, P-states, and per-pod power management. The 73 W cluster was running wide open (no C-states, max frequency, and all cores set to idle=poll, which uses far more energy). Kepler was very helpful for understanding how much energy I was using at the container level.

With the latest version on the realtime kernel, I wasn't even getting eBPF metrics (https://github.com/sustainable-computing-io/kepler/issues/1175); that has since been fixed with an image from https://github.com/sustainable-computing-io/kepler/pull/1185.

Using the same workload from May with the latest pr-1185 image, I see unexpected idle energy consumption for the workload namespace: 12.3 W in my energy-hungry cluster and 47.6 W in my energy-efficient cluster. This does not make sense to me.

Idle power

```console
$ kubectl exec -ti -n openshift-user-workload-monitoring prometheus-user-workload-0 -- bash -c 'curl "localhost:9090/api/v1/query?query=kepler_container_package_joules_total[30s]"' | jq -r '.data.result[] | [.metric.pod_name, .metric.container_name, .metric.container_namespace, .metric.mode, .metric.namespace, .values[0][0], (.values[0][1]|tonumber)] | @csv' | sort -k 7 -g -t, | tail -10
"node-exporter-278v2","node-exporter","openshift-monitoring","idle","kepler",1705960722.598,34316.274
"sriov-fec-controller-manager-665dd7559c-qqgfh","manager","vran-acceleration-operators","idle","kepler",1705960722.598,36471.069
"endpoint-observability-operator-844fb7844-wbtdf","endpoint-observability-operator","open-cluster-management-addon-observability","idle","kepler",1705960722.598,37860.504
"metrics-collector-deployment-6fb4467fb5-cdjgp","metrics-collector","open-cluster-management-addon-observability","idle","kepler",1705960722.598,95436.36
"uwl-metrics-collector-deployment-5fc6d78687-cxghp","metrics-collector","open-cluster-management-addon-observability","idle","kepler",1705960722.598,97777.317
"sriov-fec-daemonset-pjstd","sriov-fec-daemon","vran-acceleration-operators","idle","kepler",1705960722.598,108082.143
"kepler-exporter-7krg2","kepler-exporter","kepler","idle","kepler",1705960722.598,113516.889
"flexran-binary-release-7fb6596b4d-cn292","flexran-l1app","flexranl1","dynamic","kepler",1705960722.598,235365.978
"kernel_processes","kernel_processes","kernel","dynamic","kepler",1705960722.598,407252.856
"kernel_processes","kernel_processes","kernel","idle","kepler",1705960722.598,1415208.507
```

Dynamic power

```console
$ kubectl exec -ti -n openshift-user-workload-monitoring prometheus-user-workload-0 -- bash -c 'curl "localhost:9090/api/v1/query?query=kepler_container_package_joules_total[30s]"' | jq -r '.data.result[] | [.metric.pod_name, .metric.container_name, .metric.container_namespace, .metric.mode, .metric.namespace, .values[0][0], (.values[0][1]|tonumber)] | @csv' | sort -k 7 -g -t, | grep -v idle | tail -10
"apiserver-767bc658c8-xnhxf","oauth-apiserver","openshift-oauth-apiserver","dynamic","kepler",1705960752.598,167.538
"metrics-collector-deployment-6fb4467fb5-cdjgp","metrics-collector","open-cluster-management-addon-observability","dynamic","kepler",1705960752.598,185.478
"prometheus-k8s-0","prometheus-proxy","openshift-monitoring","dynamic","kepler",1705960752.598,186.642
"uwl-metrics-collector-deployment-5fc6d78687-cxghp","metrics-collector","open-cluster-management-addon-observability","dynamic","kepler",1705960752.598,254.625
"sriov-network-config-daemon-8nwd6","sriov-network-config-daemon","openshift-sriov-network-operator","dynamic","kepler",1705960752.598,259.461
"cluster-version-operator-59f77f8cbd-26jdh","cluster-version-operator","openshift-cluster-version","dynamic","kepler",1705960752.598,274.317
"kepler-exporter-7krg2","kepler-exporter","kepler","dynamic","kepler",1705960752.598,766.962
"kube-apiserver-gnb.flexran2.cars2.lab","kube-apiserver","openshift-kube-apiserver","dynamic","kepler",1705960752.598,6047.577
"flexran-binary-release-7fb6596b4d-cn292","flexran-l1app","flexranl1","dynamic","kepler",1705960752.598,235648.833
"kernel_processes","kernel_processes","kernel","dynamic","kepler",1705960752.598,407756.142
```

Output captured from Kepler in June 2023: image

Output captured from Kepler in January 2024. My concern is the container metrics that appear at the bottom of this picture captured from my Grafana instance: image

The metrics being used here are kepler_container_package_joules_total and kepler_container_dram_joules_total.
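For reference, since these are cumulative joule counters, an average wattage can be derived in PromQL by taking a per-second rate over a window longer than the scrape interval (rate() of joules per second gives watts). A rough sketch of such a query; the 5m window and per-namespace grouping here are just examples, not what my dashboard uses:

```console
# Average dynamic package power in watts per namespace over the last 5 minutes.
$ curl -s 'localhost:9090/api/v1/query' \
    --data-urlencode 'query=sum by (container_namespace) (rate(kepler_container_package_joules_total{mode="dynamic"}[5m]))'
```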

Thanks in advance for your help.

What did you expect to happen?

Power usage to be roughly the same as what the older Kepler version reported back in June.

How can we reproduce it (as minimally and precisely as possible)?

I have been working with @rootfs on this, but conceivably one could deploy OpenShift using cgroups v1, enable the realtime kernel with the PerformanceProfile CR, and see this behavior.

Anything else we need to know?

No response

Kepler image tag

quay.io/sustainable_computing_io/kepler:pr-1185

Kubernetes version

```console
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.4", GitCommit:"33a7a8bcccdd1c7c0e2f51609d832d31232d2f26", GitTreeState:"clean", BuildDate:"2023-12-13T20:37:53Z", GoVersion:"go1.20.10 X:strictfipsruntime", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.9+e36e183", GitCommit:"04c94a756a2e2e0eab90f1fed560b464544044dc", GitTreeState:"clean", BuildDate:"2024-01-12T04:36:55Z", GoVersion:"go1.20.12 X:strictfipsruntime", Compiler:"gc", Platform:"linux/amd64"}
```

Cloud provider or bare metal

baremetal

OS version

```console
# On Linux:
$ cat /etc/os-release
NAME="Red Hat Enterprise Linux CoreOS"
ID="rhcos"
ID_LIKE="rhel fedora"
VERSION="414.92.202401121330-0"
VERSION_ID="4.14"
VARIANT="CoreOS"
VARIANT_ID=coreos
PLATFORM_ID="platform:el9"
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 414.92.202401121330-0 (Plow)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:9::coreos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://docs.openshift.com/container-platform/4.14/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
REDHAT_BUGZILLA_PRODUCT_VERSION="4.14"
REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
REDHAT_SUPPORT_PRODUCT_VERSION="4.14"
OPENSHIFT_VERSION="4.14"
RHEL_VERSION="9.2"
OSTREE_VERSION="414.92.202401121330-0"

$ uname -a
Linux gnb.flexran2.cars2.lab 5.14.0-284.48.1.rt14.333.el9_2.x86_64 #1 SMP PREEMPT_RT Thu Jan 4 13:55:53 EST 2024 x86_64 x86_64 x86_64 GNU/Linux
```

Install tools

Kepler deployment config

For on kubernetes:

```console
$ KEPLER_NAMESPACE=kepler

# provide kepler configmap
$ kubectl get configmap kepler-cfm -n ${KEPLER_NAMESPACE}
# paste output here

# provide kepler deployment description
$ kubectl describe deployment kepler-exporter -n ${KEPLER_NAMESPACE}
```

For standalone:

```console
# put your Kepler command argument here
```

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

marceloamaral commented 6 months ago

Hi @novacain1, just to understand the issue better:

novacain1 commented 6 months ago

Hi @marceloamaral thank you for your help. Answers to your questions:

With that query, I get only a marginal difference in energy usage at 1m, nothing substantial. I suspect I am using a longer scrape interval in Prometheus (OpenShift Observability Add-On), so a 30s range actually returns no data.
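For anyone else hitting this: a range selector only returns samples whose timestamps fall inside the window, so with a roughly 1m scrape interval a [30s] range can legitimately come back empty. A window of at least a couple of scrape intervals avoids that; the 2m value below is just an example:

```console
# With a ~1m scrape interval, use a range of at least 2m so the window
# contains two or more samples.
$ curl -s 'localhost:9090/api/v1/query' \
    --data-urlencode 'query=kepler_container_package_joules_total{mode="idle"}[2m]'
```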

rootfs commented 6 months ago

> This is a question for @rootfs: is the image using the code that was refactored?

Yes, this image has the fix from #1185.

rootfs commented 6 months ago

Regarding the dashboard, the green one is the efficient node and the red one is the power-hungry node. The node metrics look right (the green one is ~29% more efficient than the red one), but the pod-level metrics are not: the green one uses more power than the red one.

rootfs commented 6 months ago

@marceloamaral this needs some thinking; the idle power formula uses the CPU time metric, which we know has some problems:

https://github.com/sustainable-computing-io/kepler/blob/main/pkg/collector/stats/node_stats.go#L64-L68

```go
func (ne *NodeStats) UpdateIdleEnergyWithMinValue(isComponentsSystemCollectionSupported bool) {
    // gpu metric
    if config.EnabledGPU && gpu.IsGPUCollectionSupported() {
        ne.CalcIdleEnergy(config.AbsEnergyInGPU, config.IdleEnergyInGPU, config.GPUSMUtilization)
    }

    if isComponentsSystemCollectionSupported {
        ne.CalcIdleEnergy(config.AbsEnergyInCore, config.IdleEnergyInCore, config.CPUTime)
        ne.CalcIdleEnergy(config.AbsEnergyInDRAM, config.IdleEnergyInDRAM, config.CPUTime) // TODO: we should use another resource for DRAM
        ne.CalcIdleEnergy(config.AbsEnergyInUnCore, config.IdleEnergyInUnCore, config.CPUTime)
        ne.CalcIdleEnergy(config.AbsEnergyInPkg, config.IdleEnergyInPkg, config.CPUTime)
        ne.CalcIdleEnergy(config.AbsEnergyInPlatform, config.IdleEnergyInPlatform, config.CPUTime)
    }
}
```
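
To illustrate the concern with a simplified sketch (this is a toy model of proportional attribution by CPU time, not the actual CalcIdleEnergy code): if node idle energy is split across containers in proportion to their CPU time, a realtime container that busy-polls (cores set to idle=poll) accumulates a huge CPU-time share and therefore absorbs most of the idle energy, regardless of how efficient the node actually is.

```go
package main

import "fmt"

// containerIdleShare splits a node's idle energy (joules) across containers
// in proportion to each container's CPU time. Toy model only, to show how a
// busy-polling container dominates the split.
func containerIdleShare(nodeIdleJoules float64, cpuTime map[string]float64) map[string]float64 {
    var total float64
    for _, t := range cpuTime {
        total += t
    }
    shares := make(map[string]float64, len(cpuTime))
    if total == 0 {
        return shares
    }
    for name, t := range cpuTime {
        shares[name] = nodeIdleJoules * t / total
    }
    return shares
}

func main() {
    // Hypothetical numbers: a polling realtime container racks up far more
    // CPU time than everything else combined.
    cpuTime := map[string]float64{
        "flexran-l1app":   36000,
        "kepler-exporter": 120,
        "node-exporter":   60,
    }
    fmt.Println(containerIdleShare(50000, cpuTime))
}
```
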
rootfs commented 6 months ago

For reference, the image used last June included this commit that disabled idle power.