sustainable-computing-io / kepler

Kepler (Kubernetes-based Efficient Power Level Exporter) uses eBPF to probe performance counters and other system stats, uses ML models to estimate workload energy consumption based on these stats, and exports them as Prometheus metrics.
https://sustainable-computing.io
Apache License 2.0

idle power calculations for container metrics don't look right #1194

Open novacain1 opened 6 months ago

novacain1 commented 6 months ago

What happened?

I have a workload that runs on the realtime kernel and previously used cgroup metrics in Kepler for estimating energy consumption. With that workload back in May of this year, I was seeing around 73 W for my energy-hungry cluster and 48.6 W for my energy-efficient cluster. The lower wattage was because I was using energy-saving features like C-states, P-states, and per-pod power management. The 73 W cluster was running wide open (no C-states, max frequency, and all cores set to idle=poll, which uses far more energy). Kepler was very helpful for understanding how much energy I was using at the container level.

With the latest version on the realtime kernel, I wasn't even getting eBPF metrics (https://github.com/sustainable-computing-io/kepler/issues/1175); that has since been fixed with an image from https://github.com/sustainable-computing-io/kepler/pull/1185.

Using the same workload from May with the latest pr-1185 image, I see unexpected idle energy consumption for the workload namespace: 12.3 W in my energy-hungry cluster and 47.6 W in my energy-efficient cluster. This does not make sense to me.

Idle power

```console
$ kubectl exec -ti -n openshift-user-workload-monitoring prometheus-user-workload-0 -- bash -c 'curl "localhost:9090/api/v1/query?query=kepler_container_package_joules_total[30s]"' | jq -r '.data.result[] | [.metric.pod_name, .metric.container_name, .metric.container_namespace, .metric.mode, .metric.namespace, .values[0][0], (.values[0][1]|tonumber)] | @csv' | sort -k 7 -g -t, | tail -10
"node-exporter-278v2","node-exporter","openshift-monitoring","idle","kepler",1705960722.598,34316.274
"sriov-fec-controller-manager-665dd7559c-qqgfh","manager","vran-acceleration-operators","idle","kepler",1705960722.598,36471.069
"endpoint-observability-operator-844fb7844-wbtdf","endpoint-observability-operator","open-cluster-management-addon-observability","idle","kepler",1705960722.598,37860.504
"metrics-collector-deployment-6fb4467fb5-cdjgp","metrics-collector","open-cluster-management-addon-observability","idle","kepler",1705960722.598,95436.36
"uwl-metrics-collector-deployment-5fc6d78687-cxghp","metrics-collector","open-cluster-management-addon-observability","idle","kepler",1705960722.598,97777.317
"sriov-fec-daemonset-pjstd","sriov-fec-daemon","vran-acceleration-operators","idle","kepler",1705960722.598,108082.143
"kepler-exporter-7krg2","kepler-exporter","kepler","idle","kepler",1705960722.598,113516.889
"flexran-binary-release-7fb6596b4d-cn292","flexran-l1app","flexranl1","dynamic","kepler",1705960722.598,235365.978
"kernel_processes","kernel_processes","kernel","dynamic","kepler",1705960722.598,407252.856
"kernel_processes","kernel_processes","kernel","idle","kepler",1705960722.598,1415208.507
```

Dynamic power

```console
$ kubectl exec -ti -n openshift-user-workload-monitoring prometheus-user-workload-0 -- bash -c 'curl "localhost:9090/api/v1/query?query=kepler_container_package_joules_total[30s]"' | jq -r '.data.result[] | [.metric.pod_name, .metric.container_name, .metric.container_namespace, .metric.mode, .metric.namespace, .values[0][0], (.values[0][1]|tonumber)] | @csv' | sort -k 7 -g -t, | grep -v idle | tail -10
"apiserver-767bc658c8-xnhxf","oauth-apiserver","openshift-oauth-apiserver","dynamic","kepler",1705960752.598,167.538
"metrics-collector-deployment-6fb4467fb5-cdjgp","metrics-collector","open-cluster-management-addon-observability","dynamic","kepler",1705960752.598,185.478
"prometheus-k8s-0","prometheus-proxy","openshift-monitoring","dynamic","kepler",1705960752.598,186.642
"uwl-metrics-collector-deployment-5fc6d78687-cxghp","metrics-collector","open-cluster-management-addon-observability","dynamic","kepler",1705960752.598,254.625
"sriov-network-config-daemon-8nwd6","sriov-network-config-daemon","openshift-sriov-network-operator","dynamic","kepler",1705960752.598,259.461
"cluster-version-operator-59f77f8cbd-26jdh","cluster-version-operator","openshift-cluster-version","dynamic","kepler",1705960752.598,274.317
"kepler-exporter-7krg2","kepler-exporter","kepler","dynamic","kepler",1705960752.598,766.962
"kube-apiserver-gnb.flexran2.cars2.lab","kube-apiserver","openshift-kube-apiserver","dynamic","kepler",1705960752.598,6047.577
"flexran-binary-release-7fb6596b4d-cn292","flexran-l1app","flexranl1","dynamic","kepler",1705960752.598,235648.833
"kernel_processes","kernel_processes","kernel","dynamic","kepler",1705960752.598,407756.142
```

Output captured from Kepler in June 2023: image

Output captured from Kepler in January 2024. My concern is the container metrics that appear at the bottom of this picture captured from my Grafana instance: image

The metrics being used here are kepler_container_package_joules_total and kepler_container_dram_joules_total.
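For reference, since these are cumulative joule counters, an average wattage can be derived in PromQL by taking a per-second rate over a window longer than the scrape interval (rate() of joules per second gives watts). A rough sketch of such a query; the 5m window and per-namespace grouping here are just examples, not what my dashboard uses:

```console
# Average dynamic package power in watts per namespace over the last 5 minutes.
$ curl -s 'localhost:9090/api/v1/query' \
    --data-urlencode 'query=sum by (container_namespace) (rate(kepler_container_package_joules_total{mode="dynamic"}[5m]))'
```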

Thanks in advance for your help.

What did you expect to happen?

Power usage to be roughly the same as what the older Kepler version reported back in June.

How can we reproduce it (as minimally and precisely as possible)?

I have been working with @rootfs on this, but conceivably one could deploy OpenShift using cgroups v1, enable the realtime kernel with the PerformanceProfile CR, and see this behavior.

Anything else we need to know?

No response

Kepler image tag

quay.io/sustainable_computing_io/kepler:pr-1185

Kubernetes version

```console
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.4", GitCommit:"33a7a8bcccdd1c7c0e2f51609d832d31232d2f26", GitTreeState:"clean", BuildDate:"2023-12-13T20:37:53Z", GoVersion:"go1.20.10 X:strictfipsruntime", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.9+e36e183", GitCommit:"04c94a756a2e2e0eab90f1fed560b464544044dc", GitTreeState:"clean", BuildDate:"2024-01-12T04:36:55Z", GoVersion:"go1.20.12 X:strictfipsruntime", Compiler:"gc", Platform:"linux/amd64"}
```

Cloud provider or bare metal

baremetal

OS version

```console
# On Linux:
$ cat /etc/os-release
NAME="Red Hat Enterprise Linux CoreOS"
ID="rhcos"
ID_LIKE="rhel fedora"
VERSION="414.92.202401121330-0"
VERSION_ID="4.14"
VARIANT="CoreOS"
VARIANT_ID=coreos
PLATFORM_ID="platform:el9"
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 414.92.202401121330-0 (Plow)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:9::coreos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://docs.openshift.com/container-platform/4.14/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
REDHAT_BUGZILLA_PRODUCT_VERSION="4.14"
REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
REDHAT_SUPPORT_PRODUCT_VERSION="4.14"
OPENSHIFT_VERSION="4.14"
RHEL_VERSION="9.2"
OSTREE_VERSION="414.92.202401121330-0"

$ uname -a
Linux gnb.flexran2.cars2.lab 5.14.0-284.48.1.rt14.333.el9_2.x86_64 #1 SMP PREEMPT_RT Thu Jan 4 13:55:53 EST 2024 x86_64 x86_64 x86_64 GNU/Linux
```

Install tools

Kepler deployment config

For on kubernetes:

```console
$ KEPLER_NAMESPACE=kepler

# provide kepler configmap
$ kubectl get configmap kepler-cfm -n ${KEPLER_NAMESPACE}
# paste output here

# provide kepler deployment description
$ kubectl describe deployment kepler-exporter -n ${KEPLER_NAMESPACE}
```

For standalone:

```console
# put your Kepler command argument here
```

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

marceloamaral commented 6 months ago

Hi @novacain1, just to understand the issue better:

novacain1 commented 6 months ago

Hi @marceloamaral thank you for your help. Answers to your questions:

With that query, I get only a marginal difference in energy usage at 1m, nothing substantial. I suspect I am using a longer scrape interval in Prometheus (OpenShift Observability Add-On), so a 30s range actually returns no data.
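For anyone else hitting this: a range selector only returns samples whose timestamps fall inside the window, so with a roughly 1m scrape interval a [30s] range can legitimately come back empty. A window of at least a couple of scrape intervals avoids that; the 2m value below is just an example:

```console
# With a ~1m scrape interval, use a range of at least 2m so the window
# contains two or more samples.
$ curl -s 'localhost:9090/api/v1/query' \
    --data-urlencode 'query=kepler_container_package_joules_total{mode="idle"}[2m]'
```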

rootfs commented 6 months ago

> This is a question for @rootfs: is the image using the code that was refactored?

Yes, this image has the fix from #1185.

rootfs commented 6 months ago

Regarding the dashboard, the green one is the efficient node and the red one is the power-hungry node. The node metrics look right (the green one is ~29% more efficient than the red one), but the pod-level metrics are not: the green one uses more power than the red one.

rootfs commented 6 months ago

@marceloamaral this needs some thinking; the idle power formula uses the CPU time metric, which we know has some problems:

https://github.com/sustainable-computing-io/kepler/blob/main/pkg/collector/stats/node_stats.go#L64-L68

```go
func (ne *NodeStats) UpdateIdleEnergyWithMinValue(isComponentsSystemCollectionSupported bool) {
    // gpu metric
    if config.EnabledGPU && gpu.IsGPUCollectionSupported() {
        ne.CalcIdleEnergy(config.AbsEnergyInGPU, config.IdleEnergyInGPU, config.GPUSMUtilization)
    }

    if isComponentsSystemCollectionSupported {
        ne.CalcIdleEnergy(config.AbsEnergyInCore, config.IdleEnergyInCore, config.CPUTime)
        ne.CalcIdleEnergy(config.AbsEnergyInDRAM, config.IdleEnergyInDRAM, config.CPUTime) // TODO: we should use another resource for DRAM
        ne.CalcIdleEnergy(config.AbsEnergyInUnCore, config.IdleEnergyInUnCore, config.CPUTime)
        ne.CalcIdleEnergy(config.AbsEnergyInPkg, config.IdleEnergyInPkg, config.CPUTime)
        ne.CalcIdleEnergy(config.AbsEnergyInPlatform, config.IdleEnergyInPlatform, config.CPUTime)
    }
}
```
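
To illustrate the concern with a simplified sketch (this is a toy model of proportional attribution by CPU time, not the actual CalcIdleEnergy code): if node idle energy is split across containers in proportion to their CPU time, a realtime container that busy-polls (cores set to idle=poll) accumulates a huge CPU-time share and therefore absorbs most of the idle energy, regardless of how efficient the node actually is.

```go
package main

import "fmt"

// containerIdleShare splits a node's idle energy (joules) across containers
// in proportion to each container's CPU time. Toy model only, to show how a
// busy-polling container dominates the split.
func containerIdleShare(nodeIdleJoules float64, cpuTime map[string]float64) map[string]float64 {
    var total float64
    for _, t := range cpuTime {
        total += t
    }
    shares := make(map[string]float64, len(cpuTime))
    if total == 0 {
        return shares
    }
    for name, t := range cpuTime {
        shares[name] = nodeIdleJoules * t / total
    }
    return shares
}

func main() {
    // Hypothetical numbers: a polling realtime container racks up far more
    // CPU time than everything else combined.
    cpuTime := map[string]float64{
        "flexran-l1app":   36000,
        "kepler-exporter": 120,
        "node-exporter":   60,
    }
    fmt.Println(containerIdleShare(50000, cpuTime))
}
```
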
rootfs commented 6 months ago

For reference, the image used last June included this commit that disabled idle power.