sustainable-computing-io / kepler

Kepler (Kubernetes-based Efficient Power Level Exporter) uses eBPF to probe performance counters and other system stats, uses ML models to estimate workload energy consumption based on these stats, and exports the estimates as Prometheus metrics
https://sustainable-computing.io
Apache License 2.0

Kepler not handling multiple cgroupIDs of a process. #1813

Open vimalk78 opened 1 month ago

vimalk78 commented 1 month ago

What happened?

A user-space process can be associated with multiple cgroups. For example, a VM started with qemu can have a different cgroup for each vCPU thread.

Threads (sub-processes) of the VM process:

```console
vimal vimal $ ps -T -p 12152
    PID    SPID TTY          TIME CMD
  12152   12152 ?        16:21:42 qemu-system-x86
  12152   12164 ?        00:00:00 qemu-system-x86
  12152   12167 ?        00:00:01 IO mon_iothread
  12152   12168 ?        03:48:57 CPU 0/KVM
  12152   12169 ?        04:00:23 CPU 1/KVM
  12152   12170 ?        04:31:59 CPU 2/KVM
  12152   12171 ?        03:54:04 CPU 3/KVM
  12152   12173 ?        00:00:00 vnc_worker
  12152 1488968 ?        00:00:00 worker
  12152 1489155 ?        00:00:00 worker
  12152 1489156 ?        00:00:00 worker
  12152 1489157 ?        00:00:00 worker
  12152 1489158 ?        00:00:00 worker
  12152 1489159 ?        00:00:00 worker
  12152 1489160 ?        00:00:00 worker
  12152 1489161 ?        00:00:00 worker
  12152 1489162 ?        00:00:00 worker
  12152 1489163 ?        00:00:00 worker
  12152 1489164 ?        00:00:00 worker
  12152 1489165 ?        00:00:00 worker
  12152 1489166 ?        00:00:00 worker
  12152 1489167 ?        00:00:00 worker
  12152 1489168 ?        00:00:00 worker
  12152 1489169 ?        00:00:00 worker
  12152 1489170 ?        00:00:00 worker
  12152 1489171 ?        00:00:00 worker
```

Cgroups associated with each thread:

```console
vimal vimal $ for task in /proc/12152/task/* ; do echo "task: $task"; cat $task/cgroup || true; done
task: /proc/12152/task/12152
0::/machine.slice/machine-qemu\x2d2\x2dbeaker\x2dvm.scope/libvirt/emulator
task: /proc/12152/task/12164
0::/machine.slice/machine-qemu\x2d2\x2dbeaker\x2dvm.scope/libvirt/emulator
task: /proc/12152/task/12167
0::/machine.slice/machine-qemu\x2d2\x2dbeaker\x2dvm.scope/libvirt/emulator
task: /proc/12152/task/12168
0::/machine.slice/machine-qemu\x2d2\x2dbeaker\x2dvm.scope/libvirt/vcpu0
task: /proc/12152/task/12169
0::/machine.slice/machine-qemu\x2d2\x2dbeaker\x2dvm.scope/libvirt/vcpu1
task: /proc/12152/task/12170
0::/machine.slice/machine-qemu\x2d2\x2dbeaker\x2dvm.scope/libvirt/vcpu2
task: /proc/12152/task/12171
0::/machine.slice/machine-qemu\x2d2\x2dbeaker\x2dvm.scope/libvirt/vcpu3
task: /proc/12152/task/12173
0::/machine.slice/machine-qemu\x2d2\x2dbeaker\x2dvm.scope/libvirt/emulator
task: /proc/12152/task/1488968
0::/machine.slice/machine-qemu\x2d2\x2dbeaker\x2dvm.scope/libvirt/emulator
```
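
The same per-thread check can be done from Go. This is a minimal sketch, not Kepler code: the names `parseCgroupV2Path` and `threadCgroups` are hypothetical, and it assumes a cgroup v2 host where each `/proc/<pid>/task/<tid>/cgroup` file contains a `0::<path>` line, as in the output above. It groups thread IDs by cgroup path, so a process whose threads span several cgroups shows up with more than one key.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
	"strconv"
	"strings"
)

// parseCgroupV2Path extracts the path from the cgroup v2 entry ("0::<path>")
// in the contents of a /proc/<pid>/task/<tid>/cgroup file.
// It returns "" if no v2 entry is present.
func parseCgroupV2Path(content string) string {
	for _, line := range strings.Split(content, "\n") {
		if strings.HasPrefix(line, "0::") {
			return strings.TrimPrefix(line, "0::")
		}
	}
	return ""
}

// threadCgroups maps each distinct cgroup path to the thread IDs found in it.
// procRoot is normally "/proc"; it is a parameter (an assumption of this
// sketch) so the function can be pointed at a test directory.
func threadCgroups(procRoot string, pid int) (map[string][]int, error) {
	pattern := filepath.Join(procRoot, strconv.Itoa(pid), "task", "*", "cgroup")
	files, err := filepath.Glob(pattern)
	if err != nil {
		return nil, err
	}
	result := make(map[string][]int)
	for _, f := range files {
		data, err := os.ReadFile(f)
		if err != nil {
			continue // the thread may have exited between Glob and ReadFile
		}
		// The parent directory of the cgroup file is named after the thread ID.
		tid, err := strconv.Atoi(filepath.Base(filepath.Dir(f)))
		if err != nil {
			continue
		}
		path := parseCgroupV2Path(string(data))
		result[path] = append(result[path], tid)
	}
	return result, nil
}

func main() {
	pid := os.Getpid() // inspect this process unless a PID is given
	if len(os.Args) > 1 {
		if p, err := strconv.Atoi(os.Args[1]); err == nil {
			pid = p
		}
	}
	groups, err := threadCgroups("/proc", pid)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	paths := make([]string, 0, len(groups))
	for p := range groups {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		fmt.Printf("%s: %v\n", p, groups[p])
	}
}
```

Running it with the qemu PID as the first argument should print one line per distinct cgroup path, each followed by the thread IDs in that cgroup.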

What did you expect to happen?

low priority bug

Kepler only captures the cgroup of the thread (sub-process) that happens to create the entry in ProcessStats. This may not be a problem in practice, because in most cases the CgroupID is only compared to 1, to determine whether the process belongs to the kernel.

If the function `func NewProcessStats(pid, cGroupID uint64, containerID, vmID, command string, bpfSupportedMetrics bpf.SupportedMetrics)` creates an entry for a process as seen from user space, then the pid, cGroupID, command, etc. should be those of the parent pid (tgid). Alternatively, should we replace the field cGroupID with isKernelProcess?
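
To make the two options concrete, here is a rough sketch that keys entries by tgid and records both the set of observed cgroup IDs and an `IsKernelProcess` flag. Everything in it is hypothetical and is not Kepler's actual data model: the types `threadSample` and `processEntry`, the function `aggregate`, and the cgroup ID values in `main` are invented for illustration. The only detail taken from this issue is the comparison of the cgroup ID with 1.

```go
package main

import "fmt"

// kernelCgroupID is the value this issue says CgroupID is compared against
// to decide whether a process belongs to the kernel.
const kernelCgroupID = 1

// threadSample is a hypothetical per-thread observation (not a Kepler type).
type threadSample struct {
	TID      uint64
	TGID     uint64
	CgroupID uint64
	Command  string
}

// processEntry is a hypothetical per-process record keyed by tgid.
type processEntry struct {
	PID             uint64              // the tgid, i.e. the PID seen from user space
	Command         string              // command name of the thread-group leader
	IsKernelProcess bool                // true if any sample had the kernel cgroup ID
	CgroupIDs       map[uint64]struct{} // every cgroup ID seen across the threads
}

// aggregate folds per-thread samples into one entry per tgid, so the result
// does not depend on which thread is observed first.
func aggregate(samples []threadSample) map[uint64]*processEntry {
	entries := make(map[uint64]*processEntry)
	for _, s := range samples {
		e, ok := entries[s.TGID]
		if !ok {
			e = &processEntry{PID: s.TGID, CgroupIDs: make(map[uint64]struct{})}
			entries[s.TGID] = e
		}
		// Only the thread-group leader (TID == TGID) supplies the command name.
		if s.TID == s.TGID {
			e.Command = s.Command
		}
		e.CgroupIDs[s.CgroupID] = struct{}{}
		if s.CgroupID == kernelCgroupID {
			e.IsKernelProcess = true
		}
	}
	return entries
}

func main() {
	// Thread IDs and names come from the ps output in this issue;
	// the cgroup IDs are made-up placeholders.
	samples := []threadSample{
		{TID: 12168, TGID: 12152, CgroupID: 5002, Command: "CPU 0/KVM"},
		{TID: 12152, TGID: 12152, CgroupID: 5001, Command: "qemu-system-x86"},
		{TID: 12169, TGID: 12152, CgroupID: 5003, Command: "CPU 1/KVM"},
	}
	for _, e := range aggregate(samples) {
		fmt.Printf("pid=%d command=%s kernel=%v cgroups=%d\n",
			e.PID, e.Command, e.IsKernelProcess, len(e.CgroupIDs))
	}
}
```

With this shape, a vCPU thread being sampled before the leader no longer determines the recorded pid or command, and callers that only need the kernel check can read the boolean instead of comparing a single cgroup ID.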

How can we reproduce it (as minimally and precisely as possible)?

Run Kepler on a host with a running VM, or with any process whose threads can belong to multiple cgroups.

Anything else we need to know?

No response

Kepler image tag

Kubernetes version

```console
$ kubectl version
# paste output here
```

Cloud provider or bare metal

OS version

```console
# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here
```

Install tools

Kepler deployment config

For on kubernetes:

```console
$ KEPLER_NAMESPACE=kepler

# provide kepler configmap
$ kubectl get configmap kepler-cfm -n ${KEPLER_NAMESPACE}
# paste output here

# provide kepler deployment description
$ kubectl describe deployment kepler-exporter -n ${KEPLER_NAMESPACE}
```

For standalone:

# put your Kepler command argument here

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

vimalk78 commented 1 month ago

Cc: @rootfs