Closed rootfs closed 11 months ago
Is this a problem with the BPF probe?
Could you also share the beginning of the log?
Also, try grepping for "active processes" in the log (this needs log level 3).
I also suspect a security requirement on OCP.
Thanks @sunya-ch, here is the log: https://pastebin.com/GGuxjutk
perf stat can also read CPU counters from running processes:
# perf stat -e cycles,cache-misses,instructions -p 47302
^C
Performance counter stats for process id '47302':
1,697,891,077 cycles
10,431,678 cache-misses
1,555,348,029 instructions # 0.92 insn per cycle
2.122812892 seconds time elapsed
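To make the "0.92 insn per cycle" figure in the output above easy to re-check, here is a minimal sketch that parses the counter lines and recomputes the ratio. The function names and the `SAMPLE` constant are made up for illustration; the regex only targets the simple "count event" layout shown in this thread, not every format perf can print.

```python
import re

# Counter lines copied from the perf stat output in this thread.
SAMPLE = """\
Performance counter stats for process id '47302':
1,697,891,077 cycles
10,431,678 cache-misses
1,555,348,029 instructions # 0.92 insn per cycle
2.122812892 seconds time elapsed
"""


def parse_perf_stat(text):
    """Return {event_name: count} for lines shaped like '1,234 event'."""
    counters = {}
    for line in text.splitlines():
        # A comma-grouped integer, whitespace, then the event name.
        # The "seconds time elapsed" line does not match because its
        # number contains a decimal point.
        m = re.match(r"\s*([\d,]+)\s+([\w\-.:]+)", line)
        if m:
            counters[m.group(2)] = int(m.group(1).replace(",", ""))
    return counters


def insn_per_cycle(counters):
    """Instructions retired divided by CPU cycles."""
    return counters["instructions"] / counters["cycles"]


if __name__ == "__main__":
    parsed = parse_perf_stat(SAMPLE)
    print(parsed)
    print(f"{insn_per_cycle(parsed):.2f} insn per cycle")
```

Non-zero counts here are what confirm that the hardware counters themselves work on this node, independent of what Kepler reports.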
Running perf inside the kepler pod didn't work:
[root@kepler-exporter-ngh4b /]# perf stat -e cycles,instructions,cache-misses -p 47302 -a
PID/TID switch overriding SYSTEM
WARNING: Ignored open failure for pid 47302
WARNING: Ignored open failure for pid 47661
WARNING: Ignored open failure for pid 47662
WARNING: Ignored open failure for pid 47667
WARNING: Ignored open failure for pid 47760
WARNING: Ignored open failure for pid 47761
Error:
The sys_perf_event_open() syscall returned with 3 (No such process) for event (cycles).
/bin/dmesg | grep -i perf may provide additional information.
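The error above is errno 3 (ESRCH). One thing that can be ruled out quickly from inside the pod is whether the target PID is visible in the pod's PID namespace at all, and what `perf_event_paranoid` is set to. The sketch below is only a diagnostic aid under that assumption; the helper names are hypothetical, and it does not claim to identify the actual cause seen in this issue.

```python
import errno
import os


def pid_visible(pid, proc_root="/proc"):
    """True if /proc/<pid> exists, i.e. the PID is visible from here."""
    return os.path.isdir(os.path.join(proc_root, str(pid)))


def perf_event_paranoid(path="/proc/sys/kernel/perf_event_paranoid"):
    """Return the kernel's perf_event_paranoid level, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def explain_errno(code):
    """Map a numeric errno to its symbolic name and message."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    target = 47302  # PID taken from the perf command in this thread
    print("errno 3 is", explain_errno(3))
    print(f"pid {target} visible:", pid_visible(target))
    print("perf_event_paranoid:", perf_event_paranoid())
```

If the PID is visible and the paranoid level is permissive, the failure is more likely to come from the kernel or the container's privileges than from namespace isolation.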
After @novacain1 turned off the RT kernel, the BPF metrics came back. So this is specific to RT kernels.
One thing you could try on OpenShift, @rootfs, is to use a debug shell (with cluster-admin, which you have with the kubeconfig), but specify a different image where it is relatively easy to install packages:
oc debug node/hostname.openshift.lab --image=quay.io/fedora/fedora:38
Temporary namespace openshift-debug-qv5wq is created for debugging node...
Starting pod/hostnameopenshiftlab-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.38.140
If you don't see a command prompt, try pressing enter.
sh-5.2# dnf install perf bpftool bpftrace
perf looks to launch here, for me at least.
@rootfs, can you execute any eBPF program on the host when the RT kernel is enabled?
I saw that you can run perf on the host, but it doesn't work within the Kepler container. It's possible that the RT kernel requires additional configuration to be exposed within the Kepler containers.
As @rootfs pointed out before (https://lwn.net/Articles/802884/), eBPF on RT kernels seems to be enabled for kernel >= 5.3.
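A rough way to apply the ">= 5.3" rule of thumb to a node is to parse its `uname -r` string. The sketch below assumes the release string starts with `major.minor` and that RT builds carry an `rt` token; both are heuristics, the function names are made up, and the release strings in the demo are hypothetical examples rather than values from this cluster. Distribution kernels backport features, so the upstream version number alone is not conclusive.

```python
import re


def parse_kernel_release(release):
    """Extract (major, minor) from a `uname -r` style string."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if not m:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(m.group(1)), int(m.group(2))


def looks_realtime(release):
    """Heuristic: an 'rt' or 'rtN' token delimited by . - + or _."""
    return re.search(r"(^|[.\-+_])rt\d*([.\-+_]|$)", release, re.IGNORECASE) is not None


def meets_upstream_minimum(release, minimum=(5, 3)):
    """True if the upstream version is at least `minimum` (default 5.3)."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    # Hypothetical release strings, for illustration only.
    for rel in ("4.18.0-372.40.1.rt7.197.el8_6.x86_64", "5.14.0-284.el9.x86_64"):
        print(rel, "| rt:", looks_realtime(rel), "| >= 5.3:", meets_upstream_minimum(rel))
```

On a live node the input would come from `platform.release()` or `uname -r`.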
Duplicate of #973, closing this one.
What happened?
I am running Kepler on an OCP 4.12 setup from @novacain1 that runs a real-time kernel.
For some reason, both the eBPF CPU time and the perf counter metrics are zero.
But perf stat does show some results.
As a workaround, I use kubelet_cpu_usage as CORE_USAGE_METRIC in the Kepler configmap.
What did you expect to happen?
The BPF stats should be non-zero.
How can we reproduce it (as minimally and precisely as possible)?
Discovered on a bare-metal OCP 4.12 setup that runs a real-time kernel.
Anything else we need to know?
No response
Kepler image tag
Kubernetes version
Cloud provider or bare metal
OS version
Install tools
Kepler deployment config
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)