Open borgerli opened 3 years ago
LXCFS virtualizes cpu utilization according to the cgroup the target process is in. If it's not using a lot of cpu then you won't see anything. Try to create some load by e.g. calling stress with the cpu option inside of the container and you should see an increase.
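For instance, a quick way to generate load inside the container (a sketch, assuming the `stress` tool is installed there):

```sh
# Burn one CPU for 60 seconds inside the container, then re-check top/ps.
stress --cpu 1 --timeout 60
```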
@brauner Thanks for the comment.
Actually, we did run a process that uses a lot of cpu (`while true; do echo test > /dev/null; done &`). And as you suggested, I also tested with `stress` and got the same result: `top` showed ~100 %cpu, but `ps` still showed 0.0 %cpu.
/usr/local/bin/lxcfs -l --enable-cfs --enable-pidfd /var/lib/lxc/lxcfs
docker run -it --name stress -m 128m --cpus=1 --rm \
-v /var/lib/lxc/lxcfs/proc/cpuinfo:/proc/cpuinfo:rw \
-v /var/lib/lxc/lxcfs/proc/diskstats:/proc/diskstats:rw \
-v /var/lib/lxc/lxcfs/proc/meminfo:/proc/meminfo:rw \
-v /var/lib/lxc/lxcfs/proc/stat:/proc/stat:rw \
-v /var/lib/lxc/lxcfs/proc/swaps:/proc/swaps:rw \
-v /var/lib/lxc/lxcfs/proc/loadavg:/proc/loadavg:rw \
-v /var/lib/lxc/lxcfs/proc/uptime:/proc/uptime:rw \
-v /var/lib/lxc/lxcfs/sys/devices/system/cpu/online:/sys/devices/system/cpu/online:rw \
progrium/stress --cpu 1
Get into the container and verify %cpu with `top` (99.6) and `ps` (0.0):
root@borgerli-devcloud:~# docker exec -it $(docker inspect stress -f "{{.Id}}") /bin/bash
root@33bc005fa2d5:/# top -b -n 1
top - 02:34:42 up 4 min, 0 users, load average: 0.28, 0.07, 0.02
Tasks: 4 total, 2 running, 2 sleeping, 0 stopped, 0 zombie
%Cpu(s): 98.5 us, 0.0 sy, 0.0 ni, 1.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 131072 total, 3660 used, 127412 free, 0 buffers
KiB Swap: 0 total, 0 used, 0 free. 4 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7 root 20 0 7316 96 0 R 99.6 0.1 4:03.82 stress
1 root 20 0 7316 896 812 S 0.0 0.7 0:00.02 stress
28 root 20 0 18164 3300 2828 S 0.0 2.5 0:00.02 bash
36 root 20 0 19748 2372 2124 R 0.0 1.8 0:00.00 top
root@33bc005fa2d5:/# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.6 7316 896 pts/0 Ss+ 02:30 0:00 /usr/bin/stress --verbose --
root 7 0.0 0.0 7316 96 pts/0 R+ 02:30 4:10 /usr/bin/stress --verbose --
root 28 0.0 2.5 18164 3300 pts/1 Ss 02:34 0:00 /bin/bash
root 37 0.0 1.5 15576 2064 pts/1 R+ 02:34 0:00 ps aux
![screenshot](https://raw.githubusercontent.com/borgerli/lxcfs-admission-webhook/master/lxc_ps_top.png)
Odd, what happens if you turn off cpu shares, i.e. skip `--enable-cfs`?
@brauner I checked the procps code related to `pcpu` and found the cause of this issue.
As shown in the procps code below, when the lxcfs `uptime` is mounted in a container, `seconds_since_boot` (counted since the container started) will always be less than the process `start_time` (counted since the host booted). As a result, `seconds` is always zero, and therefore `pcpu` is also zero.
https://gitlab.com/procps-ng/procps/-/blob/master/ps/output.c#L525:
seconds = cook_etime(pp);
if(seconds) pcpu = (total_time * 1000ULL / Hertz) / seconds;
https://gitlab.com/procps-ng/procps/-/blob/master/ps/output.c#L136:
#define cook_etime(P) (((unsigned long long)seconds_since_boot >= (P->start_time / Hertz)) ? ((unsigned long long)seconds_since_boot - (P->start_time / Hertz)) : 0)
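To make the mismatch concrete, here is a small sketch (assumed commands, run inside the lxcfs container; the PID is just an example of a busy process) comparing the two values procps subtracts:

```sh
# Compare the lxcfs-virtualized uptime with a process's start_time.
pid=7                                                      # example PID of the busy process
hz=$(getconf CLK_TCK)
uptime_s=$(cut -d' ' -f1 /proc/uptime)                     # container uptime (via lxcfs)
start_s=$(( $(awk '{print $22}' /proc/$pid/stat) / hz ))   # start_time: seconds since *host* boot
echo "container uptime: ${uptime_s}s, process start_time: ${start_s}s"
# Whenever start_s > uptime_s, cook_etime() clamps the elapsed time to 0,
# the `if(seconds)` guard is false, and ps prints %CPU as 0.0.
```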
A workaround is not to mount the lxcfs `proc/uptime` into containers, but then containers lose uptime virtualization.
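For reference, that workaround is a sketch of the same `docker run` as above, just with the uptime bind mount dropped (everything else unchanged):

```sh
# Same as the earlier docker run, but without the /proc/uptime bind mount,
# so ps sees the host uptime and computes %CPU normally.
docker run -it --name stress -m 128m --cpus=1 --rm \
  -v /var/lib/lxc/lxcfs/proc/cpuinfo:/proc/cpuinfo:rw \
  -v /var/lib/lxc/lxcfs/proc/diskstats:/proc/diskstats:rw \
  -v /var/lib/lxc/lxcfs/proc/meminfo:/proc/meminfo:rw \
  -v /var/lib/lxc/lxcfs/proc/stat:/proc/stat:rw \
  -v /var/lib/lxc/lxcfs/proc/swaps:/proc/swaps:rw \
  -v /var/lib/lxc/lxcfs/proc/loadavg:/proc/loadavg:rw \
  -v /var/lib/lxc/lxcfs/sys/devices/system/cpu/online:/sys/devices/system/cpu/online:rw \
  progrium/stress --cpu 1
```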
Is it possible for lxcfs to just return the host uptime when the calling process's `comm` is `ps`?
@brauner I submitted a PR for this issue, please review. Thank you.
PR #445
@brauner Could you please help review the PR?
Hi @borgerli
Sorry for the long delay in responding. We are working on sorting out issues here and there right now.
I have read through your PR and understood the idea. But the question is whether, instead of adding hacks to LXCFS, we could fix the procps utilities so they don't use the uptime value to calculate CPU load, and adjust the algorithm to be similar to what the `top` utility does?
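For illustration, a rough sketch of such a delta-based approach (hypothetical, not an actual procps patch): sample the process's utime+stime twice and divide the difference by the interval, so `/proc/uptime` never enters the calculation:

```sh
# Measure %CPU the way top does: two samples of utime+stime from /proc/<pid>/stat.
pid=7                                                   # example PID
hz=$(getconf CLK_TCK); interval=2
ticks() { awk '{print $14 + $15}' "/proc/$1/stat"; }    # utime + stime, in clock ticks
t1=$(ticks "$pid"); sleep "$interval"; t2=$(ticks "$pid")
echo "%CPU ~ $(( (t2 - t1) * 100 / (hz * interval) ))"
```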
cc @stgraber
Yeah, returning different output based on the command name definitely isn't something I'd want us to do. It's way too hacky and will lead to an undebuggable mess.
Tweaking userspace to be a bit smarter would definitely be easier in this case, especially as there's no way for us to virtualize those per-process files.
Once we get @mihalicyn's work to have lxcfs features per container, you'd also get the ability to turn off uptime virtualization where it remains problematic.
I'm using lxcfs 4.0.7. I created a container with the lxcfs proc files mounted, then kicked off a process: `while true; do echo test > /dev/null; done`. In the container, the `top` command showed the correct %cpu information, while `ps` always showed 0.0. However, when not using lxcfs, `ps` worked well.

Steps

1. Start lxcfs: `/usr/local/bin/lxcfs -l --enable-cfs --enable-pidfd /var/lib/lxc/lxcfs`
2. Start the docker container
3. Test: `top` shows 100.0, while `ps` shows 0.0 for process 16
4. Test without lxcfs: `top` shows 100.0, and `ps` shows 102