opencontainers / runc

CLI tool for spawning and running containers according to the OCI specification
https://www.opencontainers.org/
Apache License 2.0

[Alpine] docker top, runc ps fail with cgroup2 with: unable to get all container pids #4097

Open kholmanskikh opened 11 months ago

kholmanskikh commented 11 months ago

Description

docker top and runc ps fail with:

alpine:~$ docker top 09e847645eec
Error response from daemon: runc did not terminate successfully: exit status 1: unable to get all container pids: read /sys/fs/cgroup/docker/09e847645eec8091d041c27b5ff969825b10155b60ca00230043c87764884135/cgroup.procs: operation not supported
: unknown

~ # runc --root /run/docker/runtime-runc/moby ps 09e847645eec8091d041c27b5ff969825b10155b60ca00230043c87764884135
ERRO[0000] unable to get all container pids: read /sys/fs/cgroup/docker/09e847645eec8091d041c27b5ff969825b10155b60ca00230043c87764884135/cgroup.procs: operation not supported 
~ # 

when the system has cgroup2 mounted as:

alpine:~$ mount|grep cgroup
none on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
alpine:~$ 

and this does not happen when cgroup v1 is mounted (in addition to, or instead of cgroup v2).

The issue was found on Alpine Edge with packages:

alpine:~$ apk list -I|grep -E 'runc|docker|containerd'|sort
containerd-1.7.7-r2 x86_64 {containerd} (Apache-2.0) [installed]
containerd-openrc-1.7.7-r2 x86_64 {containerd} (Apache-2.0) [installed]
docker-24.0.6-r4 x86_64 {docker} (Apache-2.0) [installed]
docker-cli-24.0.6-r4 x86_64 {docker} (Apache-2.0) [installed]
docker-cli-buildx-0.11.2-r3 x86_64 {docker-cli-buildx} (Apache-2.0) [installed]
docker-engine-24.0.6-r4 x86_64 {docker} (Apache-2.0) [installed]
docker-openrc-24.0.6-r4 x86_64 {docker} (Apache-2.0) [installed]
runc-1.1.9-r2 x86_64 {runc} (Apache-2.0) [installed]
alpine:~$ 

Alpine uses openrc, which allows specifying the cgroup mount strategy in /etc/rc.conf:

# This sets the mode used to mount cgroups.
# "hybrid" mounts cgroups version 2 on /sys/fs/cgroup/unified and
# cgroups version 1 on /sys/fs/cgroup.
# "legacy" mounts cgroups version 1 on /sys/fs/cgroup
# "unified" mounts cgroups version 2 on /sys/fs/cgroup
#rc_cgroup_mode="unified"

and the issue mentioned above is observed when rc_cgroup_mode is unified:

alpine:~$ mount|grep cgroup
none on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
alpine:~$ 

and is not observed when it's legacy:

alpine:~$ mount|grep cgroup
cgroup_root on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,relatime,size=10240k,mode=755,inode64)
openrc on /sys/fs/cgroup/openrc type cgroup (rw,nosuid,nodev,noexec,relatime,release_agent=/lib/rc/sh/cgroup-release-agent.sh,name=openrc)
cpuset on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cpu on /sys/fs/cgroup/cpu type cgroup (rw,nosuid,nodev,noexec,relatime,cpu)
cpuacct on /sys/fs/cgroup/cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct)
blkio on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
memory on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
devices on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
freezer on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
net_cls on /sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls)
perf_event on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
net_prio on /sys/fs/cgroup/net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_prio)
hugetlb on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
pids on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
alpine:~$ 

or hybrid:

alpine:~$ mount|grep cgroup
cgroup_root on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,relatime,size=10240k,mode=755,inode64)
openrc on /sys/fs/cgroup/openrc type cgroup (rw,nosuid,nodev,noexec,relatime,release_agent=/lib/rc/sh/cgroup-release-agent.sh,name=openrc)
none on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
cpuset on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cpu on /sys/fs/cgroup/cpu type cgroup (rw,nosuid,nodev,noexec,relatime,cpu)
cpuacct on /sys/fs/cgroup/cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct)
blkio on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
memory on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
devices on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
freezer on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
net_cls on /sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls)
perf_event on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
net_prio on /sys/fs/cgroup/net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_prio)
hugetlb on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
pids on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
alpine:~$ 
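For anyone comparing setups, the three modes can be told apart from the mount listings above. The following is a minimal sketch, not part of openrc or runc: the function name classify_cgroup_mode and the regular expression are illustrative assumptions, and it only understands the "source on target type fstype (options)" line format that mount prints here.

```python
import re

# Matches one line of `mount` output: "<source> on <target> type <fstype> (<options>)".
MOUNT_LINE = re.compile(r"^(\S+) on (\S+) type (\S+) \((.*)\)$")


def classify_cgroup_mode(mount_output: str, root: str = "/sys/fs/cgroup") -> str:
    """Return 'unified', 'hybrid', 'legacy', or 'unknown' for a mount listing.

    Hypothetical helper: the names mirror the rc_cgroup_mode values quoted
    from /etc/rc.conf, but the detection logic is an assumption of this sketch.
    """
    fstype_by_target = {}
    for line in mount_output.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if match:
            _source, target, fstype, _options = match.groups()
            fstype_by_target[target] = fstype

    # cgroup v2 mounted directly on the root: the failing configuration.
    if fstype_by_target.get(root) == "cgroup2":
        return "unified"

    # Any cgroup v1 controller mounted below the root means legacy or hybrid.
    has_v1 = any(
        target.startswith(root + "/") and fstype == "cgroup"
        for target, fstype in fstype_by_target.items()
    )
    if not has_v1:
        return "unknown"
    if fstype_by_target.get(root + "/unified") == "cgroup2":
        return "hybrid"
    return "legacy"
```

Feeding it the output of mount|grep cgroup from each of the three configurations should yield "unified", "legacy" and "hybrid" respectively; shell prompt lines in the pasted text are ignored because they do not match the pattern.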

Steps to reproduce the issue

  1. Start any container with docker run -it --rm <any container>
  2. Execute docker top <container id> or runc --root /run/docker/runtime-runc/moby ps <container id>

Describe the results you received and expected

Expected: the command displays a list of the processes in the container. Received: the "unable to get all container pids" error shown in the description.

What version of runc are you using?

runc version 1.1.9
commit: 82f18fe0e44a59034f3e1f45e475fa5636e539aa
spec: 1.0.2-dev
go: go1.21.3
libseccomp: 2.5.4

Host OS information

NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.19_alpha20230901
PRETTY_NAME="Alpine Linux edge"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://gitlab.alpinelinux.org/alpine/aports/-/issues"

Host kernel information

Linux alpine 6.1.59-0-lts #1-Alpine SMP PREEMPT_DYNAMIC Fri, 20 Oct 2023 06:24:46 +0000 x86_64 Linux

kholmanskikh commented 11 months ago

The issue is reproducible with runc taken from the main git branch.

kolyshkin commented 10 months ago

@kholmanskikh can you please check whether this is caused by the nsdelegate option of the cgroup v2 mount?

kholmanskikh commented 9 months ago

The issue is also reproducible when cgroup2 is mounted without the nsdelegate option:

alpine:~$ mount|grep cgroup
none on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)
alpine:~$ docker run --rm -it -d alpine
2babd8f8f743beea96d6f2fba02de19036e0f734d8c1d249ac694b8ad501f0e6
alpine:~$ docker top 2babd8f8f743beea96d6f2fba02de19036e0f734d8c1d249ac694b8ad501f0e6
Error response from daemon: runc did not terminate successfully: exit status 1: unable to get all container pids: read /sys/fs/cgroup/docker/2babd8f8f743beea96d6f2fba02de19036e0f734d8c1d249ac694b8ad501f0e6/cgroup.procs: operation not supported
: unknown
alpine:~$

ncopa commented 9 months ago

related downstream issues:

It also fails to start containers with the --memory option:

$ docker run --rm -it --memory 2G alpine
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: unable to apply cgroup configuration: cannot enter cgroupv2 "/sys/fs/cgroup/docker" with domain controllers -- it is in domain threaded mode: unknown.

In this case I have a daemon.json:

{
        "storage-driver": "overlay2",
        "cgroup-parent": "/docker"
}

EDIT: but if I use:

{
  "cgroup-parent": "/dockerContainers"
}

It actually works.
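For anyone who wants to script this workaround, here is a minimal sketch. The helper name with_cgroup_parent is hypothetical; it only rewrites the JSON text and keeps every other key, and it assumes the file is plain JSON without comments. Writing the result back to daemon.json (commonly /etc/docker/daemon.json) and restarting the daemon are left to the caller.

```python
import json


def with_cgroup_parent(daemon_json_text: str, parent: str) -> str:
    """Return daemon.json content with "cgroup-parent" set to `parent`.

    Hypothetical helper for illustration; all other keys are preserved.
    An empty input is treated as an empty configuration.
    """
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    if not isinstance(config, dict):
        raise ValueError("daemon.json must contain a JSON object")
    config["cgroup-parent"] = parent
    return json.dumps(config, indent=2) + "\n"
```

Calling with_cgroup_parent(original_text, "/dockerContainers") on the first configuration above keeps "storage-driver" and only swaps the parent, so containers no longer land in the /sys/fs/cgroup/docker directory.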

ncopa commented 9 months ago

Could it be that runc sets docker/cgroup.type to domain threaded?

If I restart the docker daemon, it is initially domain, but after the first container run it changes to domain threaded:

ncopa-desktop:~$ doas /etc/init.d/docker start
 * Starting Docker Daemon ...                                                                 [ ok ]
ncopa-desktop:~$ cat /sys/fs/cgroup/docker/cgroup.type 
domain
ncopa-desktop:~$ docker run --rm alpine echo hello
hello
ncopa-desktop:~$ cat /sys/fs/cgroup/docker/cgroup.type 
domain threaded

Why does the cgroup type end up set to domain threaded?
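Some background that may help with diagnosis: according to the kernel's cgroup v2 documentation, cgroup.type is one of domain, domain threaded, domain invalid, or threaded. A cgroup reports domain threaded when it is the root of a threaded subtree, and reading cgroup.procs inside a threaded cgroup fails with EOPNOTSUPP, which is the "operation not supported" text in the errors above. Whether the container cgroups here really are threaded is an assumption that the sketch below can help check. The function names are hypothetical and the code only reads files.

```python
import os


def cgroup_types(parent: str) -> dict:
    """Map `parent` and each direct child cgroup to its cgroup.type value.

    Hypothetical diagnostic helper; directories without a cgroup.type file
    (for example the root cgroup) are skipped.
    """
    with os.scandir(parent) as entries:
        children = sorted(entry.path for entry in entries if entry.is_dir())

    result = {}
    for path in [parent] + children:
        type_file = os.path.join(path, "cgroup.type")
        try:
            with open(type_file) as handle:
                result[path] = handle.read().strip()
        except FileNotFoundError:
            continue
    return result


def procs_listing_unsupported(cgroup_type: str) -> bool:
    """True for a threaded cgroup, where the kernel rejects reads of cgroup.procs."""
    return cgroup_type == "threaded"
```

Running cgroup_types("/sys/fs/cgroup/docker") while a container is up would show whether the per-container directories report threaded, which would explain why runc ps cannot read their cgroup.procs.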

ncopa commented 9 months ago

I found out that docker itself does not create /sys/fs/cgroup/docker. It is openrc that creates this.

It seems that docker's default cgroup-parent is also docker. I think what happens here is that docker and openrc are stepping on each other's toes.

tbayart commented 8 months ago

Hi, I have the same issue under Portainer. I installed Alpine Linux x64, and when I try to view container stats in Portainer, I get the following error:

"runc did not terminate successfully: exit status 1: unable to get all container pids: read /sys/fs/cgroup/docker/c7fe07c5253dba763ce8fde71945c3a5ac32998ae50dc1345dba7cffd6fab5fa/cgroup.procs: operation not supported: unknown"

I have had many containers running fine for a while now, but I'm unable to get stats.