opencontainers / runc

CLI tool for spawning and running containers according to the OCI specification
https://www.opencontainers.org/
Apache License 2.0

kill command fails with `read cgroup.procs: operation not supported` #3821

Closed amurzeau closed 5 months ago

amurzeau commented 1 year ago

Description

Hi,

While testing buildkit within a docker container, the tests use runc. When trying to kill a runc container, runc errors out with an error like `read /sys/fs/cgroup/buildkit/mxv4shz9kwdm0p5u49mw971ft/cgroup.procs: operation not supported` and then returns exit code 1. The command line is: `runc --root /run/containerd/runc/buildkit --log /tmp/bktest_containerd1141985211/state/io.containerd.runtime.v2.task/buildkit/mxv4shz9kwdm0p5u49mw971ft/log.json --log-format json kill --all mxv4shz9kwdm0p5u49mw971ft 9`

Steps to reproduce the issue

  1. Run the buildkit tests on Debian Unstable with rootful docker (from the docker.io package), inside a container built from the dev-env target of the Dockerfile at the root of the buildkit git repository.

Describe the results you received and expected

Several tests using containerd fail with this error:

time="2023-04-08T18:18:07Z" level=error msg="/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Unknown desc = process \"sh -c cat /dev/urandom | head -c 100 | sha256sum > /randomfile\" did not complete successfully: failed to delete task rkpae5w5k41u61w4vqnuoh6gy: unknown error after kill: runc did not terminate successfully: exit status 1: read /sys/fs/cgroup/buildkit/rkpae5w5k41u61w4vqnuoh6gy/cgroup.procs: operation not supported\n: unknown"
process "sh -c cat /dev/urandom | head -c 100 | sha256sum > /randomfile" did not complete successfully: failed to delete task rkpae5w5k41u61w4vqnuoh6gy: unknown error after kill: runc did not terminate successfully: exit status 1: read /sys/fs/cgroup/buildkit/rkpae5w5k41u61w4vqnuoh6gy/cgroup.procs: operation not supported
: unknown

What version of runc are you using?

runc version v1.1.5 spec: 1.0.2-dev go: go1.20.3 libseccomp: 2.5.4

Host OS information

Host:

PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Container running the dev-env target from the Dockerfile in the buildkit git repository:

NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.17.3
PRETTY_NAME="Alpine Linux v3.17"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://gitlab.alpinelinux.org/alpine/aports/-/issues"

Host kernel information

Linux DOC-PC3 6.1.0-7-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.20-1 (2023-03-19) x86_64 GNU/Linux

amurzeau commented 1 year ago

I found that the issue is that the cgroup is in threaded mode, and in that case, reading cgroup.procs returns ENOTSUP.

By patching runc with the following patch, tests work again and runc doesn't fail:

diff --git a/libcontainer/cgroups/utils.go b/libcontainer/cgroups/utils.go
index b32af4ee..70080efd 100644
--- a/libcontainer/cgroups/utils.go
+++ b/libcontainer/cgroups/utils.go
@@ -19,6 +19,7 @@ import (

 const (
        CgroupProcesses   = "cgroup.procs"
+       CgroupThreads     = "cgroup.threads"
        unifiedMountpoint = "/sys/fs/cgroup"
        hybridMountpoint  = "/sys/fs/cgroup/unified"
 )
@@ -137,14 +138,16 @@ func GetAllSubsystems() ([]string, error) {
 }

 func readProcsFile(dir string) ([]int, error) {
-       f, err := OpenFile(dir, CgroupProcesses, os.O_RDONLY)
+       contents, err := ReadFile(dir, CgroupProcesses)
+       if errors.Is(err, unix.ENOTSUP) {
+               contents, err = ReadFile(dir, CgroupThreads)
+       }
        if err != nil {
                return nil, err
        }
-       defer f.Close()

        var (
-               s   = bufio.NewScanner(f)
+               s   = bufio.NewScanner(strings.NewReader(contents))
                out = []int{}
        )

Here is the type of the cgroups (these commands were run inside buildkit's dev-env container):

# cat /sys/fs/cgroup/buildkit/mxv4shz9kwdm0p5u49mw971ft/cgroup.type
threaded
# cat /sys/fs/cgroup/buildkit/cgroup.type
threaded
# cat /sys/fs/cgroup/cgroup.type
domain threaded
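
The relationship between `cgroup.type` and the failing read can be sketched in Go. This is a minimal illustration, not runc code: `procsReadable` is a hypothetical helper, and it assumes that only the `threaded` type makes a read of `cgroup.procs` fail, as described above. Without arguments it replays the three type strings shown in this comment; with a directory argument it reads that directory's `cgroup.type` file.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// procsReadable reports whether cgroup.procs is expected to be readable for a
// cgroup whose cgroup.type file has the given contents. Assumption: only the
// "threaded" type rejects the read.
func procsReadable(cgroupType string) bool {
	return strings.TrimSpace(cgroupType) != "threaded"
}

func main() {
	if len(os.Args) > 1 {
		// Inspect a real cgroup v2 directory given on the command line.
		data, err := os.ReadFile(filepath.Join(os.Args[1], "cgroup.type"))
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Printf("%s: procsReadable=%v\n", os.Args[1], procsReadable(string(data)))
		return
	}
	// No argument: replay the type strings from the listing above.
	for _, t := range []string{"threaded", "threaded", "domain threaded"} {
		fmt.Printf("type=%q procsReadable=%v\n", t, procsReadable(t))
	}
}
```

A check like this only predicts the failure; it does not decide what a caller should do about it.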

Bacto commented 10 months ago

Hi, I have the same issue (with runc 1.1.10). Having this patch applied to the next version would be awesome!

kolyshkin commented 10 months ago

@Bacto we've changed this part of runc a lot in the main branch. Can you try to repro this using runc compiled from the main branch?

Bacto commented 10 months ago

Hi @kolyshkin,

I tried with the main branch and got the same issue:

# runc -v
runc version 1.1.0+dev
commit: 0c5a735
spec: 1.1.0+dev
go: go1.21.6
libseccomp: 2.5.5

kolyshkin commented 5 months ago

Here is the type of the cgroups (these commands were run inside buildkit's dev-env container):

# cat /sys/fs/cgroup/buildkit/mxv4shz9kwdm0p5u49mw971ft/cgroup.type
threaded
# cat /sys/fs/cgroup/buildkit/cgroup.type
threaded
# cat /sys/fs/cgroup/cgroup.type
domain threaded

So the problem here is the threaded cgroup type. In this case, processes actually belong to the parent cgroup that has the "domain threaded" type (i.e. the top cgroup in this case). It would be incorrect to send SIGKILL to specific threads in this group. So, basically, runc kill does the right thing here by returning an error.
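
To illustrate which cgroup the processes belong to, here is a minimal Go sketch that walks up from a threaded cgroup to the first ancestor whose type is not `threaded`. The names `threadedDomain` and `readType` are hypothetical and not part of runc; the lookup function is injected so the walk can be run against the hierarchy quoted above without a real cgroupfs.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// readType returns the trimmed contents of dir/cgroup.type.
func readType(dir string) (string, error) {
	data, err := os.ReadFile(filepath.Join(dir, "cgroup.type"))
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(data)), nil
}

// threadedDomain walks up from dir and returns the first cgroup whose type is
// not "threaded". typeOf is injected so the walk works on fake hierarchies.
func threadedDomain(dir string, typeOf func(string) (string, error)) (string, error) {
	for {
		t, err := typeOf(dir)
		if err != nil {
			return "", err
		}
		if t != "threaded" {
			return dir, nil
		}
		parent := filepath.Dir(dir)
		if parent == dir {
			return "", errors.New("reached the filesystem root inside a threaded subtree")
		}
		dir = parent
	}
}

func main() {
	typeOf := readType
	start := ""
	if len(os.Args) > 1 {
		start = os.Args[1]
	} else {
		// In-memory stand-in mirroring the hierarchy quoted above.
		types := map[string]string{
			"/sys/fs/cgroup":          "domain threaded",
			"/sys/fs/cgroup/buildkit": "threaded",
			"/sys/fs/cgroup/buildkit/mxv4shz9kwdm0p5u49mw971ft": "threaded",
		}
		typeOf = func(d string) (string, error) {
			t, ok := types[d]
			if !ok {
				return "", os.ErrNotExist
			}
			return t, nil
		}
		start = "/sys/fs/cgroup/buildkit/mxv4shz9kwdm0p5u49mw971ft"
	}
	dom, err := threadedDomain(start, typeOf)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("processes belong to:", dom)
}
```

For the hierarchy in this issue, the walk ends at the top cgroup, which is why signalling by the container's own cgroup is not meaningful here.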

This is some kind of a misconfiguration, possibly caused by buildkit.

kolyshkin commented 5 months ago

Created a Debian 12 VM, checked out buildkit, and ran its test suite inside a container (make test). I was not able to reproduce the issue.

I think there was something wrong originally when starting a container.

I would still like to get to the bottom of it, so any suggestions on how to reproduce it (ideally a Vagrantfile or something similar) are welcome.

amurzeau commented 5 months ago

The issue is fixed in the main branch. I tried version 1.1.5 again and reproduced it, but I cannot reproduce it with the main branch of runc.

I tried to find the first fixed version: I can reproduce the issue with 1.1.12, but no longer with 1.2.0-rc.1.

So I'm closing this issue.

For reference, I'm using `go test -v -run ^TestIntegration/TestDiffSingleLayer.*$ github.com/moby/buildkit/client -count=1` to run the affected tests in buildkit, with the runc under test installed at /usr/bin/runc.

cathaysia commented 4 months ago

After upgrading runc, docker still reports this error:

# docker info
 Runtimes: runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: ae07eda36dd25f8a1b98dfbf587313b99c0190bb
 runc version: v1.2.0-rc.1-0-g275e6d85

error:

java.io.IOException: Failed to run top 'a77dc1be093aebb8a8f18fd634adc2ebbf2d798a7c7e8e7aa5283770b1efd9b6'. Error: Error response from daemon: runc did not terminate successfully: exit status 1: unable to get all container pids: read /sys/fs/cgroup/docker/a77dc1be093aebb8a8f18fd634adc2ebbf2d798a7c7e8e7aa5283770b1efd9b6/cgroup.procs: operation not supported
# cat /sys/fs/cgroup/docker/a77dc1be093aebb8a8f18fd634adc2ebbf2d798a7c7e8e7aa5283770b1efd9b6/cgroup.type
threaded

cathaysia commented 4 months ago

I tried 8256a9384fa8c44aa30b3ed948e7c3e34b19b89a, which fixes this problem.

kolyshkin commented 4 months ago

I tried 8256a93, which fixes this problem.

I guess you quoted a wrong commit.

kolyshkin commented 4 months ago

@amurzeau could you do git-bisect to find which runc commit fixes it?

amurzeau commented 4 months ago

The first commit without the issue is f8ad20f500bf75edd86041657ee762bce116f8f5. The previous one, 9583b3d1c297021109081872c52302316ede15b1, still causes the same failure.

The failure occurs with this stack trace:

runtime/debug.Stack()
        /usr/local/go/src/runtime/debug/stack.go:24 +0x65
github.com/opencontainers/runc/libcontainer/cgroups.readProcsFile({0xc0000f2d40?, 0xc0000f30c0?})
        /tmp/runc/libcontainer/cgroups/utils.go:166 +0x372
github.com/opencontainers/runc/libcontainer/cgroups.GetAllPids.func1({0xc0000f2d40, 0x31}, {0x399dc0?, 0xc0001695b0?}, {0x0?, 0x0?})
        /tmp/runc/libcontainer/cgroups/getallpids.go:19 +0x79
path/filepath.walkDir({0xc0000f2d40, 0x31}, {0x399dc0, 0xc0001695b0}, 0xc000135180)
        /usr/local/go/src/path/filepath/path.go:445 +0x5c
path/filepath.WalkDir({0xc0000f2d40, 0x31}, 0xc000135180)
        /usr/local/go/src/path/filepath/path.go:535 +0xb0
github.com/opencontainers/runc/libcontainer/cgroups.GetAllPids({0xc0000f2d40?, 0x6?})
        /tmp/runc/libcontainer/cgroups/getallpids.go:12 +0x4e
github.com/opencontainers/runc/libcontainer/cgroups/fs2.(*Manager).GetAllPids(0xc0000feb60?)
        /tmp/runc/libcontainer/cgroups/fs2/fs2.go:92 +0x25
github.com/opencontainers/runc/libcontainer.signalAllProcesses({0x39d1c0, 0xc0000feb60}, 0x0?)
        /tmp/runc/libcontainer/init_linux.go:583 +0xad
github.com/opencontainers/runc/libcontainer.(*Container).Signal(0xc0000bb220, {0x398770?, 0xaaa8a8}, 0x1)
        /tmp/runc/libcontainer/container_linux.go:383 +0x265
main.glob..func7(0xc0000c8580)
        /tmp/runc/kill.go:52 +0x113
github.com/urfave/cli.HandleAction({0x2467a0?, 0x323b50?}, 0x4?)
        /tmp/runc/vendor/github.com/urfave/cli/app.go:524 +0x50
github.com/urfave/cli.Command.Run({{0x2dfe61, 0x4}, {0x0, 0x0}, {0x0, 0x0, 0x0}, {0x307052, 0x52}, {0x0, ...}, ...}, ...)
        /tmp/runc/vendor/github.com/urfave/cli/command.go:175 +0x67b
github.com/urfave/cli.(*App).Run(0xc0000ea380, {0xc0000b4000, 0xb, 0xb})
        /tmp/runc/vendor/github.com/urfave/cli/app.go:277 +0xb87
main.main()
        /tmp/runc/main.go:165 +0x1208

The commit that fixes the issue (f8ad20f500bf75edd86041657ee762bce116f8f5) removes the call to `c.ignoreCgroupError(signalAllProcesses(c.cgroupManager, sig))`, which was part of the stack trace.

I think this can be reproduced with this bundle: runctest_no_pid_namespace.tar.gz

To test, run `cd runctest && ./test.sh`. The bundle's rootfs just contains a busybox binary at usr/bin/sh and usr/bin/sleep, with its linker dependency if needed (/lib/ld-whatever.so).

Running `runc kill --all` yields this error at commit 9583b3d1c297021109081872c52302316ede15b1:

ERRO[0000] read /sys/fs/cgroup/buildkit/runctest/cgroup.procs: operation not supported

buildkit / containerd use a pid namespace, so after f8ad20f500bf75edd86041657ee762bce116f8f5, signalAllProcesses is no longer called. But without a pid namespace (as in my runctest test), it is still called: https://github.com/opencontainers/runc/blob/f8ad20f500bf75edd86041657ee762bce116f8f5/libcontainer/container_linux.go#L386-L388

So this still triggers the error (which is why I'm not sure the commit really fixes the issue):

ERRO[0000] unable to signal init: read /sys/fs/cgroup/buildkit/runctest/cgroup.procs: operation not supported

Note: I'm running this test in a docker rootful container.
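
For anyone who needs to tell this failure apart from other errors in Go code, here is a minimal sketch using only the standard library. `isThreadedReadError` and `sampleError` are hypothetical names, and the sample error is hand-built to resemble the log line above rather than produced by runc. It assumes the underlying errno is EOPNOTSUPP, which has the same value as ENOTSUP on Linux.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// isThreadedReadError reports whether err wraps EOPNOTSUPP, the errno assumed
// to be behind "operation not supported" in the messages above.
func isThreadedReadError(err error) bool {
	return errors.Is(err, syscall.EOPNOTSUPP)
}

// sampleError builds an error shaped like the log line, for demonstration only.
func sampleError() error {
	inner := &os.PathError{
		Op:   "read",
		Path: "/sys/fs/cgroup/buildkit/runctest/cgroup.procs",
		Err:  syscall.EOPNOTSUPP,
	}
	return fmt.Errorf("unable to signal init: %w", inner)
}

func main() {
	err := sampleError()
	fmt.Println(err)
	fmt.Println("threaded-cgroup read error:", isThreadedReadError(err))
	fmt.Println("unrelated error matches:", isThreadedReadError(os.ErrNotExist))
}
```

Because `errors.Is` unwraps both the `%w` wrapper and the `*os.PathError`, the check keeps working if further context is added around the error.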