davidchisnall opened this issue 3 years ago
@kit-ty-kate, I took my minimal Dockerfile from one that you'd filed in an issue on the moby port, so it's presumably worked for you in the past?
Yes it worked in the past (back in July)
I'm also getting this issue now. One thing worth noting: if you ^C docker build and then restart the job, it will finish normally, even with --no-cache.
I haven't had the time to take a look at @gizahNL's moby port at all yet, so I'm not going to be a ton of help in debugging whether there's an issue at that layer or not. I do see the following log line which makes me think that moby might be receiving an exit but not processing it, though this is definitely not conclusive:
DEBU[2021-11-04T13:00:30.600271190Z] event module=libcontainerd namespace=moby topic=/tasks/exit
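For context, a consumer like moby learns about task exits by subscribing to that topic on containerd's event service. A minimal sketch using the containerd Go client (the socket path is an assumption for the sketch; this is not moby's actual libcontainerd code):

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/containerd/containerd"
)

func main() {
	// Assumed default containerd socket path for this sketch.
	client, err := containerd.New("/run/containerd/containerd.sock")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Subscribe to exit events only; the consumer then has to match each
	// envelope back to a task and mark it as exited. If that matching or
	// processing fails, the task looks alive forever.
	ch, errs := client.Subscribe(context.Background(), `topic=="/tasks/exit"`)
	for {
		select {
		case env := <-ch:
			fmt.Printf("namespace=%s topic=%s\n", env.Namespace, env.Topic)
		case err := <-errs:
			log.Fatal(err)
		}
	}
}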
These lines from the containerd log (really from the shim) seem to indicate that the container exited:
time="2021-11-04T13:00:30.533109242Z" level=warning msg="entrypoint waiting!" pid=88058 runtime=wtf.sbk.runj.v1 state="&{1.0.2-runj-dev ac1d3703e32454d690b89bef70daafcb2d78d78e208dbf99091b91f0cf8f43b8 created 88058 /run/containerd/io.containerd.runtime.v2.task/moby/ac1d3703e32454d690b89bef70daafcb2d78d78e208dbf99091b91f0cf8f43b8 map[]}"
[...]
time="2021-11-04T13:00:30.570951582Z" level=warning msg="START runj" runtime=wtf.sbk.runj.v1 state="&{1.0.2-runj-dev ac1d3703e32454d690b89bef70daafcb2d78d78e208dbf99091b91f0cf8f43b8 created 88058 /run/containerd/io.containerd.runtime.v2.task/moby/ac1d3703e32454d690b89bef70daafcb2d78d78e208dbf99091b91f0cf8f43b8 map[]}"
[...]
time="2021-11-04T13:00:30.575246698Z" level=warning msg="PROCESSING EXIT!" pid=88058 runtime=wtf.sbk.runj.v1 status=0
time="2021-11-04T13:00:30.575310098Z" level=warning msg="INIT EXITED!" pid=88058 runtime=wtf.sbk.runj.v1
In these logs you can see that pid 88058 was the main process for the container. This is initially the runj-entrypoint program, but after the Start API is called (which is shown in the logs) it gets signalled (via a fifo) and calls unix.Exec (which wraps execve(2)) with the command specified in the bundle.
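For illustration, here's a minimal sketch of that fifo-gated exec pattern (the fifo path and argument handling are placeholders, not runj's actual code):

package main

import (
	"os"

	"golang.org/x/sys/unix"
)

func main() {
	// The command from the bundle's process.args (placeholder handling).
	args := os.Args[1:]

	// Opening the fifo read-only blocks until the writer side is opened,
	// which is how the Start call "signals" the entrypoint to proceed.
	f, err := os.Open("/path/to/exec.fifo") // placeholder path
	if err != nil {
		os.Exit(1)
	}
	f.Close()

	// unix.Exec wraps execve(2): the entrypoint replaces itself with the
	// container's command, so the same pid remains the main jail process.
	if err := unix.Exec(args[0], args, os.Environ()); err != nil {
		os.Exit(1)
	}
}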
I did just make changes to how runj-entrypoint works (see https://github.com/samuelkarp/runj/commit/398df9a0deea637af3062df1cf846a8d5d11bd49), so if it used to work before that change and now doesn't, that'd be a pretty good indicator that my commit is buggy.
It looks as if it's tracking two processes in the jail.
The shim sets itself up as a subreaper with PROC_REAP_ACQUIRE, so it should be tracking every subprocess that is not otherwise reaped by a parent. (This is because the process hierarchy looks like containerd -> containerd-shim-runj-v1 -> runj (create) -> [main jail process], and runj exits after the runj create is issued.) The only process it actually cares about is the runj-entrypoint/main jail process, but the other output is still there since I find a lot of debugging output helps me when I'm developing.
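For reference, acquiring subreaper status on FreeBSD looks roughly like this (an illustrative sketch using raw procctl(2) through golang.org/x/sys/unix, not runj's actual implementation):

package main

import (
	"log"
	"os"

	"golang.org/x/sys/unix"
)

// Values from FreeBSD's <sys/procctl.h>, defined locally for the sketch.
const (
	pPID            = 1 // idtype P_PID
	procReapAcquire = 2 // PROC_REAP_ACQUIRE
)

func main() {
	// procctl(P_PID, getpid(), PROC_REAP_ACQUIRE, NULL): orphaned
	// descendants are now reparented to this process instead of init,
	// so it can observe the exit of everything spawned in the jail.
	if _, _, errno := unix.Syscall6(unix.SYS_PROCCTL,
		uintptr(pPID), uintptr(os.Getpid()),
		uintptr(procReapAcquire), 0, 0, 0); errno != 0 {
		log.Fatal(errno)
	}
	// ...spawn the container process, then reap exits with wait4(2)...
}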
Anyway, I think it'd help to try the following things:

1. Explore the ctr commands to see containerd's view of the container during this: ctr -n moby container ls, ctr -n moby container info ac1d3703e32454d690b89bef70daafcb2d78d78e208dbf99091b91f0cf8f43b8 (pulling that from the log above), ctr -n moby task ls, etc.
2. sudo runj state ac1d3703e32454d690b89bef70daafcb2d78d78e208dbf99091b91f0cf8f43b8
3. Revert the commit I linked above and see if that fixes it. If so, we'll need to figure out what I screwed up (and the other steps will still be helpful).
Yup, that fixes it! I rolled back to a37cfff and docker build worked.
Explore the ctr commands to see containerd's view of the container during this. ctr -n moby container ls, ctr -n moby container info ac1d3703e32454d690b89bef70daafcb2d78d78e208dbf99091b91f0cf8f43b8 (pulling that from the log above), ctr -n moby task ls, etc
It looks as if the stopped containers can't be cleaned up by containerd, and I couldn't find a graceful way of doing it, so I uninstalled containerd and docker-engine, deleted everything that looked relevant (/var/lib/containerd and the zroot/docker filesystem), then reinstalled and restarted. So, doing everything from scratch with the broken version:
$ docker build .
Sending build context to Docker daemon 2.048kB
Step 1/2 : FROM kwiat/freebsd:13.0-RELEASE
13.0-RELEASE: Pulling from kwiat/freebsd
6a487bee48e0: Pull complete
Digest: sha256:23d315d0f15df632d57c8cfec9dddde4451d695f585063a70a7a10fee3c10ebf
Status: Downloaded newer image for kwiat/freebsd:13.0-RELEASE
---> e122726568cf
Step 2/2 : RUN echo hello
---> Running in 29c6bc0cf571
hello
This hangs, as previously. From another terminal:
# ctr -n moby container ls
CONTAINER IMAGE RUNTIME
29c6bc0cf571c6587f9670916a08917aae21c58e728a1d45c1124bbb7abc204a - wtf.sbk.runj.v1
# ctr -n moby container info 29c6bc0cf571c6587f9670916a08917aae21c58e728a1d45c1124bbb7abc204a
{
"ID": "29c6bc0cf571c6587f9670916a08917aae21c58e728a1d45c1124bbb7abc204a",
"Labels": {
"com.docker/engine.bundle.path": "/var/run/docker/containerd/29c6bc0cf571c6587f9670916a08917aae21c58e728a1d45c1124bbb7abc204a"
},
"Image": "",
"Runtime": {
"Name": "wtf.sbk.runj.v1",
"Options": null
},
"SnapshotKey": "",
"Snapshotter": "",
"CreatedAt": "2021-11-05T11:28:00.547108221Z",
"UpdatedAt": "2021-11-05T11:28:00.547108221Z",
"Extensions": null,
"Spec": {
"ociVersion": "1.0.2-dev",
"process": {
"user": {
"uid": 0,
"gid": 0
},
"args": [
"/bin/sh",
"-c",
"echo hello"
],
"env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"HOSTNAME=29c6bc0cf571"
],
"cwd": "/",
"capabilities": {
"bounding": [
"CAP_CHOWN",
"CAP_DAC_OVERRIDE",
"CAP_FSETID",
"CAP_FOWNER",
"CAP_MKNOD",
"CAP_NET_RAW",
"CAP_SETGID",
"CAP_SETUID",
"CAP_SETFCAP",
"CAP_SETPCAP",
"CAP_NET_BIND_SERVICE",
"CAP_SYS_CHROOT",
"CAP_KILL",
"CAP_AUDIT_WRITE"
],
"effective": [
"CAP_CHOWN",
"CAP_DAC_OVERRIDE",
"CAP_FSETID",
"CAP_FOWNER",
"CAP_MKNOD",
"CAP_NET_RAW",
"CAP_SETGID",
"CAP_SETUID",
"CAP_SETFCAP",
"CAP_SETPCAP",
"CAP_NET_BIND_SERVICE",
"CAP_SYS_CHROOT",
"CAP_KILL",
"CAP_AUDIT_WRITE"
],
"inheritable": [
"CAP_CHOWN",
"CAP_DAC_OVERRIDE",
"CAP_FSETID",
"CAP_FOWNER",
"CAP_MKNOD",
"CAP_NET_RAW",
"CAP_SETGID",
"CAP_SETUID",
"CAP_SETFCAP",
"CAP_SETPCAP",
"CAP_NET_BIND_SERVICE",
"CAP_SYS_CHROOT",
"CAP_KILL",
"CAP_AUDIT_WRITE"
],
"permitted": [
"CAP_CHOWN",
"CAP_DAC_OVERRIDE",
"CAP_FSETID",
"CAP_FOWNER",
"CAP_MKNOD",
"CAP_NET_RAW",
"CAP_SETGID",
"CAP_SETUID",
"CAP_SETFCAP",
"CAP_SETPCAP",
"CAP_NET_BIND_SERVICE",
"CAP_SYS_CHROOT",
"CAP_KILL",
"CAP_AUDIT_WRITE"
]
},
"oomScoreAdj": 0
},
"root": {
"path": "/var/lib/docker/zfs/graph/9b78e7b43c91"
},
"hostname": "29c6bc0cf571",
"mounts": [
{
"destination": "/proc",
"type": "procfs",
"source": "proc",
"options": [
"nosuid",
"noexec"
]
},
{
"destination": "/dev",
"type": "devfs",
"source": "devfs"
}
],
"hooks": {
"createRuntime": [
{
"path": "/bin/cp",
"args": [
"/var/lib/docker/containers/29c6bc0cf571c6587f9670916a08917aae21c58e728a1d45c1124bbb7abc204a/resolv.conf",
"/var/lib/docker/zfs/graph/9b78e7b43c91/etc/resolv.conf"
]
},
{
"path": "/bin/cp",
"args": [
"/var/lib/docker/containers/29c6bc0cf571c6587f9670916a08917aae21c58e728a1d45c1124bbb7abc204a/hostname",
"/var/lib/docker/zfs/graph/9b78e7b43c91/etc/hostname"
]
},
{
"path": "/bin/cp",
"args": [
"/var/lib/docker/containers/29c6bc0cf571c6587f9670916a08917aae21c58e728a1d45c1124bbb7abc204a/hosts",
"/var/lib/docker/zfs/graph/9b78e7b43c91/etc/hosts"
]
}
]
},
"linux": {
"resources": {
"devices": [
{
"allow": false,
"access": "rwm"
},
{
"allow": true,
"type": "c",
"major": 1,
"minor": 5,
"access": "rwm"
},
{
"allow": true,
"type": "c",
"major": 1,
"minor": 3,
"access": "rwm"
},
{
"allow": true,
"type": "c",
"major": 1,
"minor": 9,
"access": "rwm"
},
{
"allow": true,
"type": "c",
"major": 1,
"minor": 8,
"access": "rwm"
},
{
"allow": true,
"type": "c",
"major": 5,
"minor": 0,
"access": "rwm"
},
{
"allow": true,
"type": "c",
"major": 5,
"minor": 1,
"access": "rwm"
},
{
"allow": false,
"type": "c",
"major": 10,
"minor": 229,
"access": "rwm"
}
],
"memory": {},
"cpu": {
"shares": 0
},
"blockIO": {
"weight": 0
}
},
"namespaces": [
{
"type": "mount"
},
{
"type": "network"
},
{
"type": "uts"
},
{
"type": "pid"
},
{
"type": "ipc"
}
],
"maskedPaths": [
"/proc/asound",
"/proc/acpi",
"/proc/kcore",
"/proc/keys",
"/proc/latency_stats",
"/proc/timer_list",
"/proc/timer_stats",
"/proc/sched_debug",
"/proc/scsi",
"/sys/firmware"
],
"readonlyPaths": [
"/proc/bus",
"/proc/fs",
"/proc/irq",
"/proc/sys",
"/proc/sysrq-trigger"
]
}
}
}
It looks as if the moby build is still assuming Linux things and needs to generate a FreeBSD container config when running on FreeBSD.
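For comparison, a FreeBSD-appropriate spec built with the runtime-spec Go types might look like the sketch below: the same process, root, hostname, and devfs mount as the dump above, but with the capabilities/namespaces/linux sections simply left out. (Illustrative only; this is not moby's code.)

package main

import (
	"encoding/json"
	"os"

	specs "github.com/opencontainers/runtime-spec/specs-go"
)

func main() {
	spec := specs.Spec{
		Version: specs.Version,
		Process: &specs.Process{
			Args: []string{"/bin/sh", "-c", "echo hello"},
			Env: []string{
				"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
				"HOSTNAME=29c6bc0cf571",
			},
			Cwd: "/",
		},
		Root:     &specs.Root{Path: "/var/lib/docker/zfs/graph/9b78e7b43c91"},
		Hostname: "29c6bc0cf571",
		Mounts: []specs.Mount{
			{Destination: "/dev", Type: "devfs", Source: "devfs"},
		},
		// No capabilities, namespaces, or Linux resources section: those
		// are the linuxisms in the dump above that runj has to ignore.
	}
	json.NewEncoder(os.Stdout).Encode(&spec)
}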
# ctr -n moby task ls
TASK PID STATUS
29c6bc0cf571c6587f9670916a08917aae21c58e728a1d45c1124bbb7abc204a 0 UNKNOWN
This is suspicious: not sure why it thinks the pid is 0, since pid 0 is the kernel. I presume 0 is a placeholder here for pid-not-known?
# runj state 29c6bc0cf571c6587f9670916a08917aae21c58e728a1d45c1124bbb7abc204a
Error: open /var/lib/runj/jails/29c6bc0cf571c6587f9670916a08917aae21c58e728a1d45c1124bbb7abc204a/state.json: no such file or directory
The /var/lib/runj/jails/ directory is empty.
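That error is consistent with runj state simply reading a per-jail state file; presumably it does something like the sketch below (the layout is inferred from the error message above, not taken from runj's source):

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	id := os.Args[1]
	// Inferred from the error message: state for each jail lives in
	// /var/lib/runj/jails/<id>/state.json.
	path := filepath.Join("/var/lib/runj/jails", id, "state.json")
	data, err := os.ReadFile(path)
	if err != nil {
		// The failure mode seen above: the state directory has already
		// been removed, so there is nothing left to report.
		fmt.Fprintln(os.Stderr, "Error:", err)
		os.Exit(1)
	}
	fmt.Println(string(data))
}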
# ps aux | grep runj
root 22779 0.0 0.0 267772 2524 1 S+ 11:34 0:00.00 grep runj
root 22733 0.0 0.0 983656 12796 5 I 11:28 0:00.16 /usr/local/bin/containerd-shim-runj-v1 -namespace moby -id 29c6bc0cf571c6587f9670916a08917aae21
So it looks as if the runj shim is still running but it has cleaned up all of the associated state?
I'm not 100% sure whether the cause is the new code, or whether it's a race condition that the new code just triggers more of the time. I had a container hang after trying to run ifconfig in a Dockerfile RUN line to see if I could diagnose the problems with the moby network bits.
Restarting dockerd causes it to loop doing this:
DEBU[2021-11-05T12:02:58.356143623Z] shutting down container considered alive by containerd container=549f35b8a14f7f5e3a11a8ce9ad22e633e4192c19004c8c879f195fcb98e6c5b paused=false restarting=false running=true
DEBU[2021-11-05T12:02:58.356166123Z] Sending kill signal 15 to container 549f35b8a14f7f5e3a11a8ce9ad22e633e4192c19004c8c879f195fcb98e6c5b
INFO[2021-11-05T12:03:08.402961299Z] Container failed to exit within 10 seconds of signal 15 - using the force container=7d3653e7ba07bc8694750d7d622f8d3fcc9ff602e9b50b3e57dee62f05d3b80d
DEBU[2021-11-05T12:03:08.403085800Z] Sending kill signal 9 to container 7d3653e7ba07bc8694750d7d622f8d3fcc9ff602e9b50b3e57dee62f05d3b80d
INFO[2021-11-05T12:03:08.403193100Z] Container failed to exit within 10 seconds of signal 15 - using the force container=549f35b8a14f7f5e3a11a8ce9ad22e633e4192c19004c8c879f195fcb98e6c5b
DEBU[2021-11-05T12:03:08.403249901Z] Sending kill signal 9 to container 549f35b8a14f7f5e3a11a8ce9ad22e633e4192c19004c8c879f195fcb98e6c5b
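The escalation in that log maps onto the containerd task API roughly like this (a sketch, not dockerd's actual implementation):

package stopsketch

import (
	"context"
	"syscall"
	"time"

	"github.com/containerd/containerd"
)

// stopTask sketches the shutdown behavior in the log above: send SIGTERM,
// wait up to 10 seconds for the exit event, then fall back to SIGKILL. If
// the shim never delivers the exit event, the timeout branch fires every
// time, which is exactly the loop dockerd is stuck in.
func stopTask(ctx context.Context, task containerd.Task) error {
	statusC, err := task.Wait(ctx)
	if err != nil {
		return err
	}
	if err := task.Kill(ctx, syscall.SIGTERM); err != nil {
		return err
	}
	select {
	case <-statusC:
		return nil
	case <-time.After(10 * time.Second):
		return task.Kill(ctx, syscall.SIGKILL)
	}
}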
dockerd is right that containerd thinks the containers are running, and there is a shim instance for each:
# ctr -n moby containers ls
CONTAINER IMAGE RUNTIME
549f35b8a14f7f5e3a11a8ce9ad22e633e4192c19004c8c879f195fcb98e6c5b - wtf.sbk.runj.v1
7d3653e7ba07bc8694750d7d622f8d3fcc9ff602e9b50b3e57dee62f05d3b80d - wtf.sbk.runj.v1
# ps aux | grep runj
root 52084 0.0 0.0 266748 2436 1 R+ 12:05 0:00.00 grep runj
root 50483 0.0 0.0 983400 13000 5 I 11:56 0:00.23 /usr/local/bin/containerd-shim-runj-v1 -namespace moby -id 7d3653e7ba07bc8694750d7d622f8d3fcc9f
root 51150 0.0 0.0 984808 12560 5 I 11:59 0:00.22 /usr/local/bin/containerd-shim-runj-v1 -namespace moby -id 549f35b8a14f7f5e3a11a8ce9ad22e633e41
But in this situation the same thing has happened: /var/lib/runj/jails is empty.
Yeah, moby writes out a default config containing many Linuxisms; they're all ignored by runj and containerd, though.
This isn't a problem with runj itself. In my work-in-progress port of buildah, I tried this example build:
FROM kwiat/freebsd:13.0-RELEASE
RUN echo Hello
RUN exit 42
and got the expected results:
STEP 1/3: FROM kwiat/freebsd:13.0-RELEASE
STEP 2/3: RUN echo Hello
Hello
STEP 3/3: RUN exit 42
error building at STEP "RUN exit 42": error while running runtime: exit status 42
I'm not sure if this is a problem with runj, the containerd shim, or with the moby port (from https://github.com/gizahNL/moby), but if I write this minimal Dockerfile:

FROM kwiat/freebsd:13.0-RELEASE
RUN echo hello

running docker build prints hello but doesn't detect that the command has completed: it hangs at that point and does not advance.

docker --debug produces this output:

[...]

At the same time, containerd -l debug produces this:

[...]

It looks as if it's tracking two processes in the jail. I'm not sure why the second exits first (maybe they exit at about the same time and are reported in different orders?). At the end of this, I would expect runj to report that the command finished.

If I restart dockerd, it notices that it has a container that appears to be running that it doesn't know about and tries to kill it:

[...]

This triggers the following output from containerd:

[...]

Again, it looks as if runj is generating a log message indicating that the process has exited, but isn't reporting the exit to containerd?

I can kill the jails with jail -r.