rootless-containers / rootlesskit

Linux-native "fake root" for implementing rootless containers
Apache License 2.0

Stopping and starting the container daemon when containers are running fails #355

Open nachevn opened 1 year ago

nachevn commented 1 year ago

To be honest, I'm not even sure whether this issue is directly related to rootlesskit or rather to containerd or Docker itself. If it isn't, please let me know and I will move it. I created the issue here because /usr/share/docker.io/contrib/dockerd-rootless.sh ends by starting a rootlesskit process, which initializes the Docker daemon and containerd processes.

    export _DOCKERD_ROOTLESS_CHILD
    # Re-exec the script via RootlessKit, so as to create unprivileged {user,mount,network} namespaces.
    #
    # --copy-up allows removing/creating files in the directories by creating tmpfs and symlinks
    # * /etc: copy-up is required so as to prevent `/etc/resolv.conf` in the
    #         namespace from being unexpectedly unmounted when `/etc/resolv.conf` is recreated on the host
    #         (by either systemd-networkd or NetworkManager)
    # * /run: copy-up is required so that we can create /run/docker (hardcoded for plugins) in our namespace
    exec $rootlesskit \
        --net=$net --mtu=$mtu \
        --slirp4netns-sandbox=$DOCKERD_ROOTLESS_ROOTLESSKIT_SLIRP4NETNS_SANDBOX \
        --slirp4netns-seccomp=$DOCKERD_ROOTLESS_ROOTLESSKIT_SLIRP4NETNS_SECCOMP \
        --disable-host-loopback --port-driver=$DOCKERD_ROOTLESS_ROOTLESSKIT_PORT_DRIVER \
        --copy-up=/etc --copy-up=/run \
        --propagation=rslave \
        $DOCKERD_ROOTLESS_ROOTLESSKIT_FLAGS \
        $0 $@

My Setup

Rootless Docker setup on Debian 11. Tools/versions:

Issue description

When I try to start the Docker daemon with systemctl --user start docker.service, it fails with the following log output:

docker-rootless systemd[552]: Started Docker Application Container Engine (Rootless).
docker-rootless dockerd-rootless.sh[40009]: + [ -w /run/user/1000 ]
docker-rootless dockerd-rootless.sh[40009]: + [ -w /home/user.name ]
docker-rootless dockerd-rootless.sh[40009]: + rootlesskit=
docker-rootless dockerd-rootless.sh[40009]: + which docker-rootlesskit
docker-rootless dockerd-rootless.sh[40009]: + which rootlesskit
docker-rootless dockerd-rootless.sh[40009]: + rootlesskit=rootlesskit
docker-rootless dockerd-rootless.sh[40009]: + break
docker-rootless dockerd-rootless.sh[40009]: + [ -z rootlesskit ]
docker-rootless dockerd-rootless.sh[40009]: + :
docker-rootless dockerd-rootless.sh[40009]: + :
docker-rootless dockerd-rootless.sh[40009]: + : builtin
docker-rootless dockerd-rootless.sh[40009]: + : auto
docker-rootless dockerd-rootless.sh[40009]: + : auto
docker-rootless dockerd-rootless.sh[40009]: + net=
docker-rootless dockerd-rootless.sh[40009]: + mtu=
docker-rootless dockerd-rootless.sh[40009]: + [ -z ]
docker-rootless dockerd-rootless.sh[40009]: + which slirp4netns
docker-rootless dockerd-rootless.sh[40013]: + slirp4netns --help
docker-rootless dockerd-rootless.sh[40014]: + grep -qw -- --netns-type
docker-rootless dockerd-rootless.sh[40009]: + net=slirp4netns
docker-rootless dockerd-rootless.sh[40009]: + [ -z ]
docker-rootless dockerd-rootless.sh[40009]: + mtu=65520
docker-rootless dockerd-rootless.sh[40009]: + [ -z slirp4netns ]
docker-rootless dockerd-rootless.sh[40009]: + [ -z 65520 ]
docker-rootless dockerd-rootless.sh[40009]: + [ -z ]
docker-rootless dockerd-rootless.sh[40009]: + _DOCKERD_ROOTLESS_CHILD=1
docker-rootless dockerd-rootless.sh[40009]: + export _DOCKERD_ROOTLESS_CHILD
docker-rootless dockerd-rootless.sh[40009]: + exec rootlesskit --net=slirp4netns --mtu=65520 --slirp4netns-sandbox=auto --slirp4netns-seccomp=auto --disable-host-loopback --port-driver=builtin --copy-up=/etc --copy-up=/run --propagation=rslave /usr/share/docker.io/contrib/dockerd-rootless.sh
docker-rootless dockerd-rootless.sh[40046]: + [ -w /run/user/1000 ]
docker-rootless dockerd-rootless.sh[40046]: + [ -w /home/user.name ]
docker-rootless dockerd-rootless.sh[40046]: + rootlesskit=
docker-rootless dockerd-rootless.sh[40046]: + which docker-rootlesskit
docker-rootless dockerd-rootless.sh[40046]: + which rootlesskit
docker-rootless dockerd-rootless.sh[40046]: + rootlesskit=rootlesskit
docker-rootless dockerd-rootless.sh[40046]: + break
docker-rootless dockerd-rootless.sh[40046]: + [ -z rootlesskit ]
docker-rootless dockerd-rootless.sh[40046]: + :
docker-rootless dockerd-rootless.sh[40046]: + :
docker-rootless dockerd-rootless.sh[40046]: + : builtin
docker-rootless dockerd-rootless.sh[40046]: + : auto
docker-rootless dockerd-rootless.sh[40046]: + : auto
docker-rootless dockerd-rootless.sh[40046]: + net=
docker-rootless dockerd-rootless.sh[40046]: + mtu=
docker-rootless dockerd-rootless.sh[40046]: + [ -z ]
docker-rootless dockerd-rootless.sh[40046]: + which slirp4netns
docker-rootless dockerd-rootless.sh[40051]: + slirp4netns --help
docker-rootless dockerd-rootless.sh[40052]: + grep -qw -- --netns-type
docker-rootless dockerd-rootless.sh[40046]: + net=slirp4netns
docker-rootless dockerd-rootless.sh[40046]: + [ -z ]
docker-rootless dockerd-rootless.sh[40046]: + mtu=65520
docker-rootless dockerd-rootless.sh[40046]: + [ -z slirp4netns ]
docker-rootless dockerd-rootless.sh[40046]: + [ -z 65520 ]
docker-rootless dockerd-rootless.sh[40046]: + [ -z 1 ]
docker-rootless dockerd-rootless.sh[40046]: + [ 1 = 1 ]
docker-rootless dockerd-rootless.sh[40046]: + rm -f /run/docker /run/containerd /run/xtables.lock
docker-rootless dockerd-rootless.sh[40046]: + exec dockerd
docker-rootless dockerd-rootless.sh[40046]: time="2023-02-24T09:16:57.816517864+01:00" level=info msg="Starting up"
docker-rootless dockerd-rootless.sh[40046]: time="2023-02-24T09:16:57.816598983+01:00" level=warning msg="Running in rootless mode. This mode has feature limitations."
docker-rootless dockerd-rootless.sh[40046]: time="2023-02-24T09:16:57.816612443+01:00" level=info msg="Running with RootlessKit integration"
docker-rootless dockerd-rootless.sh[40046]: time="2023-02-24T09:16:57.818372244+01:00" level=info msg="libcontainerd: started new containerd process" pid=40059
docker-rootless dockerd-rootless.sh[40046]: time="2023-02-24T09:16:57.818424686+01:00" level=info msg="parsed scheme: \"unix\"" module=grpc
docker-rootless dockerd-rootless.sh[40046]: time="2023-02-24T09:16:57.818438345+01:00" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
docker-rootless dockerd-rootless.sh[40046]: time="2023-02-24T09:16:57.818471355+01:00" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///run/user/1000/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}" module=grpc
docker-rootless dockerd-rootless.sh[40046]: time="2023-02-24T09:16:57.818492767+01:00" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.835783197+01:00" level=info msg="starting containerd" revision="1.4.13~ds1-1~deb11u2" version="1.4.13~ds1"
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.867273015+01:00" level=info msg="loading plugin \"io.containerd.content.v1.content\"..." type=io.containerd.content.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.867348734+01:00" level=info msg="loading plugin \"io.containerd.snapshotter.v1.aufs\"..." type=io.containerd.snapshotter.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.872877677+01:00" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.aufs\"..." error="aufs is not supported (modprobe aufs failed: exit status 1 \"modprobe: FATAL: Module aufs not found in directory /lib/modules/5.10.0-18-amd64\\n\"): skip plugin" type=io.containerd.snapshotter.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.872921880+01:00" level=info msg="loading plugin \"io.containerd.snapshotter.v1.btrfs\"..." type=io.containerd.snapshotter.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.873272967+01:00" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.btrfs\"..." error="path /home/user.name/.local/share/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs (ext4) must be a btrfs filesystem to be used with the btrfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.873298403+01:00" level=info msg="loading plugin \"io.containerd.snapshotter.v1.devmapper\"..." type=io.containerd.snapshotter.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.873342319+01:00" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.devmapper" error="devmapper not configured"
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.873360434+01:00" level=info msg="loading plugin \"io.containerd.snapshotter.v1.native\"..." type=io.containerd.snapshotter.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.873395765+01:00" level=info msg="loading plugin \"io.containerd.snapshotter.v1.overlayfs\"..." type=io.containerd.snapshotter.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.874544050+01:00" level=info msg="loading plugin \"io.containerd.snapshotter.v1.zfs\"..." type=io.containerd.snapshotter.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.874874542+01:00" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.zfs\"..." error="path /home/user.name/.local/share/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.874916554+01:00" level=info msg="loading plugin \"io.containerd.metadata.v1.bolt\"..." type=io.containerd.metadata.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.874946084+01:00" level=warning msg="could not use snapshotter devmapper in metadata plugin" error="devmapper not configured"
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.874962149+01:00" level=info msg="metadata content store policy set" policy=shared
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.875081236+01:00" level=info msg="loading plugin \"io.containerd.differ.v1.walking\"..." type=io.containerd.differ.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.875106937+01:00" level=info msg="loading plugin \"io.containerd.gc.v1.scheduler\"..." type=io.containerd.gc.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.875180310+01:00" level=info msg="loading plugin \"io.containerd.service.v1.introspection-service\"..." type=io.containerd.service.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.875308277+01:00" level=info msg="loading plugin \"io.containerd.service.v1.containers-service\"..." type=io.containerd.service.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.875331487+01:00" level=info msg="loading plugin \"io.containerd.service.v1.content-service\"..." type=io.containerd.service.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.875348777+01:00" level=info msg="loading plugin \"io.containerd.service.v1.diff-service\"..." type=io.containerd.service.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.875366278+01:00" level=info msg="loading plugin \"io.containerd.service.v1.images-service\"..." type=io.containerd.service.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.875383697+01:00" level=info msg="loading plugin \"io.containerd.service.v1.leases-service\"..." type=io.containerd.service.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.875401612+01:00" level=info msg="loading plugin \"io.containerd.service.v1.namespaces-service\"..." type=io.containerd.service.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.875417864+01:00" level=info msg="loading plugin \"io.containerd.service.v1.snapshots-service\"..." type=io.containerd.service.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.875434472+01:00" level=info msg="loading plugin \"io.containerd.runtime.v1.linux\"..." type=io.containerd.runtime.v1
docker-rootless dockerd-rootless.sh[40059]: time="2023-02-24T09:16:57.875517291+01:00" level=info msg="loading plugin \"io.containerd.runtime.v2.task\"..." type=io.containerd.runtime.v2
docker-rootless dockerd-rootless.sh[40046]: time="2023-02-24T09:16:58.818880199+01:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/user/1000/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = \"transport: error while dialing: dial unix:///run/user/1000/docker/containerd/containerd.sock: timeout\". Reconnecting..." module=grpc
docker-rootless dockerd-rootless.sh[40046]: time="2023-02-24T09:17:01.465932794+01:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/user/1000/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = \"transport: error while dialing: dial unix:///run/user/1000/docker/containerd/containerd.sock: timeout\". Reconnecting..." module=grpc
docker-rootless dockerd-rootless.sh[40046]: time="2023-02-24T09:17:05.717234361+01:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/user/1000/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = \"transport: error while dialing: dial unix:///run/user/1000/docker/containerd/containerd.sock: timeout\". Reconnecting..." module=grpc
docker-rootless dockerd-rootless.sh[40046]: time="2023-02-24T09:17:10.749243432+01:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {unix:///run/user/1000/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = \"transport: error while dialing: dial unix:///run/user/1000/docker/containerd/containerd.sock: timeout\". Reconnecting..." module=grpc
docker-rootless dockerd-rootless.sh[40046]: failed to start containerd: timeout waiting for containerd to start
docker-rootless dockerd-rootless.sh[40022]: [rootlesskit:child ] error: command [/usr/share/docker.io/contrib/dockerd-rootless.sh] exited: exit status 1
docker-rootless dockerd-rootless.sh[40009]: [rootlesskit:parent] error: child exited: exit status 1

When this happens

It happens only if containers are running when the Docker daemon is stopped. The journald log clearly shows that the containerd-shim and fuse-overlayfs processes are killed with SIGKILL when the daemon stops. As a result, the container directories under /run/user/1000/docker/ are not cleaned up:

docker-rootless systemd[552]: Stopping Docker Application Container Engine (Rootless)...
docker-rootless dockerd-rootless.sh[37069]: time="2023-02-24T08:38:01.661787553+01:00" level=info msg="Processing signal 'terminated'"
docker-rootless dockerd-rootless.sh[37069]: time="2023-02-24T08:38:01.663588744+01:00" level=info msg="Daemon shutdown complete"
docker-rootless dockerd-rootless.sh[37069]: time="2023-02-24T08:38:01.663612700+01:00" level=info msg="stopping healthcheck following graceful shutdown" module=libcontainerd
docker-rootless dockerd-rootless.sh[37069]: time="2023-02-24T08:38:01.663854464+01:00" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=plugins.moby
docker-rootless dockerd-rootless.sh[37069]: time="2023-02-24T08:38:01.663948934+01:00" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=moby
docker-rootless systemd[552]: docker.service: Killing process 37692 (fuse-overlayfs) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37700 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37773 (fuse-overlayfs) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37782 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37702 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37703 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37704 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37705 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37706 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37707 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37708 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37747 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37756 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37783 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37784 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37785 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37786 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37787 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37788 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37789 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37790 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37792 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37847 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Killing process 37979 (containerd-shim) with signal SIGKILL.
docker-rootless systemd[552]: docker.service: Succeeded.
docker-rootless systemd[552]: Stopped Docker Application Container Engine (Rootless).

The systemd docker.service unit is generated by the dockerd-rootless-setuptool.sh script shipped with the Docker installation package:

[Unit]
Description=Docker Application Container Engine (Rootless)
Documentation=https://docs.docker.com/engine/security/rootless/

[Service]
Environment=PATH=/usr/share/docker.io/contrib:/sbin:/usr/sbin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/usr/share/docker.io/contrib/
ExecStart=/usr/share/docker.io/contrib/dockerd-rootless.sh 
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=180
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
Type=simple
KillMode=mixed

[Install]
WantedBy=default.target

Let's assume I have a running container with the ID 3a9723c5b2421bc661a9df3a85fc4003e1bbb20cfd1a57616632dceb4b0e5cc7. After stopping the Docker daemon, the following directories still exist (with content inside):

find /run/user/1000/docker -type d -name 3a9723c5b2421bc661a9df3a85fc4003e1bbb20cfd1a57616632dceb4b0e5cc7
/run/user/1000/docker/runtime-runc/moby/3a9723c5b2421bc661a9df3a85fc4003e1bbb20cfd1a57616632dceb4b0e5cc7
/run/user/1000/docker/containerd/3a9723c5b2421bc661a9df3a85fc4003e1bbb20cfd1a57616632dceb4b0e5cc7
/run/user/1000/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/3a9723c5b2421bc661a9df3a85fc4003e1bbb20cfd1a57616632dceb4b0e5cc7
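
The find call above can be wrapped in a small helper to check for leftovers after a daemon stop. This is only a sketch: list_leftover_dirs is a hypothetical name, and the state root and container ID are passed in as arguments rather than hardcoded. It only lists directories and deletes nothing.

```shell
# Sketch: list directories left behind for one container ID.
# list_leftover_dirs is a hypothetical helper, not part of Docker or RootlessKit.
# $1: state root (for example /run/user/1000/docker)
# $2: full container ID
list_leftover_dirs() {
    state_root="$1"
    container_id="$2"
    if [ -z "$state_root" ] || [ -z "$container_id" ]; then
        echo "usage: list_leftover_dirs STATE_ROOT CONTAINER_ID" >&2
        return 2
    fi
    # Same search as the find command above, sorted for stable output.
    find "$state_root" -type d -name "$container_id" | sort
}
```

Calling it as list_leftover_dirs /run/user/1000/docker 3a9723c5b2421bc661a9df3a85fc4003e1bbb20cfd1a57616632dceb4b0e5cc7 should print the same three paths as above while the stale state is present, and nothing once it has been cleaned up.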

Current workaround

My workaround is a systemd ExecStop command that stops all running containers before the processes started by docker.service are stopped. But this isn't a real solution IMHO, especially if you want to use live-restore.
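
A workaround of this kind could look roughly like the sketch below. It is not the exact command used here: stop_running_containers is a hypothetical helper name, and it assumes the docker CLI is on PATH and already points at the rootless daemon socket (for example via DOCKER_HOST).

```shell
# Sketch of a pre-stop hook: stop every running container before the daemon exits.
# stop_running_containers is a hypothetical helper name.
# Assumption: the docker CLI talks to the rootless daemon (e.g. via DOCKER_HOST).
stop_running_containers() {
    # "docker ps -q" prints only the IDs of running containers.
    ids=$(docker ps -q) || return 1
    if [ -z "$ids" ]; then
        return 0
    fi
    # Word splitting on $ids is intentional: one argument per container ID.
    docker stop $ids
}
```

A script that defines and calls this function could be referenced from an ExecStop= line in the unit, since systemd runs ExecStop= commands before it kills the remaining processes of the service. As noted, this stops the containers, so it does not combine with live-restore.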

How it behaves when live-restore is enabled, without stopping the containers beforehand

If I enable live-restore and stop docker.service, the application process keeps running, but its parent containerd-shim-runc-v2 process gets killed:

/usr/bin/containerd-shim-runc-v2 -namespace moby -id 3a9723c5b2421bc661a9df3a85fc4003e1bbb20cfd1a57616632dceb4b0e5cc7 -address /run/user/1000/docker/containerd/containerd.sock

Because the next start would fail, I have to delete the directories mentioned above before I can start the Docker daemon again. This leaves the container in a stopped state although the container process is still running. If I then start the container with docker start, a second process is started, which is a no-go IMHO.

What I would expect

I would expect stopping the Docker daemon to perform whatever steps are necessary so that the daemon can be started again afterwards without any further workaround.

hinshun commented 1 year ago

Pretty sure this is because you have KillMode=mixed. Use KillMode=process to avoid killing its child processes (like containerd-shim).
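
One way to try this suggestion without editing the generated unit is a systemd drop-in. The sketch below is an assumption-laden illustration, not a confirmed fix: write_killmode_override is a hypothetical helper, the file name override.conf is arbitrary, and the default directory assumes the user unit lives under ~/.config/systemd/user.

```shell
# Sketch: write a drop-in that overrides KillMode for the user-level docker.service.
# write_killmode_override is a hypothetical helper; override.conf is an arbitrary name.
# $1 (optional): user unit directory, defaults to ~/.config/systemd/user
write_killmode_override() {
    unit_dir="${1:-$HOME/.config/systemd/user}"
    dropin_dir="$unit_dir/docker.service.d"
    mkdir -p "$dropin_dir" || return 1
    # Only KillMode is overridden; every other setting of the unit stays as generated.
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
KillMode=process
EOF
    # Print the path of the file that was written.
    printf '%s\n' "$dropin_dir/override.conf"
}
```

After calling write_killmode_override, systemctl --user daemon-reload followed by a restart of docker.service would be needed for the override to take effect. Whether this resolves the restart failure described above is not verified in this thread.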