canonical / docker-snap

https://snapcraft.io/docker
MIT License

Dockerd shuts down containerd before all containers are down #76

Open paris-kaman-grainger opened 2 years ago

paris-kaman-grainger commented 2 years ago

Hello,

Our team noticed that when the docker snap refreshes, it sends a signal to all containers to shut down and then shuts down containerd. After a few seconds of waiting for the containers to shut down fully, dockerd attempts to force-kill the remaining containers. At this point containerd is already shut down, so there is no socket left open to communicate with the running containers. This leaves dangling processes that seem to prevent dockerd from starting up successfully after the refresh. We happen to be observing this in our solution with a Chromium container that waits for the X-server container to shut down first; however, we believe this issue would likely apply to any container that refuses to shut down.

tianon commented 2 years ago

:grimacing:

I think this is probably an unfortunate quirk of the Snap being essentially a container itself, but perhaps there's something clever that could be done here during upgrades?

(Docker itself doesn't support live-upgrades between different versions, which is what would be necessary for this to work in the general case, I think :see_no_evil:)

paris-kaman-grainger commented 2 years ago

It seems like the main thing required with the current setup is for the refresh to wait for all containers to shut down fully before stopping the dockerd service. Perhaps the pre-refresh hook could attempt to stop the containers and only let the system move on to stopping the service once they have all shut down.
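A rough, untested sketch of what such a pre-refresh hook could look like is below; the exact path of the docker client inside the snap, and whether the daemon and its socket are still reachable from the hook environment, are assumptions on my part:

#!/bin/sh
# Hypothetical pre-refresh hook: ask every running container to stop and
# wait for them all to exit before snapd stops the dockerd service.
set -e

# Assumed location of the docker client inside the snap.
DOCKER="$SNAP/bin/docker"

RUNNING="$($DOCKER ps -q)"
if [ -n "$RUNNING" ]; then
    # docker stop blocks until each container has exited, or until the
    # grace period (here 120s) expires and it is killed.
    $DOCKER stop --time 120 $RUNNING
fi

# Belt and braces: make sure nothing is left running before handing
# control back to snapd.
while [ -n "$($DOCKER ps -q)" ]; do
    sleep 1
done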

alexfisher commented 2 years ago

Same issue here. Trying to figure out if there's a way to at least increase the timeout so dockerd doesn't kill containers after 10 seconds.
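For what it's worth, dockerd itself has a shutdown-timeout daemon option (how long the daemon waits for containers while it is shutting down; 15 seconds by default), and docker run has a per-container --stop-timeout. Something like the following might help, assuming the snap reads its daemon config from /var/snap/docker/current/config/daemon.json and honours the setting during a refresh (both assumptions on my part), and noting that snapd applies its own stop timeout to the service, which may still cut this short:

# e.g. in /var/snap/docker/current/config/daemon.json (merge with any existing keys)
{
    "shutdown-timeout": 120
}

$ sudo snap restart docker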

xnox commented 1 year ago

@paris-kaman-grainger

In the docker .deb packaging the systemd unit has the following settings:

[Service]
TimeoutSec=0
RestartSec=2
Restart=always

# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3

# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s

# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity

# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity

# kill only the docker process, not all processes in the cgroup
KillMode=process
OOMScoreAdjust=-500

Meaning, as far as I understand it, all containers stay up upon upgrades of docker.deb whilst the daemon is restarted.

That is not the case with the snap.docker.dockerd.service unit. I don't think these settings are possible to declare in snapcraft.yaml. I wonder if this issue can be resolved with a local / system-level customization, for example if you have ownership of the devices and can execute the commands below on them, or by integrating it into the install-device hook of your custom gadget:

$ sudo mkdir -p /etc/systemd/system/snap.docker.dockerd.service.d/
$ cat <<EOF | sudo tee /etc/systemd/system/snap.docker.dockerd.service.d/override.conf
# https://github.com/docker-snap/docker-snap/issues/76
[Service]
TimeoutSec=0
RestartSec=2
Restart=always

# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3

# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s

# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity

# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity

# kill only the docker process, not all processes in the cgroup
KillMode=process
OOMScoreAdjust=-500
EOF

Then reboot the device. From then on, a snap refresh of docker should keep the containers up in a consistent manner. One can then experiment with refreshing the docker snap to particular revisions using snap refresh --revision.
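For example, after the reboot one can confirm the drop-in took effect and then exercise a refresh (the revision below is a placeholder):

$ systemctl show -p KillMode -p TimeoutStopSec snap.docker.dockerd.service
$ snap list docker --all
$ sudo snap refresh docker --revision=<revision>
$ docker ps        # containers should still be running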

PeterSurda commented 1 year ago

@xnox It doesn't look like your suggestion is working (Ubuntu 22.04). I tried it both on x86_64 and aarch64.

PeterSurda commented 11 months ago

I found a workaround; it may not help everyone, as it still requires some glue, but for me it is sufficient. It's described here: https://unix.stackexchange.com/questions/723648/systemd-make-service-a-to-start-b

TL;DR: you create a new systemd service that brings up the necessary containers, make it WantedBy=snap.docker.dockerd.service, and enable that service. Then whenever the snap's dockerd starts or restarts, this service is started as well. Since I manage my containers with ansible-pull, I just have the new service run ansible-pull too.
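A minimal sketch of such a unit is below; the unit name, script path, and ExecStart command are placeholders, and only the WantedBy wiring comes from the approach described above:

# /etc/systemd/system/start-my-containers.service  (hypothetical name)
[Unit]
Description=Bring up application containers whenever snap dockerd (re)starts
After=snap.docker.dockerd.service

[Service]
Type=oneshot
# Replace with whatever brings your containers up
# (docker compose up -d, ansible-pull, a shell script, ...).
ExecStart=/usr/local/bin/start-containers.sh

[Install]
WantedBy=snap.docker.dockerd.service

$ sudo systemctl daemon-reload
$ sudo systemctl enable start-my-containers.service

Enabling it creates the symlink under snap.docker.dockerd.service.wants/, so every (re)start of the snap's dockerd pulls this unit in and re-runs it.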