docker / for-linux

Docker Engine for Linux
https://docs.docker.com/engine/installation/
757 stars 86 forks source link

Docker fails immediately after installation on Debian 9 on Google Cloud Platform, but runs fine on subsequent starts #351

Open stuart-c-moore opened 6 years ago

stuart-c-moore commented 6 years ago

Expected behavior

Docker starts immediately after installation

Actual behavior

Docker fails immediately after installation on Debian 9 on Google Cloud Platform. Docker is then restarted by systemd, and then works fine.

Unfortunately, this means that if the installation of the docker-ce package is run with a script containing set -e, that script will fail.

Steps to reproduce the behavior

The virtual machine is destroyed and rebuilt to provide clean results.

After the startup-script is run (contents shown below), the following is seen in sudo journalctl -u docker.service:

$ sudo journalctl -u docker.service
-- Logs begin at Thu 2018-06-28 15:20:48 UTC, end at Thu 2018-06-28 15:55:14 UTC. --
Jun 28 15:22:19 bastion-eu-be systemd[1]: Starting Docker Application Container Engine...
Jun 28 15:22:19 bastion-eu-be dockerd[23487]: no sockets found via socket activation: make sure the service was started by systemd
Jun 28 15:22:19 bastion-eu-be systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Jun 28 15:22:19 bastion-eu-be systemd[1]: Failed to start Docker Application Container Engine.
Jun 28 15:22:19 bastion-eu-be systemd[1]: docker.service: Unit entered failed state.
Jun 28 15:22:19 bastion-eu-be systemd[1]: docker.service: Failed with result 'exit-code'.
Jun 28 15:22:20 bastion-eu-be systemd[1]: docker.service: Service hold-off time over, scheduling restart.
Jun 28 15:22:20 bastion-eu-be systemd[1]: Stopped Docker Application Container Engine.
Jun 28 15:22:20 bastion-eu-be systemd[1]: Starting Docker Application Container Engine...
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20.068094024Z" level=info msg="libcontainerd: started new docker-containerd process" pid=23523
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="starting containerd" module=containerd revision=773c489c9c1b21a6d78b5c538cd395416ec50f88 version=v1.0.3
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.content.v1.content"..." module=containerd type=io.containerd.content.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.btrfs"..." module=containerd type=io.containerd.snapshotter.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.btrfs" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter" module=containerd
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.overlayfs"..." module=containerd type=io.containerd.snapshotter.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.metadata.v1.bolt"..." module=containerd type=io.containerd.metadata.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=warning msg="could not use snapshotter btrfs in metadata plugin" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter" module="containerd/io.conta
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.differ.v1.walking"..." module=containerd type=io.containerd.differ.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.gc.v1.scheduler"..." module=containerd type=io.containerd.gc.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.grpc.v1.containers"..." module=containerd type=io.containerd.grpc.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.grpc.v1.content"..." module=containerd type=io.containerd.grpc.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.grpc.v1.diff"..." module=containerd type=io.containerd.grpc.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.grpc.v1.events"..." module=containerd type=io.containerd.grpc.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.grpc.v1.healthcheck"..." module=containerd type=io.containerd.grpc.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.grpc.v1.images"..." module=containerd type=io.containerd.grpc.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.grpc.v1.leases"..." module=containerd type=io.containerd.grpc.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.grpc.v1.namespaces"..." module=containerd type=io.containerd.grpc.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.grpc.v1.snapshots"..." module=containerd type=io.containerd.grpc.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.monitor.v1.cgroups"..." module=containerd type=io.containerd.monitor.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.runtime.v1.linux"..." module=containerd type=io.containerd.runtime.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.grpc.v1.tasks"..." module=containerd type=io.containerd.grpc.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.grpc.v1.version"..." module=containerd type=io.containerd.grpc.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="loading plugin "io.containerd.grpc.v1.introspection"..." module=containerd type=io.containerd.grpc.v1
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg=serving... address="/var/run/docker/containerd/docker-containerd-debug.sock" module="containerd/debug"
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg=serving... address="/var/run/docker/containerd/docker-containerd.sock" module="containerd/grpc"
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20Z" level=info msg="containerd successfully booted in 0.029441s" module=containerd
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20.247999622Z" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20.248293931Z" level=warning msg="Your kernel does not support swap memory limit"
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20.248464621Z" level=warning msg="Your kernel does not support cgroup rt period"
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20.248479032Z" level=warning msg="Your kernel does not support cgroup rt runtime"
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20.250999585Z" level=info msg="Loading containers: start."
Jun 28 15:22:20 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:20.892057907Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Jun 28 15:22:21 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:21.012681272Z" level=info msg="Loading containers: done."
Jun 28 15:22:21 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:21.032421511Z" level=info msg="Docker daemon" commit=9ee9f40 graphdriver(s)=overlay2 version=18.03.1-ce
Jun 28 15:22:21 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:21.035457354Z" level=info msg="Daemon has completed initialization"
Jun 28 15:22:21 bastion-eu-be systemd[1]: Started Docker Application Container Engine.
Jun 28 15:22:21 bastion-eu-be dockerd[23519]: time="2018-06-28T15:22:21.074759070Z" level=info msg="API listen on /var/run/docker.sock"
Jun 28 15:26:29 bastion-eu-be dockerd[23519]: time="2018-06-28T15:26:29Z" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/55d07bbf9c81594aeb852b7dd87666a105980205bcd09175d250625ba6e583e2/shim.sock" debug=false module="containerd/tasks" pid=27064
Jun 28 15:26:29 bastion-eu-be dockerd[23519]: time="2018-06-28T15:26:29Z" level=info msg="shim reaped" id=55d07bbf9c81594aeb852b7dd87666a105980205bcd09175d250625ba6e583e2 module="containerd/tasks"
Jun 28 15:26:29 bastion-eu-be dockerd[23519]: time="2018-06-28T15:26:29.357887759Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"

This is the output of /var/log/syslog:

Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script: Setting up docker-ce (18.03.1~ce-0~debian) ...
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script: Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /lib/systemd/system/docker.service.
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script: Created symlink /etc/systemd/system/sockets.target.wants/docker.socket → /lib/systemd/system/docker.socket.
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script: Job for docker.service failed because the control process exited with error code.
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script: See "systemctl status docker.service" and "journalctl -xe" for details.
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script: invoke-rc.d: initscript docker, action "start" failed.
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script: ● docker.service - Docker Application Container Engine
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script:    Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script:    Active: activating (auto-restart) (Result: exit-code) since Thu 2018-06-28 15:22:19 UTC; 16ms ago
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script:      Docs: https://docs.docker.com
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script:   Process: 23487 ExecStart=/usr/bin/dockerd -H fd:// (code=exited, status=1/FAILURE)
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script:  Main PID: 23487 (code=exited, status=1/FAILURE)
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script:       CPU: 37ms
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script:
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script: Jun 28 15:22:19 bastion-eu-be systemd[1]: docker.service: Main process exite…URE
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script: Jun 28 15:22:19 bastion-eu-be systemd[1]: Failed to start Docker Application…ne.
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script: Jun 28 15:22:19 bastion-eu-be systemd[1]: docker.service: Unit entered faile…te.
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script: Jun 28 15:22:19 bastion-eu-be systemd[1]: docker.service: Failed with result…e'.
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script: Hint: Some lines were ellipsized, use -l to show in full.
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script: dpkg: error processing package docker-ce (--configure):
Jun 28 15:22:19 bastion-eu-be startup-script: INFO startup-script:  subprocess installed post-installation script returned error exit status 1

Output of docker version:

$ sudo docker version
Client:
 Version:      18.03.1-ce
 API version:  1.37
 Go version:   go1.9.5
 Git commit:   9ee9f40
 Built:        Thu Apr 26 07:17:14 2018
 OS/Arch:      linux/amd64
 Experimental: false
 Orchestrator: swarm

Server:
 Engine:
  Version:      18.03.1-ce
  API version:  1.37 (minimum version 1.12)
  Go version:   go1.9.5
  Git commit:   9ee9f40
  Built:        Thu Apr 26 07:15:24 2018
  OS/Arch:      linux/amd64
  Experimental: false

Output of docker info:

$ sudo docker info
Containers: 1
 Running: 0
 Paused: 0
 Stopped: 1
Images: 1
Server Version: 18.03.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 773c489c9c1b21a6d78b5c538cd395416ec50f88
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.0-6-amd64
Operating System: Debian GNU/Linux 9 (stretch)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 3.617GiB
Name: bastion-eu-be
ID: TJKO:P5ZA:FLMZ:PIGD:MXN4:EWL3:ATFI:V7VQ:S6ZF:MBBR:IE5V:5DJT
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support

The full contents of the startup script used for the VM:

#!/usr/bin/env bash

# Install OS patches
apt update
apt upgrade -y

# Install Docker
# See https://docs.docker.com/install/linux/docker-ce/debian/#install-docker-ce-1
apt install -y \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg2 \
    software-properties-common
curl -fsSL https://download.docker.com/linux/debian/gpg | apt-key add -
add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/debian \
   $$(lsb_release -cs) \
   stable"
apt update
DEBIAN_FRONTEND=noninteractive apt install -y docker-ce

# Install stackdriver tools
curl -O "https://repo.stackdriver.com/stack-install.sh"
bash stack-install.sh --write-gcm
justyntemme commented 6 years ago

Testing on Debian 9 via digital ocean I was not able to get docker installed with the given scripts. Please let me know if you're scripts are still successful I will test again.

mogthesprog commented 6 years ago

I'm also seeing this with exactly the same installation steps.

I think it may be a race condition in between docker.socket and docker.service but i'm not getting enough logs to confirm this.

See these logs:

Aug 29 21:53:18 dev-jira-84dj systemd[1]: Starting Docker Application Container Engine...
Aug 29 21:53:18 dev-jira-84dj dockerd[29285]: Failed to load listeners: no sockets found via socket activation: make sure the service was started by systemd
Aug 29 21:53:18 dev-jira-84dj systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Aug 29 21:53:18 dev-jira-84dj systemd[1]: Failed to start Docker Application Container Engine.
Aug 29 21:53:18 dev-jira-84dj systemd[1]: docker.service: Unit entered failed state.

I can't seem to get any more verbosity out of it despite googling.

lexelby commented 6 years ago

It's definitely a race condition, but I'm seeing it consistently. @mogthesprog I think you're on the right track there. Check out this snippet from /var/lib/dpkg/info/docker-ce.postinst:

# Automatically added by dh_systemd_enable
# This will only remove masks created by d-s-h on package removal.
deb-systemd-helper unmask docker.service >/dev/null || true

# was-enabled defaults to true, so new installations run enable.
if deb-systemd-helper --quiet was-enabled docker.service; then
        # Enables the unit on first installation, creates new
        # symlinks on upgrades if the unit file has changed.
        deb-systemd-helper enable docker.service >/dev/null || true
else
        # Update the statefile to add new symlinks (if any), which need to be
        # cleaned up on purge. Also remove old symlinks.
        deb-systemd-helper update-state docker.service >/dev/null || true
fi
# End automatically added section
# Automatically added by dh_systemd_enable
# This will only remove masks created by d-s-h on package removal.
deb-systemd-helper unmask docker.socket >/dev/null || true

# was-enabled defaults to true, so new installations run enable.
if deb-systemd-helper --quiet was-enabled docker.socket; then
        # Enables the unit on first installation, creates new
        # symlinks on upgrades if the unit file has changed.
        deb-systemd-helper enable docker.socket >/dev/null || true
else
        # Update the statefile to add new symlinks (if any), which need to be
        # cleaned up on purge. Also remove old symlinks.
        deb-systemd-helper update-state docker.socket >/dev/null || true
fi
# End automatically added section

docker.service declares a dependency on docker.socket, but apparently that's ignored if docker.socket isn't enabled. By running journalctl (to view the entire system log with all services interleaved), I can see that docker.service does indeed try to start up before docker.socket. It fails, and by then docker.socket has started, and then docker.service is able to start successfully on the next try.

As @migs pointed out, the apt-get install fails which means that whatever script is invoking it also fails. In my case, that's Chef, and I get a failed chef run. :(

lexelby commented 6 years ago

It seems almost as if this is a bug in systemd (for allowing docker.service to start even though it Requires something that's not enabled) or dh_systemd_enable (for ignoring dependencies and ordering).

lexelby commented 6 years ago

Workaround: "mask" docker.service before installing docker-ce and unmask it after. Masking is described in the third bullet point here.

In chef, it looks like this:

systemd_unit 'docker.service' do
  action :mask
end

docker_installation_package 'default' do
  action :create
end

systemd_unit 'docker.service' do
  action :unmask
end

docker_service_manager_systemd 'default' do
  action [:start]
end
njam commented 2 years ago

I experience the same with Debian 10 on GCP (the ticket is about Debian 9).

njam commented 2 years ago

Thanks @lexelby for the workaround using "mask". Works for me.

To install docker on Debian 10 on GCP I use:

systemctl mask docker.service
apt-get install -qy docker-ce
systemctl unmask docker.service
systemctl enable --now docker.service