Open stuart-c-moore opened 6 years ago
Testing on Debian 9 on DigitalOcean, I was not able to get Docker installed with the given scripts. Please let me know if your scripts are still working and I will test again.
I'm also seeing this with exactly the same installation steps.
I think it may be a race condition between docker.socket and docker.service, but I'm not getting enough logs to confirm this.
See these logs:
Aug 29 21:53:18 dev-jira-84dj systemd[1]: Starting Docker Application Container Engine...
Aug 29 21:53:18 dev-jira-84dj dockerd[29285]: Failed to load listeners: no sockets found via socket activation: make sure the service was started by systemd
Aug 29 21:53:18 dev-jira-84dj systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Aug 29 21:53:18 dev-jira-84dj systemd[1]: Failed to start Docker Application Container Engine.
Aug 29 21:53:18 dev-jira-84dj systemd[1]: docker.service: Unit entered failed state.
I can't seem to get any more verbosity out of it despite googling.
It's definitely a race condition, but I'm seeing it consistently. @mogthesprog I think you're on the right track there. Check out this snippet from /var/lib/dpkg/info/docker-ce.postinst:
# Automatically added by dh_systemd_enable
# This will only remove masks created by d-s-h on package removal.
deb-systemd-helper unmask docker.service >/dev/null || true
# was-enabled defaults to true, so new installations run enable.
if deb-systemd-helper --quiet was-enabled docker.service; then
# Enables the unit on first installation, creates new
# symlinks on upgrades if the unit file has changed.
deb-systemd-helper enable docker.service >/dev/null || true
else
# Update the statefile to add new symlinks (if any), which need to be
# cleaned up on purge. Also remove old symlinks.
deb-systemd-helper update-state docker.service >/dev/null || true
fi
# End automatically added section
# Automatically added by dh_systemd_enable
# This will only remove masks created by d-s-h on package removal.
deb-systemd-helper unmask docker.socket >/dev/null || true
# was-enabled defaults to true, so new installations run enable.
if deb-systemd-helper --quiet was-enabled docker.socket; then
# Enables the unit on first installation, creates new
# symlinks on upgrades if the unit file has changed.
deb-systemd-helper enable docker.socket >/dev/null || true
else
# Update the statefile to add new symlinks (if any), which need to be
# cleaned up on purge. Also remove old symlinks.
deb-systemd-helper update-state docker.socket >/dev/null || true
fi
# End automatically added section
docker.service declares a dependency on docker.socket, but apparently that's ignored if docker.socket isn't enabled. By running journalctl (to view the entire system log with all services interleaved), I can see that docker.service does indeed try to start up before docker.socket. It fails, and by then docker.socket has started, and then docker.service is able to start successfully on the next try.
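For anyone who wants to check the same thing on their own machine, here is a rough sketch of the inspection. The function name show_docker_unit_ordering is made up for illustration; it only wraps standard systemctl and journalctl invocations, and the output will differ between systems.

```shell
# Sketch: inspect how docker.service and docker.socket relate and in
# what order they logged events. The function name is hypothetical.
show_docker_unit_ordering() {
    # Dependency and ordering properties declared by docker.service.
    systemctl show docker.service --property=Requires --property=After

    # Whether each unit is enabled; is-enabled exits non-zero for a
    # unit that is not enabled, so ignore the exit status here.
    systemctl is-enabled docker.socket docker.service || true

    # Log entries for both units since the current boot, interleaved
    # in time order.
    journalctl --boot --no-pager --unit=docker.socket --unit=docker.service
}
```

Run show_docker_unit_ordering as root right after a failed install and compare the timestamps of the socket and service entries.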
As @migs pointed out, the apt-get install fails, which means that whatever script is invoking it also fails. In my case, that's Chef, and I get a failed chef run. :(
It seems almost as if this is a bug in systemd (for allowing docker.service to start even though it Requires something that's not enabled) or dh_systemd_enable (for ignoring dependencies and ordering).
Workaround: "mask" docker.service before installing docker-ce and unmask it after. Masking is described in the third bullet point here.
In chef, it looks like this:
systemd_unit 'docker.service' do
  action :mask
end
docker_installation_package 'default' do
  action :create
end
systemd_unit 'docker.service' do
  action :unmask
end
docker_service_manager_systemd 'default' do
  action [:start]
end
I experience the same with Debian 10 on GCP (the ticket is about Debian 9).
Thanks @lexelby for the workaround using "mask". Works for me.
To install docker on Debian 10 on GCP I use:
systemctl mask docker.service
apt-get install -qy docker-ce
systemctl unmask docker.service
systemctl enable --now docker.service
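If the install step itself can fail (for example under set -e), the unit would stay masked. A possible variant of the same four commands, wrapped in a function so the unmask always runs, might look like the sketch below. The function name install_docker_ce_masked and the install_status variable are made up for illustration, and this has not been checked against every Debian release.

```shell
# Sketch of the mask/unmask workaround as a reusable function.
# The function and variable names are hypothetical.
install_docker_ce_masked() {
    # Mask docker.service so it cannot be started during installation.
    systemctl mask docker.service || return 1

    # Install the package, but remember the exit status instead of
    # aborting, so the unit is always unmasked again afterwards.
    install_status=0
    apt-get install -qy docker-ce || install_status=$?

    systemctl unmask docker.service || return 1

    # Report an installation failure to the caller.
    [ "$install_status" -eq 0 ] || return "$install_status"

    # Enable and start Docker once the package scripts have finished.
    systemctl enable --now docker.service
}
```

Call install_docker_ce_masked as root from the provisioning script; it returns non-zero if any step fails.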
Expected behavior
Docker starts immediately after installation
Actual behavior
Docker fails immediately after installation on Debian 9 on Google Cloud Platform. Docker is then restarted by systemd, and then works fine.
Unfortunately, this means that if the installation of the docker-ce package is run with a script containing set -e, that script will fail.
Steps to reproduce the behavior
The virtual machine is destroyed and rebuilt to provide clean results.
After the startup-script is run (contents shown below), the following is seen in sudo journalctl -u docker.service:
This is the output of /var/log/syslog:
Output of docker version:
Output of docker info:
The full contents of the startup script used for the VM: