projectatomic / container-storage-setup

Service to set up storage for Docker and other container systems
Apache License 2.0

Cloud-Init triggered install/start of docker will always hang #77

Open james-masson opened 9 years ago

james-masson commented 9 years ago

If cloud-init is used to trigger the install of docker, docker startup (and provisioning in general) will just hang. This happens regardless of the actual install/config method, i.e. Puppet, Ansible, etc.

Expanding on that:

The result is that provisioning cannot complete until cloud-init has failed/died - or the box has been rebooted.

Commenting out After=cloud-final.service makes this problem go away.
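A minimal sketch of that workaround, for anyone who would rather not edit the packaged unit in place. systemd drop-ins can add ordering dependencies but cannot remove them, so the sketch filters a full copy of the unit instead. The function name `strip_cloud_final` and the paths in the usage comment are assumptions for illustration, not something shipped by this project.

```shell
# Filter a unit file on stdin and write it to stdout, removing
# cloud-final.service from any After= line. An After= line that
# becomes empty is dropped entirely; all other lines pass through.
strip_cloud_final() {
  awk '
    /^After=/ {
      n = split(substr($0, 7), deps, /[ \t]+/)
      out = ""
      for (i = 1; i <= n; i++)
        if (deps[i] != "" && deps[i] != "cloud-final.service")
          out = out (out == "" ? "" : " ") deps[i]
      if (out != "") print "After=" out
      next
    }
    { print }
  '
}

# Hypothetical usage on a systemd host (both paths are assumptions):
#   strip_cloud_final < /usr/lib/systemd/system/docker-storage-setup.service \
#     > /etc/systemd/system/docker-storage-setup.service
#   systemctl daemon-reload
```

A unit file under /etc/systemd/system takes precedence over the packaged copy with the same name, so the original stays untouched and the override is easy to delete later.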

I understand the thinking behind the requirement - basically, ensure the box has had its storage configured by cloud-init, to be used by docker-storage-setup - but the existing restriction makes more sophisticated hands-off provisioning awkward.

I'm not sure what to suggest as a solution here, sorry!

cgwalters commented 9 years ago

Thanks for the excellent issue report. Unfortunately, the After=cloud-final.service was part of the initial commit, and there's no comment, so we are left to try to retroactively determine its reason for existence.

Offhand...I think we might have been waiting for cloud-init to do the growpart bit. But we do that internally now.

I am thinking it'd be safe to just remove that After=...but let's spend a bit of time to try to consider the repercussions.

larsks commented 8 years ago

The typical way of solving this is to pass the --no-block flag to systemctl, e.g.,

systemctl --no-block start docker

This will enqueue the start request and return immediately. Of course, docker won't start until its dependencies are satisfied, so if you require docker to be running while cloud-init is still processing you're out of luck.

This is tricky if you're using the built-in "service" abstractions in ansible/puppet/etc, which may not have any facility for using the --no-block flag.

akostadinov commented 8 years ago

--no-block can only help if you don't do anything with docker in your scripts. If you want to actually configure or run something, then you are out of luck. Trying now:

runcmd:
...
- [ systemctl, enable, docker.service ]
- [ systemctl, start, docker-storage-setup.service, --ignore-dependencies ]
- [ systemctl, start, docker.service, --ignore-dependencies ]
...

P.S. Tested and works for me. I think docker-storage-setup can be skipped, as it seems useless without setting any configuration, but I am leaving it above for completeness.

P.P.S. @smoser from #cloud-init also suggested a possible workaround: create a systemd service using bootcmd: []. That service would run the things that need to interact with docker after docker has launched (provided you create it properly). And systemd should pick up any service created with a boothook without the need to enable it with a runcmd.
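The suggested workaround could be sketched roughly as follows. The helper `render_post_docker_unit`, the unit name `post-docker.service`, and the example command are all hypothetical; only `After=`, `Requires=`, `Type=oneshot`, and `WantedBy=` are standard systemd settings.

```shell
# Print a oneshot unit that runs the given command (first argument)
# only after docker.service has been started.
render_post_docker_unit() {
  cmd=$1
  cat <<EOF
[Unit]
Description=Run a task after docker has started
Requires=docker.service
After=docker.service

[Service]
Type=oneshot
ExecStart=$cmd

[Install]
WantedBy=multi-user.target
EOF
}

# Hypothetical usage from a bootcmd entry (unit name is an assumption):
#   render_post_docker_unit "/usr/bin/docker pull busybox" \
#     > /etc/systemd/system/post-docker.service
```

Because the unit is ordered after docker.service, the command never runs inside cloud-init itself, so it cannot deadlock on cloud-final.service. This sketch does not rely on systemd starting the unit automatically; depending on the setup it may still need to be enabled or started explicitly.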

HackToday commented 8 years ago

@akostadinov

Regarding the bootcmd: approach you mentioned, does it really help in the case below? In cloud-init we start the docker systemd service, and right now we use --no-block.

But --no-block brings another issue: in cloud-init we also need to run docker run *** to install something (for example, nsenter). The docker service is scheduled to start after cloud-init, but cloud-init also has scripts that want to call the docker service (to start a container), so those scripts fail because docker has not started at that point.

Do you have any good suggestions for that?

akostadinov commented 8 years ago

@HackToday, using --ignore-dependencies worked for me, but tbh I eventually switched to an ansible setup. If you need a machine restart, for example, cloud-init becomes a no-go. E.g. updating the system on an RPM-based system does not support reboot yet.

jlebon commented 7 years ago

Actually, now that docker-storage-setup.service no longer depends on cloud-final.service, this should no longer be an issue (see #161). This works for me on Fedora 25:

runcmd:
  - systemctl start docker
  - docker pull busybox
  - docker run -d busybox sleep 999

If you want to change the default d-s-s configuration, you can still do so in the bootcmd section.
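A rough sketch of what such a bootcmd step might call. The helper name `write_dss_config` is hypothetical, and the path and the `STORAGE_DRIVER=overlay2` value in the usage comment are illustrative assumptions; check the d-s-s documentation for the options your version supports.

```shell
# Write KEY=VALUE overrides, one per line, to the config file given
# as the first argument. Any existing file content is replaced.
write_dss_config() {
  conf=$1
  shift
  # Refuse to write an empty config when no overrides were passed.
  [ "$#" -gt 0 ] || return 1
  printf '%s\n' "$@" > "$conf"
}

# Hypothetical usage from bootcmd (path and option are assumptions):
#   write_dss_config /etc/sysconfig/docker-storage-setup STORAGE_DRIVER=overlay2
```

Doing this in bootcmd rather than runcmd matters because bootcmd runs early in boot, before docker-storage-setup.service would normally read its configuration.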

jlebon commented 7 years ago

Even better, since cloud-final.service has an After on multi-user.target, you don't even have to do systemctl start docker first in the above.
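To check ordering claims like this one on a given host, the output of `systemctl show -p After UNIT` can be inspected. Below is a small sketch; the helper name `after_lists` is hypothetical, and it only parses text, so the actual `systemctl` call lives in the usage comment.

```shell
# Read `systemctl show -p After UNIT` output on stdin, which looks like
# "After=a.service b.target ...", and succeed (exit 0) only if the unit
# named in the first argument appears in that list.
after_lists() {
  dep=$1
  sed -n 's/^After=//p' | tr ' ' '\n' | grep -Fxq -- "$dep"
}

# Hypothetical usage on a systemd host:
#   systemctl show -p After cloud-final.service | after_lists multi-user.target \
#     && echo "cloud-final.service is ordered after multi-user.target"
#   systemctl show -p After docker-storage-setup.service | after_lists cloud-final.service \
#     || echo "docker-storage-setup.service no longer waits for cloud-final.service"
```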