mongrelion / ansible-role-docker

Ansible role for installing Docker
MIT License

Rancher's installation script is broken on Ubuntu 17.10 #23

Closed martinandersson closed 6 years ago

martinandersson commented 6 years ago

I would like to dump an issue over at Rancher's repo instead but they don't accept new issues lol.

Either way.. my requirements.yml:

- src: mongrelion.docker
  version: a7040aac63c8567048a3fbb6437199cca40457aa

playbook.yml:

roles:
  - role: mongrelion.docker
    docker_version: '17.09'
    setup_script_md5_sum: 975145b3eeaf9efc588666bf46265d38
    vagrant: yes

This used to work just fine, but it stopped working after I upgraded my Vagrant boxes to fso/artful64. I strongly suspect Rancher's installation script is broken for Ubuntu 17.10. The Docker installation fails when executing the script. I also ran said installation script manually, with the same result. These are the last 10 or so lines from the console output; please note the very last line in particular:

[...]
+ sh -c apt-key add -
+ curl -fsSl https://download.docker.com/linux/ubuntu/gpg
OK
+ sh -c add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu artful stable"
+ [ ubuntu = debian ]
+ sh -c apt-get update
Hit:1 http://us.archive.ubuntu.com/ubuntu artful InRelease
Hit:2 http://security.ubuntu.com/ubuntu artful-security InRelease
Hit:3 http://us.archive.ubuntu.com/ubuntu artful-updates InRelease
Hit:4 https://download.docker.com/linux/ubuntu artful InRelease
Hit:5 http://us.archive.ubuntu.com/ubuntu artful-backports InRelease
Hit:6 http://ppa.launchpad.net/ansible/ansible/ubuntu artful InRelease
Reading package lists... Done
+ cut -d   -f 4
+ head -n 1
+ grep 17.09.0
+ apt-cache madison docker-ce
+ sh -c apt-get install -y -q docker-ce=
Reading package lists...
Building dependency tree...
Reading state information...
E: Version '' for 'docker-ce' was not found

Executing the "convenience scripts" provided today (2017-11-13) by https://get.docker.com (17.10.0-ce) and https://test.docker.com (17.11.0-ce-rc3, build 5b4af4f) works just fine.

So my current workaround is to point mongrelion.docker's setup_script_url at one of Docker's convenience scripts. I also set setup_script_md5_sum to false since these scripts are expected to change over time lol (thanx for adding that feature!).

mongrelion commented 6 years ago

@marcusianlevine can you take a look at this? this is your doing ;P

marcusianlevine commented 6 years ago

@MartinanderssonDotcom thanks for using our role and raising this issue!

Your workaround is actually the intended design of this role: to avoid maintaining bloated OS-specific installs within the role, we just wrap the convenience script of your choosing.

We chose to use Rancher's scripts by default because they maintain an archive of scripts for specific versions of Docker. By contrast the official Docker convenience scripts that you linked change over time as new versions of Docker are released to the stable and edge channels.

I don't think there's much we can do about this within the scope of this role, but according to the docs the latest version of Rancher (1.6) only officially supports up to Docker 17.06, so it's possible they aren't actively maintaining the 17.09 script yet.

Since there's logic in the role to accommodate this use-case, I think we can close this issue.

martinandersson commented 6 years ago

No, thank you! =)

I feel ya. But at least y'all have to toss in a note about this in the docs. We can't let thousands of ppl pull their hair out for nothing lol.

martinandersson commented 6 years ago

Also.. just as a general comment. I'm not sure I agree with the role being a "wrapper" on top of someone else's script. I mean, sure that's how it is implemented. I get it. BUT.. roles in general are supposed to get the job done, however that happens. Meaning that the role itself should fall back to or employ whatever script slash shell commands are necessary.

Maybe we should rebrand this issue as an enhancement request instead of closing it?

mongrelion commented 6 years ago

I think that @MartinanderssonDotcom makes a valid point when he says that the role should do whatever it takes to get the job done. The problem that we have at hand, though, is the underlying technology that we chose to get that job done, which has this very specific limitation as of right now. The best solution would be to submit a PR to the upstream script for the Rancher folks. Fixing this problem from within the role is just a hack and will probably lead to unclean code.

As per your suggestion, I think it's only fair to the users to toss that note in the docs. @MartinanderssonDotcom could you please submit a PR with whatever you think is best in the docs?

@marcusianlevine or @MartinanderssonDotcom maybe you guys also want to look into the GitHub source of the Rancher script (if there is any?) and submit that PR ;P

marcusianlevine commented 6 years ago

Was digging into this some more, looks like the real issue is not with Rancher's install script but actually upstream in the Docker apt repositories: there is not yet a package published for Ubuntu 17.10 Artful.

That's why the error reads E: Version '' for 'docker-ce' was not found - there simply is no compatible version of the docker-ce package published on the official Docker apt repo.
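A minimal shell sketch of how the version string ends up empty, assuming the same grep/head/cut pipeline shown in the trace. The `pick_version` helper and the sample listing are hypothetical and only illustrate the `apt-cache madison` output format; they are not taken from the Rancher script or from the real repository contents.

```shell
#!/bin/sh
# Pick the first version matching a pattern from `apt-cache madison`-style
# text on stdin, using the same pipeline as in the trace above.
pick_version() {
    grep "$1" | head -n 1 | cut -d ' ' -f 4
}

# Hypothetical listing: no 17.09 build is published for this release.
listing=' docker-ce | 17.10.0~ce-0~ubuntu | https://download.docker.com/linux/ubuntu artful/edge amd64 Packages'

version=$(printf '%s\n' "$listing" | pick_version '17.09.0')

if [ -z "$version" ]; then
    # Without this guard the script would go on to run
    # `apt-get install docker-ce=` with an empty version.
    echo "no docker-ce build matching 17.09.0 found" >&2
else
    echo "would install docker-ce=$version"
fi
```

Because grep matches nothing, the pipeline prints an empty string, and an unguarded install command would then ask apt for version ''.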

@MartinanderssonDotcom don't mean to be dismissive of your issue, but we had this same problem with Ubuntu Zesty a few months ago until Docker released an apt package for it.

@mongrelion unless we intend to build Docker binaries from source within the role I'm not sure there's much we can do about this, other than add a note to the docs that we only support OSs with an official Docker package on apt or yum and to try the official Docker convenience scripts for new releases of Docker or host OS.

On a related note, our meta/main.yml doesn't list Zesty as supported yet, and there is now a Zesty package published, so we could update that. I tried it with Vagrant locally, though, and it still doesn't seem to work.

marcusianlevine commented 6 years ago

Hmm I was trying to figure out how Docker's convenience scripts work.. looks like they use a different apt repo than the Rancher setup scripts, with packages for more recent OS releases...

Still requires an upstream fix to Rancher's setup scripts, but should only affect Ubuntu 17.10, not lower versions...

mongrelion commented 6 years ago

"other than add a note to the docs that we only support OSs with an official Docker package on apt or yum"

Sounds about fair

"tried with Vagrant locally and still doesn't seem to work though"

:(

So correct me if I'm wrong: there is no solution just yet for getting the role to work on Ubuntu 17.10 since there is no official Docker package for it just yet. Correct?

marcusianlevine commented 6 years ago

@mongrelion it's a little more complicated than that

Docker's official apt/yum repos support up through Ubuntu 17.10 Artful, but only offer the latest stable and edge releases of Docker (right now 17.09 and 17.11, respectively). This repo is used by the official Docker convenience scripts, which do work to install the latest versions of Docker on Ubuntu 17.04 Zesty and 17.10 Artful.

So as @martinanderssondotcom said, if you specify the official Docker convenience scripts with setup_script_url, you can install the latest official releases of Docker on Ubuntu Zesty and Artful

However, the Rancher convenience scripts that we use by default pull from a less-frequently-updated apt repo which contains releases for many versions of Docker on all of our currently supported OS distros and versions.

Weirdly, according to the filelist for Zesty, there is only an installer for Docker 17.05, which is why the Zesty install fails with our current default of 17.06

I think the simplest solution that addresses the most use-cases is to switch the default setup script to Docker's official convenience scripts.

This will remove the need to manually upgrade the default version - we could even have docker_version be used to specify stable or edge instead of an absolute version number. Since most people who need a specific version of Docker will know what they are doing, it's not such a big deal to have to look up a Rancher setup script and calculate the checksum.
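Calculating the checksum for a pinned setup script could look roughly like the sketch below. The `md5_of_stdin` helper is a hypothetical name, the URL in the comment is a placeholder rather than a real script location, and `md5sum` is assumed to be available (GNU coreutils).

```shell
#!/bin/sh
# Print only the hex MD5 digest of whatever arrives on stdin.
md5_of_stdin() {
    md5sum | cut -d ' ' -f 1
}

# Against a downloaded script (placeholder URL, substitute the one you pin):
#   curl -fsSL "https://example.invalid/install-docker.sh" | md5_of_stdin
# Local demonstration on a fixed string:
printf 'hello\n' | md5_of_stdin
```

The printed digest is the value that would go into setup_script_md5_sum for that exact script.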

The only downside I see is that we will lose some determinism: we could experience problems with the role related to a botched release of the Docker convenience script or Docker itself. Also, we will not be able to reproduce test results based on the official convenience script, since it changes over time and there is no archive.

However, this could be mitigated in our to-be-written integration tests by using the Rancher scripts on known working OSs and only falling back to the official script if they fail.
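A rough sketch of that fallback order in shell, assuming each installer can be expressed as a single command. `install_with_fallback` is a hypothetical helper, and `false`/`true` stand in for the pinned and official installers; none of this exists in the role.

```shell
#!/bin/sh
# Try the pinned installer first; use the fallback only if it fails.
# Prints which path succeeded and returns non-zero if both fail.
install_with_fallback() {
    primary=$1
    fallback=$2
    if sh -c "$primary"; then
        echo "primary"
    elif sh -c "$fallback"; then
        echo "fallback"
    else
        echo "failed"
        return 1
    fi
}

# Stand-ins: the pinned script fails, the official script succeeds.
install_with_fallback 'false' 'true'
```

Recording which path was taken keeps test results interpretable even though the official script changes over time.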

The ultimate solution would be to implement our own install logic rather than relying on third-party scripts, but as we've discussed previously this would require a significant amount of new work to cover the OS versions and distros we currently support.

To be clear: there is a functioning workaround for this issue, by using the official Docker convenience scripts

mongrelion commented 6 years ago

@marcusianlevine at this point I'm not really sure which way to go. Rewriting the whole thing can be expensive. Not supporting the latest Ubuntu version could also annoy people but so far we only have one complaint.

It's been about a month already. Is this not yet supported by the Rancher scripts? #wishfulthinking

martinandersson commented 6 years ago

Tried to install 17.09, same problem.

But, I also tried the latest and greatest 17.12 and this guy works just fine.

playbook.yml:

roles:
  - role: mongrelion.docker
    docker_version: '17.12'
    setup_script_md5_sum: 9adc4109fa5115bf6f7eef8d3e4ae950
    vagrant: yes

Given what a pain in the ass it is to fix obsolete issues, I guess we can label this one "wontfix" and close it?

marcusianlevine commented 6 years ago

Agreed @martinanderssondotcom