Open jjzazuet opened 5 years ago
Hello,
I'm afraid I'm not sure why this happens. The role hasn't changed much; besides, this looks like a problem with finding the roles themselves, i.e. an Ansible configuration issue. Since you are using a custom playbook, I assume that you installed the roles "manually" somewhere. Is the main role named debops.pki? Can you show the playbook that you are using and the contents of the roles_path variable in ansible.cfg? Check whether the role is in one of the directories listed there.
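For anyone following along, the check suggested above can be sketched as a small shell helper. This is only an illustration: find_role is a hypothetical function, not part of Ansible, and the colon-separated list has to be copied by hand from roles_path in ansible.cfg (or from Ansible's defaults).

```shell
#!/bin/sh
# Hypothetical helper (not part of Ansible): report which directories in a
# colon-separated roles_path-style list contain the named role.
# Note: a literal "~" in the list is not expanded here; pass full paths.
find_role() {
    role="$1"
    paths="$2"
    found=1
    old_ifs="$IFS"
    IFS=':'
    for dir in $paths; do
        if [ -d "$dir/$role" ]; then
            echo "found: $dir/$role"
            found=0
        fi
    done
    IFS="$old_ifs"
    return "$found"
}

# Example call using Ansible's documented default search paths:
# find_role debops.pki "$HOME/.ansible/roles:/usr/share/ansible/roles:/etc/ansible/roles"
```

The function returns non-zero when the role is in none of the listed directories, so it can be used in a conditional before running a playbook.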
Development of the DebOps codebase has shifted to a monorepo; you might look into it to get the latest changes. The standalone roles will at some point be archived on GitHub.
Hi @drybjed, thanks for the tip. Yes, these are the commands I'm using on the Debian deployment box to prepare the environment for running Ansible playbooks:
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 93C4A3FD7BB9C367
echo 'deb http://ppa.launchpad.net/ansible/ansible/ubuntu trusty main' | tee /etc/apt/sources.list.d/ansible.list
apt-get update -y; apt-get install -y git ansible python-pip
pip install netaddr
ansible-galaxy install esolitos.resolv;
ansible-galaxy install holms.fqdn;
ansible-galaxy install debops.secret;
ansible-galaxy install debops.pki;
ansible-galaxy install debops.grub;
ansible-galaxy install debops-contrib.apparmor;
ansible-galaxy install dev-sec.ssh-hardening;
The Ansible Galaxy dependencies do indeed list debops.pki as a top-level role. Again, I do remember these roles were enough to bootstrap PKI certificates throughout the cluster. I believe the paths Ansible searches for roles are:
/root/gopher/devops/roles
/root/.ansible/roles
/usr/share/ansible/roles
/etc/ansible/roles
/root/gopher/devops
On my Mac, the roles as installed by Galaxy are here:
bash-3.2$ ls -la ~/.ansible/roles/
total 0
drwxr-xr-x 9 jjzazuet staff 288 Aug 26 17:10 .
drwx------ 5 jjzazuet staff 160 Aug 26 17:13 ..
drwxr-xr-x 16 jjzazuet staff 512 Aug 26 17:10 debops-contrib.apparmor
drwxr-xr-x 14 jjzazuet staff 448 Aug 26 17:10 debops.grub
drwxr-xr-x 14 jjzazuet staff 448 Aug 26 17:10 debops.pki
drwxr-xr-x 14 jjzazuet staff 448 Aug 26 17:10 debops.secret
drwxr-xr-x 22 jjzazuet staff 704 Aug 26 17:10 dev-sec.ssh-hardening
drwxr-xr-x 9 jjzazuet staff 288 Aug 26 17:10 esolitos.resolv
drwxr-xr-x 13 jjzazuet staff 416 Aug 26 17:10 holms.fqdn
bash-3.2$
And the contents of debops.pki:
bash-3.2$ ls -la ~/.ansible/roles/debops.pki/
total 8
drwxr-xr-x 14 jjzazuet staff 448 Aug 26 17:10 .
drwxr-xr-x 9 jjzazuet staff 288 Aug 26 17:10 ..
-rw-rw-r-- 1 jjzazuet staff 803 Aug 6 07:10 COPYRIGHT
drwxr-xr-x 9 jjzazuet staff 288 Aug 26 17:10 _cacher_ng
drwxr-xr-x 7 jjzazuet staff 224 Aug 26 17:10 _install
drwxr-xr-x 7 jjzazuet staff 224 Aug 26 17:10 _listchanges
drwxr-xr-x 8 jjzazuet staff 256 Aug 26 17:10 _mark
drwxr-xr-x 7 jjzazuet staff 224 Aug 26 17:10 _preferences
drwxr-xr-x 8 jjzazuet staff 256 Aug 26 17:10 _proxy
drwxr-xr-x 29 jjzazuet staff 928 Aug 26 17:10 debops-0.8.0
drwxr-xr-x 3 jjzazuet staff 96 Aug 26 17:10 defaults
drwxr-xr-x 4 jjzazuet staff 128 Aug 26 17:10 meta
drwxr-xr-x 3 jjzazuet staff 96 Aug 26 17:10 tasks
drwxr-xr-x 4 jjzazuet staff 128 Aug 26 17:10 templates
bash-3.2$
It makes sense that Ansible wouldn't be able to find the env subrole in that structure, so I'm just wondering if something changed in the way Galaxy installs this role.
I'd be happy to try and migrate to the monorepo version of debops since I haven't released my production infrastructure yet.
Thanks again for the help!
That explains everything, thanks. In essence, the Ansible Galaxy backend and its handling of roles changed some time ago to enable support for multi-role repositories, among other things. I played with supporting the new Galaxy a bit in the DebOps monorepo, but the current state of how ansible-galaxy or mazer install it doesn't look very promising. It looks like a few of the DebOps roles like debops.apt_* are installed in a broken state, then the DebOps monorepo is included in a weird way... No idea what to do about it.
It's especially puzzling to me, because I only messed around with the DebOps monorepo in the Galaxy database and left the older, separate role repositories intact. I have no idea why, when you specifically install debops.pki, the monorepo gets pulled in along with the debops.apt_* roles. Perhaps @chouseknecht would be interested in this.
For now, I would suggest that you avoid using ansible-galaxy or mazer to install DebOps roles and/or the monorepo. I just tried installing the monorepo directly via the repository URL but ansible-galaxy failed, although that might be due to an old version. There are a few other ways to handle the installation. You could clone the monorepo directly to ~/.local/share/debops/debops/ and add that path to the roles_path variable. Or you could install DebOps via pip install debops; the Python package contains a snapshot of the DebOps roles at a specific tag, which might be handy if you want to stick to stable releases. Otherwise, after installing the debops Python package you can run debops-update to get the latest changes in the monorepo. Check the installation instructions for more details.
I'll give the pip install path a try. Will report back when updated. Thanks!
OK, I just pip installed the monorepo, and I find that the following folders were correctly installed on my Mac.
local-dev00:/ jjzazuet$ ls -la ./usr/local/lib/python2.7/site-packages/debops/ansible/roles/debops.pki
total 8
drwxr-xr-x 10 jjzazuet staff 320 Aug 27 21:59 .
drwxr-xr-x 154 jjzazuet staff 4928 Aug 27 21:59 ..
-rw-r--r-- 1 jjzazuet staff 785 Aug 27 21:58 COPYRIGHT
drwxr-xr-x 3 jjzazuet staff 96 Aug 27 21:59 defaults
drwxr-xr-x 5 jjzazuet staff 160 Aug 27 21:59 env
drwxr-xr-x 4 jjzazuet staff 128 Aug 27 21:59 files
drwxr-xr-x 3 jjzazuet staff 96 Aug 27 21:59 handlers
drwxr-xr-x 3 jjzazuet staff 96 Aug 27 21:59 meta
drwxr-xr-x 6 jjzazuet staff 192 Aug 27 21:59 tasks
drwxr-xr-x 3 jjzazuet staff 96 Aug 27 21:59 templates
local-dev00:/ jjzazuet$
Should I now tell Ansible to include the role's monorepo path via the roles_path variable? Or should it be able to locate the monorepo on its own?
Thanks again!
Yes, when you add the /usr/local/lib/python2.7/site-packages/debops/ansible/roles/ path to roles_path, Ansible should be able to find the roles there.
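For a one-off run, the same effect can be had without editing ansible.cfg by setting the ANSIBLE_ROLES_PATH environment variable. A minimal sketch follows; add_roles_path is a hypothetical helper name, and the directory passed to it is whatever path pip installed the roles under on your machine.

```shell
#!/bin/sh
# Hypothetical helper: prepend a directory to ANSIBLE_ROLES_PATH for the
# current shell session. Be aware that when this variable is set, it takes
# precedence over the roles_path value in ansible.cfg.
add_roles_path() {
    dir="$1"
    if [ -n "${ANSIBLE_ROLES_PATH:-}" ]; then
        ANSIBLE_ROLES_PATH="$dir:$ANSIBLE_ROLES_PATH"
    else
        ANSIBLE_ROLES_PATH="$dir"
    fi
    export ANSIBLE_ROLES_PATH
}

# Example call with the path from this thread:
# add_roles_path /usr/local/lib/python2.7/site-packages/debops/ansible/roles
```

Because the environment variable overrides the config file, any other role directories you still need have to be added the same way.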
@drybjed ok so I managed to get Ansible to source the debops playbooks from the additional install path, but I now seem to be running into the same issue as https://github.com/debops/debops-tools/issues/117 :(
fatal: [ny-api00]: FAILED! => {"msg": "lookup plugin (task_src) not found"}
I also tried adding .debops.cfg at the root of my playbook hierarchy as:
[paths]
data-home: /usr/local/lib/python2.7/dist-packages/debops
My apologies, I'm running out of ideas as to what I could be doing wrong. Any advice is appreciated. Thanks!
Ok this seems to have done the trick:
lookup_plugins=/usr/local/lib/python2.7/dist-packages/debops/ansible/playbooks/lookup_plugins
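As a rough sketch of how that setting fits together with roles_path, the helper below prints an ansible.cfg [defaults] fragment for a pip-installed DebOps package directory. debops_cfg_fragment is a hypothetical name, and the two subdirectories are simply the ones quoted in this thread; other DebOps versions may lay things out differently.

```shell
#!/bin/sh
# Hypothetical helper: print an ansible.cfg [defaults] fragment that points
# Ansible at the roles and lookup plugins inside a DebOps package directory.
# The subdirectory names are taken from this thread, not verified elsewhere.
debops_cfg_fragment() {
    pkg_dir="${1%/}"    # drop one trailing slash, if present
    printf '[defaults]\n'
    printf 'roles_path = %s/ansible/roles\n' "$pkg_dir"
    printf 'lookup_plugins = %s/ansible/playbooks/lookup_plugins\n' "$pkg_dir"
}

# Example call with the Debian path from this thread:
# debops_cfg_fragment /usr/local/lib/python2.7/dist-packages/debops
```

The output is meant to be merged by hand into an existing ansible.cfg; if roles_path is already set there, append the new directory to the colon-separated list rather than replacing it.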
Sigh... next time I'll need to think twice before running my playbooks with a later version of Ansible. So I guess I should freeze the version at 2.6.
Thanks again for the help!
Ah, my apologies, I've just stumbled upon a new error while running the pki role. Apparently, an intermediate step in the role fails due to some kind of network error. In this example, I have four hosts doing the PKI certificate exchange, and in subsequent runs a different pair of hosts might fail for no apparent reason. Here's the failing step's output:
TASK [debops.pki : Upload internal certificate requests] *************************************************************************************************
failed: [ste-api02] (item={u'subject_alt_names': [u'dns:ste-api02.gopher.io', u'dns:localhost', u'ip:108.61.41.194', u'ip:10.0.0.182', u'ip:127.0.0.1'], u'name': u'gopher.io', u'acme': False, u'subject': [u'cn=node']}) => {"changed": false, "checksum": "003eb9351c451307c9716ab35cef9ae78560c3ec", "dest": "/etc/vault/./api/pki/requests/domain/gopher.io/gopher.io/request.pem", "file": "/etc/pki/realms/gopher.io/internal/request.pem", "item": {"acme": false, "name": "gopher.io", "subject": ["cn=node"], "subject_alt_names": ["dns:ste-api02.gopher.io", "dns:localhost", "ip:108.61.41.194", "ip:10.0.0.182", "ip:127.0.0.1"]}, "md5sum": "714cf8916ad9da5d0e9827ca33f6d340", "msg": "checksum mismatch", "remote_checksum": "7a7abe9289fc2813cc755ae894a68cd2ba45250a", "remote_md5sum": null}
changed: [ste-api00] => (item={u'subject_alt_names': [u'dns:ste-api00.gopher.io', u'dns:localhost', u'ip:104.243.38.42', u'ip:10.0.0.180', u'ip:127.0.0.1'], u'name': u'gopher.io', u'acme': False, u'subject': [u'cn=node']})
failed: [ste-api01] (item={u'subject_alt_names': [u'dns:ste-api01.gopher.io', u'dns:localhost', u'ip:209.222.98.74', u'ip:10.0.0.181', u'ip:127.0.0.1'], u'name': u'gopher.io', u'acme': False, u'subject': [u'cn=node']}) => {"changed": false, "checksum": "003eb9351c451307c9716ab35cef9ae78560c3ec", "dest": "/etc/vault/./api/pki/requests/domain/gopher.io/gopher.io/request.pem", "file": "/etc/pki/realms/gopher.io/internal/request.pem", "item": {"acme": false, "name": "gopher.io", "subject": ["cn=node"], "subject_alt_names": ["dns:ste-api01.gopher.io", "dns:localhost", "ip:209.222.98.74", "ip:10.0.0.181", "ip:127.0.0.1"]}, "md5sum": "714cf8916ad9da5d0e9827ca33f6d340", "msg": "checksum mismatch", "remote_checksum": "e6ec61446b5cc627ec057fd061021f5a0bf80f75", "remote_md5sum": null}
changed: [ste-bld00] => (item={u'subject_alt_names': [u'dns:ste-bld00.gopher.io', u'dns:localhost', u'ip:216.155.144.90', u'ip:10.0.0.100', u'ip:127.0.0.1'], u'name': u'gopher.io', u'acme': False, u'subject': [u'cn=node']})
Any feedback or help is appreciated. Thanks!
Hmm, I'm not sure what might be the cause here. You could try clearing the secret/pki/requests/ directory on the Ansible Controller and see if that changes anything. If you are using a custom Ansible playbook, can you show it?
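If you would rather not delete anything outright, a cautious way to clear that directory is to move it aside first. The sketch below assumes the requests live under pki/requests inside your project's secret directory; stash_pki_requests is a hypothetical helper, and the secret directory location is passed in because it differs between setups.

```shell
#!/bin/sh
# Hypothetical helper: move <secret_dir>/pki/requests to a timestamped
# backup so the next playbook run starts with no cached requests.
# Prints the backup location on success.
stash_pki_requests() {
    secret_dir="${1%/}"
    requests="$secret_dir/pki/requests"
    if [ ! -d "$requests" ]; then
        echo "nothing to do: $requests does not exist" >&2
        return 0
    fi
    backup="$requests.bak.$(date +%Y%m%d%H%M%S)"
    mv "$requests" "$backup" && echo "$backup"
}

# Example call, run from the project directory on the Ansible Controller:
# stash_pki_requests ./secret
```

If the rerun changes nothing, the backup can be moved back into place or removed.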
Hi. Like the title says. I was previously using the reference playbook to perform PKI certificate creation and exchange in a 4-node cluster, so I'm pretty sure my playbook used to work. I have now tried running it under both Ansible 2.4 and Ansible 2.6, on both macOS and Debian 9 (stretch). Both fail with the same error message:
And under Debian:
Upon further inspection of the role's code, it appears as if the env subrole is using symlinks to point back to shared parent role file resources (but I'm not fully certain). I also see that the role's codebase is a few years old in general, so it's also possible that a new Ansible release broke the role's resource resolution strategy. Any help or feedback is appreciated. Thanks for the awesome framework! 👍