JoergFiedler / freebsd-jailed-nginx

Ansible role that creates a jailed nginx server.

Lets-Encrypt Chicken/Egg issue #1

Open fasterhorsesecurity opened 7 years ago

fasterhorsesecurity commented 7 years ago

I'm hoping I'm missing something, but when running the following playbook:

- hosts: tag_Name_web  # all ec2 instances tagged with web
  gather_facts: True
  user: ec2-user

  roles:
    - { role: JoergFiedler.freebsd-jailed-nginx,
        tags: ['nginx'],
        use_ssmtp: true,
        use_syslogd_server: true,
        nginx_pf_redirect: true,
        nginx_letsencrypt_enabled: true,
        nginx_servers: [
          {
            name: 'X.com',
            aliases: 'www.X.com',
            https: true,
            https.letsencrypt_enabled: true,
            force_https: true
          }
        ],
        jail_name: 'nginx',
        jail_net_ip: '10.1.0.5',
        host_ssh_user: 'ec2-user',
        host_ioc_zpool_devices: 'xbd5',
        become: yes,
        become_method: su }

I get the following error:

TASK [JoergFiedler.freebsd-jailed-nginx : --cleanup-- Copy server certificates]
[WARNING]: The loop variable 'item' is already in use. You should set the loop_var value in the loop_control option for the task to something else to avoid variable collisions and unexpected behavior.

failed: [X.X.X.X] (item={u'src': u'X.com-key.pem', u'dst': u'key.pem'}) => {"failed": true, "item": {"dst": "key.pem", "src": "X.com-key.pem"}, "msg": "Unable to find 'X.com-key.pem' in expected paths."}
failed: [X.X.X.X] (item={u'src': u'X.com-certbundle.pem', u'dst': u'fullchain.pem'}) => {"failed": true, "item": {"dst": "fullchain.pem", "src": "X.com-certbundle.pem"}, "msg": "Unable to find 'X.com-certbundle.pem' in expected paths."}
failed: [X.X.X.X] (item={u'src': u'X.com-dhparam.pem', u'dst': u'dhparam.pem'}) => {"failed": true, "item": {"dst": "dhparam.pem", "src": "X.com-dhparam.pem"}, "msg": "Unable to find 'X.com-dhparam.pem' in expected paths."}
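
For reference, the warning above is about a loop inside another loop where both use the default `item` variable. A minimal sketch of how a task can name its own loop variable is shown here. It is not the role's actual task: `cert_file` and `cert_dir` are hypothetical names, and `loop_control` is, as far as I know, only available in Ansible releases newer than 2.0.x.

```yaml
# Sketch only: cert_file and cert_dir are hypothetical names, not taken from the role.
- name: Copy server certificates
  copy:
    src: "{{ cert_file.src }}"
    dest: "{{ cert_dir }}/{{ cert_file.dst }}"
  with_items:
    - { src: 'X.com-key.pem', dst: 'key.pem' }
    - { src: 'X.com-certbundle.pem', dst: 'fullchain.pem' }
    - { src: 'X.com-dhparam.pem', dst: 'dhparam.pem' }
  loop_control:
    # Rename the loop variable so it cannot collide with an outer loop's 'item'.
    loop_var: cert_file
```

Renaming the variable only silences the collision warning; the "Unable to find" failures remain until the source files actually exist.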

From what I can tell, the certs aren't created until the periodic cron job runs the acme script, which doesn't even get the domains into the file until after the "--cleanup-- Copy server certificates" task is successful.

Am I missing something, or do I need to manually create some starter self-signed certs and get them into those locations? I don't see where this would have happened in any of the tasks to this point.

Thanks!

JoergFiedler commented 7 years ago

Hi @fasterhorsesecurity … thanks for getting back to me … all my roles are kind of a work in progress … please also note that this role requires Ansible 2.0.1.0 … the most recent versions do not iterate over the list of configured servers …

> From what I can tell, the certs aren't created until the periodic cron job runs the acme script, which doesn't even get the domains into the file until after the "--cleanup-- Copy server certificates" task is successful.

This is definitely true … I am already working on it and, as you said, it is kind of a chicken-and-egg problem … providing a self-signed certificate at start is the solution I am working on right now … this would allow nginx to start … there will also be an option to generate certificates using Let's Encrypt on first boot, but this also requires the domain name to be routed already … not perfect … any ideas? …

JoergFiedler commented 7 years ago

@fasterhorsesecurity the task --cleanup-- Copy server certificates is currently broken … for me there is no need to keep it as the role won't be used with provided certificates anymore … I will remove the task and try to create a machine using your playbook …

fasterhorsesecurity commented 7 years ago

Mostly, I was hoping to make sure I wasn't missing anything; I'm brand new to Ansible (though an old hand with FreeBSD/etc) and thought there might be things I just couldn't see/didn't know.

For me, because I'm using systems in EC2, I'm using a domain registrar who supports dynamic DNS, and I have an earlier playbook where I set up the base host with ddclient, so the domain is routed for me as part of the set of playbooks I'm creating. So in my case, modifying the plays to just go ahead and engage the acme-client with a simple nginx setup prior to the final https nginx setup should work. That might be more complicated than necessary; I might go with certbot instead of acme-client, which can run as a webserver long enough to get the cert, and then start nginx after. Certbot also, hypothetically, can install the certs for nginx itself, though I haven't played with that at all so I have no idea how well it works, or how I'd then set up the other configuration for nginx.

I can also see using a self-signed cert, as you suggest, to allow nginx to boot, then using nginx to serve the ACME challenge, and then swapping out the certs. That approach would certainly be friendlier if you didn't have dynamic DNS already set up as part of the process.

Thank you so much for your answer; there's no need to do any more work on my behalf! I just wanted to make sure my understanding was correct before I started working on a solution.

JoergFiedler commented 7 years ago

@fasterhorsesecurity I am about to push a branch lets-encrypt-and-chicken-egg-problem which provides a partial solution to the issue … you need to run the /usr/local/bin/acme-client-weekly.sh once manually for now, after the jail has been started … if the domain setup is correct, you should have valid certificates within seconds … this is something I will automate soon as well, but not today …

I use a self-signed certificate for localhost to provide an initial configuration to nginx … the certificate will be overridden when the real certificate is retrieved from Let's Encrypt … one more commit is needed to fix the private key handling (for now only the private key from the self-signed certificate is used) …
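
A minimal sketch of what such a placeholder step could look like as an Ansible task follows. This is an illustration under assumptions, not the code from the branch: `cert_dir` is a hypothetical variable, the `key.pem` and `fullchain.pem` file names are borrowed from the error output earlier in the thread, and the 30-day validity is an arbitrary choice.

```yaml
# Sketch only: cert_dir is a hypothetical variable pointing at the directory
# nginx reads its certificate and key from.
- name: Create a self-signed placeholder certificate for localhost
  command: >
    openssl req -x509 -nodes -newkey rsa:2048 -days 30
    -subj "/CN=localhost"
    -keyout {{ cert_dir }}/key.pem
    -out {{ cert_dir }}/fullchain.pem
  args:
    # Skip the task when a certificate already exists, so a real
    # Let's Encrypt certificate is never overwritten on later runs.
    creates: "{{ cert_dir }}/fullchain.pem"
```

The `creates` guard is what keeps repeated playbook runs from replacing a certificate that has already been issued.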

this whole side project began as a test of how I could manage my servers with Ansible … but it grew a lot and I cannot keep up with everything … sorry for the incomplete documentation …

One more note:

Instead of:

https: true,
https.letsencrypt_enabled: true

please use:

https: { letsencrypt_enabled: true }

BTW: I appreciate comments/questions which make me escape my knowledge bubble …

fasterhorsesecurity commented 7 years ago

I will absolutely try your branch (and change those variables) when I can get a minute; this weekend is busy, so it might be a couple of days but I'll let you know how things go, and any changes that I find I have to make.

Don't apologise at all for anything; before I stumbled onto these roles, I was thinking that to control jails I would have to install Ansible on the remote EC2 host and use the jail connector, so I'd have Ansible playbooks that ran Ansible playbooks. This is MUCH easier and more clever! I'd never played with iocage, and it makes things easier than with ezjail, which is what I've used in the past. Also, for some reason it never occurred to me to make the configuration changes via the absolute path from the host's point of view. So even with some issues, you've saved me a lot of time and effort, and have taught me a lot as well.

So thanks!

fasterhorsesecurity commented 7 years ago

An update:

Everything works well! There's one other variable I had to add: https: { letsencrypt_enabled: true, enabled: true } so that the https configuration was kicked off.
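
Putting the corrections from this thread together, the server entry from the original playbook might look like the following. The block-style layout is only a formatting choice, and `X.com` is the same placeholder used above.

```yaml
nginx_servers:
  - name: 'X.com'
    aliases: 'www.X.com'
    force_https: true
    # https is a nested mapping, not a boolean plus a dotted key.
    https:
      enabled: true
      letsencrypt_enabled: true
```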

One thing I'm going to change is to force the acme scripts to run at the end of the play instead of waiting for periodic to run them every week. Maybe check whether the localhost cert is still in place first, and only run the script if it is, to make sure the play stays idempotent.
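
A rough sketch of that check, under stated assumptions: `cert_dir` is a hypothetical variable, the tasks are assumed to run somewhere both the certificate and the script are reachable (for example inside the jail), and the placeholder certificate is assumed to carry `localhost` in its subject.

```yaml
# Sketch only: cert_dir is hypothetical; the script path is the one mentioned
# earlier in this thread.
- name: Read the subject of the currently installed certificate
  command: openssl x509 -noout -subject -in "{{ cert_dir }}/fullchain.pem"
  register: cert_subject
  changed_when: false   # read-only check, never reports a change
  failed_when: false    # a missing file just leaves stdout empty

- name: Request real certificates only while the localhost placeholder is in place
  command: /usr/local/bin/acme-client-weekly.sh
  when: "'localhost' in cert_subject.stdout"
```

Once Let's Encrypt has issued a certificate, the subject no longer contains `localhost`, so the second task is skipped on later runs.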

Thank you! You've been a ton of help, and this is an awesome project.

JoergFiedler commented 7 years ago

@fasterhorsesecurity thank you so much for the encouraging feedback … yeah, sorry, I forgot to mention the variable … I introduced it after I wrote the message …