openaustralia / infrastructure

Automated setup and configuration for most of OpenAustralia Foundation's servers

letsencrypt certificates are not automatically regenerated #87

Open mlandauer opened 6 years ago

mlandauer commented 6 years ago

@jamezpolley could you please look at this pretty urgently. There's a bunch of certificates that will expire around 20th April if this doesn't get fixed.

It might be worth checking that logging for the letsencrypt cron jobs is working also.

jamezpolley commented 6 years ago

We're definitely getting logs for at least some letsencrypt jobs; going through my mail I see a variety of responses. A lot of certs aren't due for renewal yet, but there are definitely some failures happening as well:

Additionally, the following renewal configuration files were invalid:
  /etc/letsencrypt/renewal/theyvoteforyou.org.au.conf (parsefail)
  /etc/letsencrypt/renewal/theyvoteforyou.org.au-0001.conf (parsefail)
0 renew failure(s), 2 parse failure(s)
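
For going through a pile of these mailed logs, a minimal sketch like the one below could pull out the failure counts and the renewal configs flagged as `(parsefail)`. It assumes the summary always has the same shape as the output quoted above; `parse_renewal_summary` is a hypothetical helper name, not part of certbot.

```python
import re


def parse_renewal_summary(output):
    """Extract failure counts and invalid renewal configs from mailed log text.

    Assumes the text looks like the output quoted in this thread.
    Returns (renew_failures, parse_failures, invalid_configs); the two
    counts are None when no summary line is present.
    """
    # Lines such as "  /etc/letsencrypt/renewal/example.conf (parsefail)"
    invalid_configs = re.findall(
        r"^[ \t]*(\S+\.conf) \(parsefail\)[ \t]*$", output, flags=re.MULTILINE
    )
    # Summary line such as "0 renew failure(s), 2 parse failure(s)"
    summary = re.search(
        r"(\d+) renew failure\(s\), (\d+) parse failure\(s\)", output
    )
    if summary is None:
        return None, None, invalid_configs
    return int(summary.group(1)), int(summary.group(2)), invalid_configs
```

Feeding it the body of each cron email would give a quick list of which renewal configs need attention.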

jamezpolley commented 6 years ago

Current status, based on digging through emails to see what logs are being mailed out. Timestamps refer to the last-seen email.

Update: RTK and openaustralia.org have been updated.

jamezpolley commented 6 years ago

So in short - lots of the production certs aren't sending out renewal emails at all; righttoknow reports parse failures with its renewal config files, but most of the test domains do seem to be getting renewed.

I'm not seeing emails from certbot after March 28 - and those emails came from kedumba. So it looks like certbot might not be working on the post-kedumba VMs.

jamezpolley commented 6 years ago

More digging: test certs were generated on Mar 20, the same day as entries in /home are timestamped (on the RTK vm). The prod certs date from Jan 23.

I think that the issue here might be that the prod certs were generated on some other machine and then copied onto this machine; as a result, letsencrypt didn't have a chance to create the renewal config.

jamezpolley commented 6 years ago

So, this backs up the idea that the certs that currently exist were copied in at the time the current VM was created. Ansible would have avoided creating new certs, so certbot didn't create a renewal config - and the renewal configs weren't copied from the old machine.
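
One way to check this theory on each VM would be to list the cert directories that have no matching renewal config. This is only a sketch: it assumes the usual certbot layout of one directory per cert under `/etc/letsencrypt/live` and one `<name>.conf` per cert under `/etc/letsencrypt/renewal` (only the latter path appears in the logs above), and `lineages_missing_renewal_conf` is a made-up helper name.

```python
from pathlib import Path


def lineages_missing_renewal_conf(
    live_dir="/etc/letsencrypt/live",        # assumed default certbot location
    renewal_dir="/etc/letsencrypt/renewal",  # path seen in the logs above
):
    """Return names of cert directories that have no renewal config file."""
    live = Path(live_dir)
    renewal = Path(renewal_dir)
    missing = []
    for entry in sorted(live.iterdir()):
        if not entry.is_dir():
            continue  # ignore stray files such as a README
        if not (renewal / f"{entry.name}.conf").is_file():
            missing.append(entry.name)
    return missing
```

Run as root on a VM (the directories are normally not world-readable), anything it returns would be a cert that was copied in without its renewal config.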

I propose that a simple "fix" for this would be to move aside the existing certs, then run Ansible. Ansible should detect that the certs are missing and create them, which should set them up for renewal.

jamezpolley commented 6 years ago

Cuttlefish - I don't have access; looking at https://github.com/mlandauer/cuttlefish/blob/master/provisioning/roles/cuttlefish-app/tasks/main.yml it seems that this uses sslmate rather than letsencrypt. I'm going to ignore this for now.

Morph - it looks as though the certs for morph were manually generated. Ansible installs a cronjob to renew them, but that cronjob won't work as it's running as the deploy user. I've submitted https://github.com/openaustralia/morph/pull/1190 to fix this.

I ran a dry-run of the renewal as root and it appears as though it should work fine. Even better, it seems that it's using the nginx plugin to do the renewal without needing to cause an outage!
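
For anyone repeating the check, here is a rough sketch of a wrapper that refuses to run unless it is root, since that was the problem with the deploy-user cronjob. It assumes the client is installed as `certbot` and that `certbot renew --dry-run` matches what was run here; the function names are made up for illustration.

```python
import os
import subprocess


def check_can_renew(euid=None):
    """Raise PermissionError unless running as root (effective uid 0)."""
    euid = os.geteuid() if euid is None else euid
    if euid != 0:
        raise PermissionError(
            f"renewal needs root, but effective uid is {euid}"
        )


def dry_run_command():
    """Command for a renewal dry run; assumes the binary is named certbot."""
    return ["certbot", "renew", "--dry-run"]


def run_dry_run():
    """Run the dry run and return the CompletedProcess for inspection."""
    check_can_renew()
    return subprocess.run(
        dry_run_command(), capture_output=True, text=True, check=False
    )
```

Calling `run_dry_run()` as the deploy user would fail straight away with a clear error instead of failing quietly from cron.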

So that just leaves cuttlefish.