adferrand / dnsrobocert

Orchestrate Certbot and Lexicon together to provide Let's Encrypt TLS certificates validated by DNS challenges
https://dnsrobocert.readthedocs.io
MIT License

Auth hook fails because of missing configuration file? #730

Open JaneJeon opened 2 years ago

JaneJeon commented 2 years ago

Hi, I've been using dnsrobocert with no problem, but recently it has been failing to actually run the auth hook, not because of misconfiguration or incorrect DNS, but because of a missing temp file:

Renewing an existing certificate for $site
Hook '--manual-auth-hook' for $site reported error code 1
Hook '--manual-auth-hook' for $site ran with error output:
 2022-03-26 02:31:37 fee66d534ec0 dnsrobocert.core.config[50] ERROR Configuration file /tmp/tmpa18ssoa9/dnsrobocert-runtime.yml does not exist.
 Error occured while loading the configuration file, aborting the `auth` hook.

And since the auth hook fails, the cert renewal fails... It's been working fine before this, any ideas?

Grokon commented 2 years ago

I have this error too. It can be reproduced with these steps:

So we always need to restart the Docker container manually.

P.S. Could you add a Docker health check for errors like this?

JaneJeon commented 2 years ago

Running into the EXACT same problem again...

Vertganti commented 2 years ago

As stated here we have been experiencing this since version 3.14.0, which fixed a different renewal issue.

As far as I can tell, the problem is that the initial certonly call specifies the auth, cleanup and deploy hooks in the newly created temporary directory using the config_path parameter. This first renewal attempt after a restart therefore always succeeds. All follow-up renew calls only specify the deploy hook using the config_path parameter. The certbot renew command does not support manual execution, so the manual cleanup and auth hooks cannot be specified using parameters and will always be taken from the renewal configuration when using that command. Since the renewal file is located in the LetsEncrypt directory, which is mounted outside the container, it persists between container restarts. As @Grokon mentioned, this causes subsequent renewals to use the temporary directory path created by the very first certificate request, which no longer exists once the container has been restarted.
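
To check whether a given installation is affected, something like the following could scan the renewal configuration for hook commands that still point at a vanished temporary directory. This is only a rough sketch, not part of dnsrobocert: the function name find_stale_hooks is made up for illustration, it assumes the renewal files are the usual `key = value` text files with option names ending in `_hook`, and it assumes the hook command line contains the full path to dnsrobocert-runtime.yml (the file name from the error message above).

```python
import re
from pathlib import Path

# Absolute path ending in the runtime config file name seen in the error log.
RUNTIME_CONFIG = re.compile(r"(/[^\s'\"]*dnsrobocert-runtime\.yml)")


def find_stale_hooks(renewal_dir):
    """Return (conf file name, option name, missing path) for every hook
    option that references a dnsrobocert-runtime.yml that does not exist."""
    stale = []
    for conf in sorted(Path(renewal_dir).glob("*.conf")):
        for line in conf.read_text().splitlines():
            if line.lstrip().startswith("#"):
                continue  # skip commented-out options
            key, sep, value = line.partition("=")
            key = key.strip()
            if not sep or not key.endswith("_hook"):
                continue  # not a hook option (assumed naming convention)
            for path in RUNTIME_CONFIG.findall(value):
                if not Path(path).exists():
                    stale.append((conf.name, key, path))
    return stale
```

Calling find_stale_hooks on the renewal directory inside the mounted LetsEncrypt directory (commonly /etc/letsencrypt/renewal, but that depends on the setup) returns an empty list when every referenced file still exists. A non-empty result could also be turned into a non-zero exit status for the kind of Docker health check requested earlier in this thread, though I have not tried that.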

I see three possible solutions for the issue (note that I have not tested any of these):

  1. Always delete the renewal configuration when the containers are stopped (could be done manually as a workaround too)
  2. Always update the renewal configuration when a new temporary configuration directory is created (= on container start). I would actually have expected the certonly call after startup to do this, but it seems it does not?
  3. Use the same certonly command for all renewal attempts as is used for the initial requests/renewal attempt after restart. This is the official way to renew when using the manual plugin.

centja1 commented 1 year ago

I made the change described by @Vertganti to make the same certonly call on renewal as it does when the docker container starts.

I had a few certs nearing renewal and have tested it successfully, but wouldn't mind a couple more confirmations prior to submitting a PR.

I pushed a docker image to justincentanni/dnsrobocert:certonly and have the code in https://github.com/centja1/dnsrobocert/tree/call-certonly-every-time

Vertganti commented 1 year ago

Thanks @centja1 for adapting the code and providing the image. I have set it up for testing with a certificate that will expire towards the end of this month and will report if it worked then.

Vertganti commented 1 year ago

The renewal worked! Since no one else is responding I guess you can submit the PR. Hopefully @adferrand will be back and able to merge/review it soon.

Codelica commented 1 year ago

This should really be considered. On stable/production systems the problem hits regularly without something like a cron job to restart the container every so often. I think a lot of homelab users just don't notice, as the container/machine is restarted more frequently.