centerclick / feedback

Issues, Bug Reports, and Feature Requests

DNS-01 update failures don't cause the ACME protocol to abort #89

Closed tlhackque closed 1 year ago

tlhackque commented 1 year ago

(Details in a PM)

If a DNS-01 update fails, nsupdate reports the failure, but the update process continues to the next step and contacts the LE servers. They don't see the TXT token record, so validation fails. This needlessly uses quota at LE and annoys their servers.

You can easily demonstrate this fault by providing an incorrect TSIG key, or a hostname that the DNS update server will refuse.

The correct action is to report the certificate issuance failure and stop trying to renew. So long as nsupdate successfully contacts the DNS server, update failures aren't transient; retries won't help. The only exceptions are (a) the DNS server becoming unreachable and (b) TSIG authentication failing due to a key rotation. IP-address-based authentication failures require manual configuration changes, as do update policy failures, persistent network failures, etc.

The best way to handle DNS-01 errors is to send an e-mail to the contact e-mail address with the details. Retrying once, after an hour, would be a reasonable strategy that handles the DNS server down/unreachable case. More than that is pointless, as the situation calls for human attention.
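As a rough illustration of that policy (not the product's actual code), the retry-once-then-notify logic could look like the Python sketch below. The names `dns01_update_with_one_retry`, `run_update`, and `notify` are hypothetical. `run_update` is assumed to return a `(succeeded, looks_transient, details)` tuple, and `notify` is assumed to e-mail the contact address.

```python
import time
from typing import Callable, Tuple

RETRY_DELAY_SECONDS = 3600  # "after an hour", as suggested above


def dns01_update_with_one_retry(
    run_update: Callable[[], Tuple[bool, bool, str]],  # hypothetical hook
    notify: Callable[[str], None],                     # hypothetical hook
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True only if the TXT record was published.

    The caller should contact the ACME server only when this returns True.
    """
    ok, transient, details = run_update()
    if ok:
        return True
    if transient:
        # Retry exactly once; more than that calls for human attention.
        sleep(RETRY_DELAY_SECONDS)
        ok, transient, details = run_update()
        if ok:
            return True
    # Failure, or a retry that also failed: report and stop.
    notify("DNS-01 update failed; certificate not renewed.\n" + details)
    return False
```

The point of the sketch is that a `False` result aborts the renewal before any request reaches LE, so no quota is used.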

You can capture the nsupdate output to make more fine-grained decisions. For example,

update failed: REFUSED

requires manual attention, while

; Communication with 192.168.53.1#53 failed: timed out
could not reach any name server

indicates a possibly transient failure (but it could also be permanent).
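A minimal sketch of that kind of classification, in Python, might look like the following. The function names are hypothetical, and the matching uses only the message fragments quoted above. Treating every other `update failed:` line as permanent is an assumption, and real nsupdate output may vary between BIND versions.

```python
import subprocess
from typing import Tuple


def classify_nsupdate_output(output: str) -> str:
    """Return 'transient', 'permanent', or 'unknown' for captured output."""
    text = output.lower()
    # Server unreachable: possibly transient, worth a single retry.
    if "timed out" in text or "could not reach any name server" in text:
        return "transient"
    # The server answered and rejected the update (e.g. REFUSED):
    # assumed here to need manual attention.
    if "update failed:" in text:
        return "permanent"
    return "unknown"


def run_nsupdate(commands: str, key_file: str) -> Tuple[int, str]:
    """Run nsupdate with a TSIG key file, capturing stdout and stderr."""
    proc = subprocess.run(
        ["nsupdate", "-k", key_file],
        input=commands,
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout + proc.stderr
```

For example, `classify_nsupdate_output("update failed: REFUSED")` yields `"permanent"`, while the timeout text quoted above yields `"transient"`.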

dave4445 commented 1 year ago

Duplicate of https://github.com/ndilieto/uacme/issues/45; will upgrade.