Closed by fabricionaweb 1 year ago
I tried reproducing this and at first I thought I could reproduce it, but the problem resolved itself while I was looking into it. I don't know exactly what is going on here, but my hypothesis is that it's due to negative DNS caching. Every DNS resolver may cache NXDOMAIN (the DNS equivalent of 404, not found) for a time and will answer successive queries with NXDOMAIN. IIUC, the time negative answers are cached is defined by the authority section in an NXDOMAIN response.
dig +all +multiline _acme.challenge.fabricio.dev TXT @ns1.desec.io
; <<>> DiG 9.10.6 <<>> +all +multiline _acme.challenge.fabricio.dev TXT @ns1.desec.io
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6794
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;_acme.challenge.fabricio.dev. IN TXT
;; AUTHORITY SECTION:
fabricio.dev. 300 IN SOA get.desec.io. get.desec.io. (
2023053981 ; serial
86400 ; refresh (1 day)
3600 ; retry (1 hour)
2419200 ; expire (4 weeks)
3600 ; minimum (1 hour)
)
;; Query time: 47 msec
;; SERVER: 2607:f740:e633:deec::2#53(2607:f740:e633:deec::2)
;; WHEN: Thu May 11 21:22:24 CEST 2023
;; MSG SIZE rcvd: 105
There are two numbers here that are of interest: 300 is the TTL of the SOA record and 3600 is its minimum TTL field. The smaller of the two is the negative caching time. That is, NXDOMAIN responses are cached for 5 minutes.
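To make the arithmetic concrete, here is a minimal sketch of that rule. The function name is made up for illustration; the two numbers come from the SOA record in the dig output above.

```python
def negative_cache_ttl(soa_ttl: int, soa_minimum: int) -> int:
    """Return how long (in seconds) a resolver may cache a negative answer.

    The negative caching time is the smaller of the SOA record's own TTL
    and the SOA "minimum" field.
    """
    return min(soa_ttl, soa_minimum)


# Values taken from the SOA record for fabricio.dev shown above.
ttl = negative_cache_ttl(soa_ttl=300, soa_minimum=3600)
print(f"negative answers may be cached for {ttl} s ({ttl // 60} min)")
```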
If the update via the deSEC API takes longer to propagate to DNS and validation starts before it has propagated, it's possible that an NXDOMAIN response is generated that delays everything by 5 minutes due to negative DNS caching.
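The race I am speculating about can be sketched with a toy model. Everything here is hypothetical (the CachingResolver class, the polling interval, the timings); it only illustrates how one early query can hide a record for the whole negative caching time, and it is not how any particular resolver is implemented.

```python
from typing import Optional


class CachingResolver:
    """Toy resolver that remembers "not found" answers for a fixed time."""

    def __init__(self, negative_ttl: int) -> None:
        self.negative_ttl = negative_ttl
        # Until this time (seconds), "not found" is answered from the cache.
        self.negative_until: Optional[int] = None

    def lookup(self, now: int, record_published_at: int) -> bool:
        """Return True if the TXT record is visible at time `now`."""
        if self.negative_until is not None and now < self.negative_until:
            return False  # cached negative answer; the authority is not asked
        if now >= record_published_at:
            return True
        # The record is not there yet: remember the negative answer.
        self.negative_until = now + self.negative_ttl
        return False


def first_visible(first_query_at: int, record_published_at: int,
                  negative_ttl: int = 300, poll_every: int = 10,
                  limit: int = 3600) -> Optional[int]:
    """Poll the toy resolver and return the time the record first shows up."""
    resolver = CachingResolver(negative_ttl)
    now = first_query_at
    while now <= limit:
        if resolver.lookup(now, record_published_at):
            return now
        now += poll_every
    return None


# The record appears 20 s after the API call. Asking too early costs 5 minutes.
print(first_visible(first_query_at=0, record_published_at=20))   # asked too early
print(first_visible(first_query_at=30, record_published_at=20))  # asked after publish
```

In this model, the only difference between the two runs is whether the first query happens before or after the record is published.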
That's all just speculation, though. I suggest letting it run for a bit longer to see if it resolves itself after a couple of minutes. Maybe that works for you too.
I have to say, I'm really sorry, but I could not understand much! hahahahah 🙈
But OK, yes, I get that I need to let it run for longer, let's see.
Many thanks!
Sorry for not explaining this well. What I wanted to say is that this is probably a problem somewhere in the interaction between the various systems involved in the ACME protocol. The part that I implemented and that caddy-dns/desec is responsible for is creating the record for desec.io. IIUC, that part works.
Problems in the interaction between systems are notoriously difficult to debug. Let me know if letting it run for longer helps. If it does, caching is the likely cause of what you are seeing. If it does not, I can't really help until I can reproduce the issue.
There are two more things that may help. You can specify a propagation_delay and a propagation_timeout in the Caddyfile. I suggest trying the following:
tls {
    dns desec {
        token {$DESEC_TOKEN}
    }
    propagation_delay 1m
    propagation_timeout 5m
}
This will make it less likely to run into the timeout you were seeing before.
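Roughly, the idea behind the two settings is: wait for the delay first, then keep checking until the timeout runs out. The sketch below illustrates that idea only. It is not Caddy's actual code; wait_for_propagation, FakeClock, and the 10 second polling interval are names and values I made up, and it assumes the timeout starts counting after the delay.

```python
import time
from typing import Callable


def wait_for_propagation(record_visible: Callable[[], bool],
                         delay: float, timeout: float,
                         interval: float = 10.0,
                         sleep: Callable[[float], None] = time.sleep,
                         clock: Callable[[], float] = time.monotonic) -> bool:
    """Wait `delay` seconds, then poll until the record shows up or `timeout` passes."""
    sleep(delay)                      # give the DNS update a head start
    deadline = clock() + timeout      # assumption: timeout counts from here
    while clock() < deadline:
        if record_visible():
            return True
        sleep(interval)
    return False


class FakeClock:
    """Simulated clock so the example runs instantly instead of really sleeping."""

    def __init__(self) -> None:
        self.now = 0.0

    def sleep(self, seconds: float) -> None:
        self.now += seconds

    def time(self) -> float:
        return self.now


# Hypothetical scenario: the record becomes visible 200 s after the update.
fake = FakeClock()
ok = wait_for_propagation(lambda: fake.time() >= 200,
                          delay=60, timeout=300,
                          sleep=fake.sleep, clock=fake.time)
print(ok, fake.time())
```

With a 1m delay the first check is not made until well after the update was sent, which leaves less room for an early lookup to get a negative answer cached.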
I have applied the propagation values as suggested and left it running for hours. It was not working, but I still thought it could be caching, there or here.
And... It is working now! 🎉 I think it took 3h or so, maybe I've done many requests or something...
I really appreciate the work and the time you have spent with this little issue. Really thank you very much! It is working.
I am glad it's finally working. It definitely should not take this long though. If you are running into the issue again, I would probably try increasing the propagation timeout to 15 minutes. Maybe also increase the propagation delay.
I was using deprecated-lego with the same settings. Trying to migrate now.
My Caddyfile is:
I'm running with an envfile, but I also tried plain strings, so it's not related. The _acme-challenge TXT record is being updated, but something seems to block the read. I guess it is something related to the wildcard... Without the wildcard I could generate a cert on the 2nd attempt, when Caddy tries using ZeroSSL. But I need the wildcard for my subdomains...
The full logs with debug enable: