JessThrysoee / synology-letsencrypt


unable to update / fetch certificates #19

Closed thefl0yd closed 3 days ago

thefl0yd commented 3 weeks ago

Woke up this morning to find my Synology had failed its renewals and my LE cert had expired. While troubleshooting, it seems something broke, but I don't know what. I use RFC2136, and my other RFC2136 domains work just fine (e.g., my pfSense firewall).

2024/06/15 07:34:17 Could not obtain certificates: error: one or more domains had a problem: [xxxx.yyyy.zzz] propagation: time limit exceeded: last error: NS ns2.yyyy.zzz. returned REFUSED for _acme-challenge.xxxx.yyyy.zzz.

While running tcpdump on the nameservers I can see this just sitting in a loop querying for, and receiving the response to, the challenge:

11:33:05.821212 IP 1.2.3.4.47173 > 5.6.7.8.53: 56917 [1au] TXT? _acme-challenge.xxxx.yyyy.zzz. (61)
11:33:05.821429 IP 5.6.7.8.53 > 1.2.3.4.47173: 56917- 1/2/5 TXT "KvYUeOk6pqjyvrUyv05C_FW2fXhi8o3oDlgthQEm64w" (253)
11:33:09.917026 IP 1.2.3.4.52033 > 5.6.7.8.53: 55189 [1au] TXT? _acme-challenge.xxxx.yyyy.zzz. (61)
11:33:09.917238 IP 5.6.7.8.53 > 1.2.3.4.52033: 55189- 1/2/5 TXT "KvYUeOk6pqjyvrUyv05C_FW2fXhi8o3oDlgthQEm64w" (253)

It just keeps querying and getting the (correct) response until it times out, then reports that the nameserver returned REFUSED and deletes the records.

Not really sure how to debug further.

thefl0yd commented 3 weeks ago

FYI suspect it's something wrong with lego here because I switched to acme.sh on my synology NAS devices and everything works without changing any other config.

JessThrysoee commented 3 weeks ago

A REFUSED on the DNS update suggests a TSIG or DNS server policy problem, perhaps? Can you update your zone manually, e.g. with nsupdate, from the Synology using the credentials you provide to lego?

The _acme-challenge you see in the dump could be an old one left from previous attempts?
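The manual nsupdate test suggested above could be sketched roughly as follows. This is only an illustration: the server address, the key file path and the helper names `build_nsupdate_batch` and `run_nsupdate` are assumptions, not anything from lego or this package, and it assumes an `nsupdate` binary is available on the machine you run it from.

```python
import subprocess


def build_nsupdate_batch(server, zone, record, value, ttl=120):
    """Build the command batch that nsupdate reads from stdin.

    Replaces any existing TXT rrset at `record` with a single test value.
    """
    lines = [
        f"server {server}",
        f"zone {zone}",
        f"update delete {record} TXT",
        f'update add {record} {ttl} TXT "{value}"',
        "send",
    ]
    return "\n".join(lines) + "\n"


def run_nsupdate(batch, keyfile):
    """Pipe the batch into `nsupdate -k <keyfile>` and return the result.

    `keyfile` is a hypothetical path to the TSIG key used for lego.
    """
    return subprocess.run(
        ["nsupdate", "-k", keyfile],
        input=batch,
        text=True,
        capture_output=True,
        check=False,
    )
```

For example, `run_nsupdate(build_nsupdate_batch("192.0.2.1", "myzone.com", "_acme-challenge.nas.myzone.com", "manual-test"), "/path/to/tsig.key")` would attempt the update, and a non-zero `returncode` together with `stderr` would show whether the server refuses it. The address and path here are placeholders.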

thefl0yd commented 3 weeks ago

No, the zone updates (and deletes) just fine. There are no stale TXT records from before, nor are any stale TXT records left behind:

15-Jun-2024 07:02:23.259 update: info: client @0x812dae960 192.168.99.30#57073/key synology-le: updating zone 'myzone.com/IN': adding an RR at '_acme-challenge.nas.myzone.com' TXT "-FXoAUhQziPErilq044e_OrhB4pQp9q_9U8lRfuHp_Q"
15-Jun-2024 07:03:26.833 update: info: client @0x80e555b60 192.168.99.30#34662/key synology-le: updating zone 'myzone.com/IN': deleting an RR at _acme-challenge.nas.myzone.com TXT
15-Jun-2024 07:07:06.607 update: info: client @0x80f065b60 192.168.99.30#37766/key synology-le: updating zone 'myzone.com/IN': deleting rrset at '_acme-challenge.fs2017.myzone.com' TXT
15-Jun-2024 07:07:06.608 update: info: client @0x80f065b60 192.168.99.30#37766/key synology-le: updating zone 'myzone.com/IN': adding an RR at '_acme-challenge.fs2017.myzone.com' TXT "XdNI9WEQskkQ9TTLrmG8mjg5EG5GyC0uxvoyZ9AtNYg"
15-Jun-2024 07:08:09.679 update: info: client @0x80ee9f360 192.168.99.30#52207/key synology-le: updating zone 'myzone.com/IN': deleting an RR at _acme-challenge.fs2017.myzone.com TXT
15-Jun-2024 07:08:09.687 update: info: client @0x80488d760 192.168.99.30#50284/key synology-le: updating zone 'myzone.com/IN': deleting rrset at '_acme-challenge.nas.myzone.com' TXT
15-Jun-2024 07:08:09.687 update: info: client @0x80488d760 192.168.99.30#50284/key synology-le: updating zone 'myzone.com/IN': adding an RR at '_acme-challenge.nas.myzone.com' TXT "hep9VvK3hPhXMvfFd2QIL-MYrc5qlIHY-UdMCc3x3D0"
15-Jun-2024 07:09:12.072 update: info: client @0x812dae960 192.168.99.30#55741/key synology-le: updating zone 'myzone.com/IN': deleting an RR at _acme-challenge.nas.myzone.com TXT
15-Jun-2024 07:10:29.270 update: info: client @0x80ee9f360 192.168.99.30#50887/key synology-le: updating zone 'myzone.com/IN': deleting rrset at '_acme-challenge.fs2017.myzone.com' TXT

Further, I watched the transfers out to my slave nameserver and the tcpdump traffic on the slave. lego seems to just query in a loop until it times out, incorrectly reporting a 'REFUSED' condition.
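One way to check the REFUSED report independently of lego is to send the same TXT question to each authoritative nameserver and look at the response code that comes back. The following is a minimal standard-library sketch, not lego's actual propagation check: the function names are made up for illustration, and unlike the queries in the tcpdump above it sends a plain query without EDNS.

```python
import random
import socket
import struct

# Common DNS response codes (RFC 1035); REFUSED is 5.
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}
TYPE_TXT = 16
CLASS_IN = 1


def build_txt_query(name, query_id=None):
    """Build a minimal non-recursive DNS query for the TXT record of `name`."""
    if query_id is None:
        query_id = random.randint(0, 0xFFFF)
    # Header: id, flags (0 = standard query, RD clear), 1 question, no records.
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", TYPE_TXT, CLASS_IN)


def rcode_of(response):
    """Return the RCODE of a raw DNS response as text."""
    if len(response) < 12:
        raise ValueError("response shorter than a DNS header")
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F  # RCODE is the low 4 bits of the flags word
    return RCODES.get(code, f"RCODE{code}")


def probe(nameserver_ip, name, timeout=3.0):
    """Send the TXT query over UDP to one nameserver and return its RCODE."""
    query = build_txt_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(query, (nameserver_ip, 53))
        response, _ = sock.recvfrom(4096)
    return rcode_of(response)
```

Calling `probe(ip, "_acme-challenge.xxxx.yyyy.zzz")` once for each nameserver listed in the zone's NS records, from the NAS itself, would show whether any of them answers REFUSED to that host while the others answer NOERROR.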

JessThrysoee commented 3 weeks ago

Do any of these fixed lego issues give you any ideas? https://github.com/go-acme/lego/issues?q=REFUSED+RFC2136

Can you reproduce the problem directly with lego? https://go-acme.github.io/lego/dns/rfc2136/

thefl0yd commented 3 weeks ago

Same exact problem with lego directly.

This must have broken in a Synology update, because I used this package to get the original certs for the NAS. I missed that they were expiring because I'd assumed the warnings were for the certs on the old unit I decom'd.

JessThrysoee commented 3 weeks ago

I would suggest you create a lego issue at https://github.com/go-acme/lego/issues with as much (anonymized) info as possible.

JessThrysoee commented 3 days ago

I'll close this as a lego issue.