opnsense / plugins

OPNsense plugin collection
https://opnsense.org/
BSD 2-Clause "Simplified" License

security/acme-client: after 24.1.4 update >> Cloudflare - validation failed #3871

Closed: opnsenseuser closed this issue 5 months ago

opnsenseuser commented 5 months ago

Describe the bug: After the latest update to OPNsense 24.1.4, I get a "validation failed" error. I changed nothing.

To reproduce:

  1. Go to Services
  2. Click on ACME Client >> Certificates
  3. Switch to Certificates
  4. The Last ACME Status column shows "validation failed"

Expected behavior: Validation succeeds.

Relevant log files:

ACME Log:

2024-03-26T16:57:16 | acme.sh | [Tue Mar 26 16:57:16 CET 2024] Please refer to https://curl.haxx.se/libcurl/c/libcurl-errors.html for error code: 2
2024-03-26T16:57:16 | acme.sh | [Tue Mar 26 16:57:16 CET 2024] See: https://github.com/acmesh-official/acme.sh/wiki/How-to-debug-acme.sh

System Log:

2024-03-26T16:57:20 | config | AcmeClient: validation for certificate failed: mydomain.net
2024-03-26T16:57:20 | config | AcmeClient: domain validation failed (dns01)
2024-03-26T16:57:20 | config | /usr/local/opnsense/scripts/OPNsense/AcmeClient/lecert.php: AcmeClient:  The shell command returned exit code '1': '/usr/local/sbin/acme.sh  --issue --syslog 6 --log-level 1 --server 'letsencrypt' --dns 'dns_cf'  --dnssleep '120' --home '/var/etc/acme-client/home' --cert-home  '/var/etc/acme-client/cert

Additional context: After the latest update to OPNsense 24.1.4, I get a "validation failed" error. Before the update it worked without any problems. I tried uninstalling and reinstalling the ACME client, revoking the certificate, and resetting it, but nothing helps. See the forum thread: https://forum.opnsense.org/index.php?topic=39669.0

Environment: OPNsense 24.1.4 (amd64). CPU: Intel Atom Processor C3558. Network: Intel® C3000 SoC, 1GbE

fraenki commented 5 months ago

Please go to Services: ACME Client: Settings and set Log Level to debug 3, then click the Apply button. Afterwards try again, the ACME Log should contain much more details about the failure.

(FWIW, 24.1.4 did not include any changes for os-acme-client or acme.sh.)

opnsenseuser commented 5 months ago

ACME Log: (Debug 3)

    #define WITH_SOCKS4 1
    #undef WITH_POSIXMQ
    #define WITH_LISTEN 1
    #define WITH_UDPLITE 1
    #define WITH_DCCP 1
    #define WITH_SCTP 1
    #define WITH_UDP 1
    #define WITH_TCP 1
    #undef WITH_INTERFACE
    #define WITH_DEFAULT_IPV 4
    #define WITH_MSGLEVEL 0 /*debug*/
    #define WITH_RETRY 1
    #define WITH_FILAN 1
    #define WITH_SYCLS 1
    #define WITH_LIBWRAP 1
    #undef WITH_FIPS
    #define WITH_OPENSSL 1
    #define WITH_PTY 1
    #undef WITH_TUN
    #undef WITH_READLINE
    #define WITH_EXEC 1
    #define WITH_SHELL 1
    #define WITH_SYSTEM 1
    #define WITH_PROXY 1
    #undef WITH_NAMESPACES
    #undef WITH_VSOCK
    #define WITH_SOCKS5 1
    #define WITH_SOCKS4A 1
    #define WITH_SOCKS4 1
    #define WITH_GENERICSOCKET 1
    #define WITH_RAWIP 1
    #define WITH_IP6 1
    #define WITH_IP4 1
    #undef WITH_ABSTRACT_UNIXSOCKET
    #define WITH_UNIX 1
    #define WITH_SOCKETPAIR 1
    #define WITH_PIPE 1
    #define WITH_TERMIOS 1
    #define WITH_GOPEN 1
    #define WITH_CREAT 1
    #define WITH_FILE 1
    #define WITH_FDNUM 1
    #define WITH_STDIO 1
    #define WITH_STATS 1
    #define WITH_HELP 1
    features:
    running on FreeBSD version FreeBSD 13.2-RELEASE-p10 stable/24.1-n254984-f7b006edfa8 SMP, release 13.2-RELEASE-p10, machine amd64
    socat version 1.8.0.0 on Mar 20 2024 01:07:35
    socat by Gerhard Rieger and contributors - see www.dest-unreach.org
    socat:
    nginx doesn't exist.
    nginx:
    apache doesn't exist.
    apache:
    OpenSSL 1.1.1t-freebsd 7 Feb 2023
    openssl:openssl

    2024-03-28T04:48:29 acme.sh [Thu Mar 28 04:48:29 CET 2024] Diagnosis versions:
    2024-03-28T04:48:29 acme.sh [Thu Mar 28 04:48:29 CET 2024] code='200'
    2024-03-28T04:48:29 acme.sh [Thu Mar 28 04:48:29 CET 2024] _ret='0'
    2024-03-28T04:48:28 acme.sh [Thu Mar 28 04:48:28 CET 2024] _CURL='curl --silent --dump-header /var/etc/acme-client/home/http.header -L --trace-ascii /tmp/tmp.zdIakwY4 -g '

Staticznld commented 5 months ago

I am experiencing the same problem. Looking further into the logs, I see this entry:

Invalid status, *.domain.com:Verify error detail:DNS problem: NXDOMAIN looking up TXT for _acme-challenge.domain.com - check that a DNS record exists for this dom

I am very sure the records are in place. In January this year the certificates were issued without any issues!

When I try the same domains against the Let's Encrypt staging server, it works without any problems!

fraenki commented 5 months ago

@opnsenseuser, unfortunately, that's mostly the curl error (which is not helpful). Could you please share the full ACME Log? Feel free to cloak private information.

Maybe something in the Cloudflare API has changed, if multiple users are affected... but that's just a wild guess.

opnsenseuser commented 5 months ago

@fraenki I attached the acme.log.

acme.full_log.txt

fraenki commented 5 months ago

@opnsenseuser, thanks for the log. It looks like the DNS TXT records are added and removed without any errors. However, there is an error when the CA checks the TXT records:

[Thu Mar 28 04:48:11 CET 2024] Invalid status, router.mydomain.net:Verify error detail:No TXT record found at _acme-challenge.router.mydomain.net

For some unknown reason, the Let's Encrypt CA was unable to find the required TXT record for your subdomain router.mydomain.net. Please manually check the DNS records for this subdomain. Maybe there is something wrong, or old ACME TXT records are still lingering. If so, please remove all of them.
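
One way to do this manual check from any machine is sketched below. It is only an illustration, not part of os-acme-client or acme.sh: the function names (`challenge_name`, `parse_txt_values`, `lookup_txt`) are hypothetical, and the use of Cloudflare's public DNS-over-HTTPS JSON endpoint as the resolver is an assumption that does not come from this thread. A public resolver may also see something different from what the Let's Encrypt validators see, so treat the result as a hint.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Cloudflare's public DNS-over-HTTPS JSON endpoint is the resolver.
DOH_URL = "https://cloudflare-dns.com/dns-query"


def challenge_name(domain: str) -> str:
    """Return the DNS-01 challenge name for a (possibly wildcard) domain."""
    # A wildcard such as *.domain.com is validated at _acme-challenge.domain.com.
    base = domain[2:] if domain.startswith("*.") else domain
    return "_acme-challenge." + base.rstrip(".")


def parse_txt_values(payload: dict) -> list[str]:
    """Extract TXT strings from a DoH JSON response; empty list if none."""
    values = []
    for answer in payload.get("Answer", []):
        if answer.get("type") == 16:  # 16 is the TXT record type
            values.append(answer["data"].strip('"'))
    return values


def lookup_txt(name: str, timeout: float = 10.0) -> list[str]:
    """Query the resolver for TXT records at `name` (needs network access)."""
    query = urllib.parse.urlencode({"name": name, "type": "TXT"})
    request = urllib.request.Request(
        f"{DOH_URL}?{query}", headers={"Accept": "application/dns-json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_txt_values(json.load(response))
```

Calling `lookup_txt(challenge_name("router.mydomain.net"))` while no validation is running should normally return an empty list. Any values that show up at that point are candidates for the lingering records mentioned above.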

You could also try to set the DNS sleep time to a higher value (it is currently set to 120 seconds) – or set it to 0 to let acme.sh determine when the DNS records are available.
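
The difference between a fixed sleep and waiting until the record is actually visible can be sketched as a small polling loop. This is not how acme.sh implements it; `wait_for_record` and its parameters are hypothetical names, the default values are illustrative, and the visibility check is passed in as a callable so that any lookup method can be plugged in.

```python
import time
from typing import Callable


def wait_for_record(
    is_visible: Callable[[], bool],
    timeout: float = 900.0,
    interval: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `is_visible` until it returns True or `timeout` seconds pass."""
    deadline = clock() + timeout
    while True:
        if is_visible():
            return True  # record is visible, validation can start now
        if clock() >= deadline:
            return False  # gave up; the record never became visible
        sleep(interval)
```

A fixed sleep of 120 seconds either wastes time when the record appears quickly or fails when propagation takes longer. Polling returns as soon as the check succeeds and only gives up at the timeout.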

opnsenseuser commented 5 months ago

@fraenki Thanks for your support, but I can't find a solution. The ACME log says:

[Wed Apr 3 06:57:39 CEST 2024] Please refer to https://curl.haxx.se/libcurl/c/libcurl-errors.html for error code: 2

The system log still says:

2024-04-03T06:57:40 opnsense AcmeClient: validation for certificate failed: mydomain.net
2024-04-03T06:57:40 opnsense AcmeClient: domain validation failed (dns01)

As I said, I didn't change anything. I only updated from 24.1.3 to 24.1.4.

fraenki commented 5 months ago

Did you change the DNS sleep time as I suggested? Try setting it to 0 (the new default value in os-acme-client) or a much higher value like 900.

> The ACME log says: [Wed Apr 3 06:57:39 CEST 2024] Please refer to https://curl.haxx.se/libcurl/c/libcurl-errors.html for error code: 2

This is not the real issue. It's just a random curl error, but the root cause was the missing TXT record.
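
To find the CA's verification errors among the curl noise, the log can be filtered for the "Invalid status" lines. The snippet below is a minimal sketch: the pattern is based only on the two such lines quoted in this thread, and `find_verify_errors` is a hypothetical helper, not something shipped with acme.sh or the plugin.

```python
import re

# Pattern based on the "Invalid status" lines quoted in this thread.
VERIFY_ERROR = re.compile(
    r"Invalid status, (?P<domain>[^:]+):Verify error detail:\s*(?P<detail>.*)"
)


def find_verify_errors(log_lines):
    """Return (domain, detail) pairs for every CA verification error."""
    errors = []
    for line in log_lines:
        match = VERIFY_ERROR.search(line)
        if match:
            errors.append((match.group("domain"), match.group("detail").strip()))
    return errors
```

Feeding it the lines of an exported ACME log (for example the attached acme.full_log.txt) yields the affected domain and the CA's reason. Lines such as the libcurl "error code: 2" hint are skipped.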

> As I said, I didn't change anything. I only updated from 24.1.3 to 24.1.4.

Neither did I. :smile: There was no change in os-acme-client in this version. That's why I suggested that a change at Cloudflare is probably the root cause. Maybe Cloudflare is processing DNS updates much slower now. It could be that simple.

opnsenseuser commented 5 months ago

@fraenki Yes, I changed it to 0, but no luck. I will try 900 too. Thanks for your support. I will contact Cloudflare.

opnsenseuser commented 5 months ago

OK, I figured out what the problem was. I had to manually create a TXT entry on Cloudflare for _acme-challenge.subdomain. Now it works as before. Thanks, @fraenki.