go-acme / lego

Let's Encrypt/ACME client and library written in Go
https://go-acme.github.io/lego/
MIT License

Can't get single certificate for both `DOMAIN.com` and `*.DOMAIN.com` #2068

Closed by Azq2 7 months ago

Azq2 commented 7 months ago

Welcome

What did you expect to see?

Single certificate with both DOMAIN.com and *.DOMAIN.com

What did you see instead?

2023/12/08 13:40:09 Could not obtain certificates:
    error: one or more domains had a problem:
[DOMAIN.com] acme: error: 403 :: urn:ietf:params:acme:error:unauthorized :: During secondary validation: Incorrect TXT record "....................removed....................." found at _acme-challenge.global-repair-management.com

How do you use lego?

Binary

Reproduction steps

CLOUDFLARE_API_KEY=.... CLOUDFLARE_EMAIL='my@email' lego --domains 'DOMAIN.COM,*.DOMAIN.COM' --accept-tos --email 'my@email' --dns cloudflare --server 'https://acme-staging-v02.api.letsencrypt.org/directory' run

Version of lego

lego version 4.14.2 linux/386

Logs

```console
2023/12/08 13:39:40 [INFO] [DOMAIN.com, *.DOMAIN.com] acme: Obtaining bundled SAN certificate
2023/12/08 13:39:41 [INFO] [*.DOMAIN.com] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/992829....
2023/12/08 13:39:41 [INFO] [DOMAIN.com] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/9928292....
2023/12/08 13:39:41 [INFO] [*.DOMAIN.com] acme: use dns-01 solver
2023/12/08 13:39:41 [INFO] [DOMAIN.com] acme: Could not find solver for: tls-alpn-01
2023/12/08 13:39:41 [INFO] [DOMAIN.com] acme: Could not find solver for: http-01
2023/12/08 13:39:41 [INFO] [DOMAIN.com] acme: use dns-01 solver
2023/12/08 13:39:41 [INFO] [*.DOMAIN.com] acme: Preparing to solve DNS-01
2023/12/08 13:39:43 [INFO] cloudflare: new record for DOMAIN.com, ID 10465c2f68d22366681ddc837e7d....
2023/12/08 13:39:43 [INFO] [DOMAIN.com] acme: Preparing to solve DNS-01
2023/12/08 13:39:44 [INFO] cloudflare: new record for DOMAIN.com, ID fb5e065f065a367bd10c4a7f4cb1....
2023/12/08 13:39:44 [INFO] [*.DOMAIN.com] acme: Trying to solve DNS-01
2023/12/08 13:39:44 [INFO] [*.DOMAIN.com] acme: Checking DNS record propagation using [127.0.0.53:53]
2023/12/08 13:39:46 [INFO] Wait for propagation [timeout: 2m0s, interval: 2s]
2023/12/08 13:39:52 [INFO] [*.DOMAIN.com] The server validated our request
2023/12/08 13:39:52 [INFO] [DOMAIN.com] acme: Trying to solve DNS-01
2023/12/08 13:39:52 [INFO] [DOMAIN.com] acme: Checking DNS record propagation using [127.0.0.53:53]
2023/12/08 13:39:54 [INFO] Wait for propagation [timeout: 2m0s, interval: 2s]
2023/12/08 13:39:54 [INFO] [DOMAIN.com] acme: Waiting for DNS record propagation.
2023/12/08 13:39:56 [INFO] [DOMAIN.com] acme: Waiting for DNS record propagation.
2023/12/08 13:39:58 [INFO] [DOMAIN.com] acme: Waiting for DNS record propagation.
2023/12/08 13:40:00 [INFO] [DOMAIN.com] acme: Waiting for DNS record propagation.
2023/12/08 13:40:02 [INFO] [DOMAIN.com] acme: Waiting for DNS record propagation.
2023/12/08 13:40:05 [INFO] [*.DOMAIN.com] acme: Cleaning DNS-01 challenge
2023/12/08 13:40:07 [INFO] [DOMAIN.com] acme: Cleaning DNS-01 challenge
2023/12/08 13:40:09 [INFO] Skipping deactivating of valid auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/9928292....
2023/12/08 13:40:09 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/9928292....
2023/12/08 13:40:09 Could not obtain certificates:
    error: one or more domains had a problem:
[DOMAIN.com] acme: error: 403 :: urn:ietf:params:acme:error:unauthorized :: During secondary validation: Incorrect TXT record "Bg53EFRT7ZYcLZ_M...." found at _acme-challenge.DOMAIN.com
```

Go environment (if applicable)

ldez commented 7 months ago

Hello,

it feels like a propagation issue: clean your TXT records and try again.

Azq2 commented 7 months ago

> Hello,
>
> it feels like a propagation issue: clean your TXT records and try again.

I can reproduce this issue even with clean records.

rm -rf ~/.lego # This is important
CLOUDFLARE_API_KEY=.... CLOUDFLARE_EMAIL='my@email' lego --domains 'DOMAIN.COM,*.DOMAIN.COM' --accept-tos --email 'my@email' --dns cloudflare --server 'https://acme-staging-v02.api.letsencrypt.org/directory' run

Important condition: there must be no cached challenge for DOMAIN.com (just rm -rf ~/.lego). Otherwise, everything works.

I think the main issue is that LEGO adds two records at the same time: one for DOMAIN.com and a second for *.DOMAIN.com. But both of these records are added to DOMAIN.com.

ldez commented 7 months ago

> I think the main issue is that LEGO adds two records at the same time: one for DOMAIN.com and a second for *.DOMAIN.com

This is expected: having 2 TXT records for the same domain is not a problem. The problem is the propagation of your TXT records to other DNS servers (the 2 records are not available when LE does the validation).
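Both records land on the same name because, under RFC 8555, the DNS-01 TXT record for a wildcard is published under its base domain. A minimal Go sketch of that derivation follows. `challengeRecord` is a hypothetical helper written for illustration, not lego's API, and the key authorization strings are placeholders rather than real values.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"strings"
)

// challengeRecord is a hypothetical helper (not part of lego). It returns the
// TXT record name and value for a DNS-01 challenge as described in RFC 8555,
// section 8.4. keyAuth is the key authorization: token + "." + key thumbprint.
func challengeRecord(domain, keyAuth string) (name, value string) {
	// A wildcard identifier is validated against its base domain.
	base := strings.TrimPrefix(domain, "*.")
	name = "_acme-challenge." + strings.TrimSuffix(base, ".") + "."

	// The TXT value is the unpadded base64url SHA-256 digest of keyAuth.
	digest := sha256.Sum256([]byte(keyAuth))
	value = base64.RawURLEncoding.EncodeToString(digest[:])
	return name, value
}

func main() {
	// Placeholder key authorizations; real ones come from the ACME server.
	pairs := [][2]string{
		{"example.com", "token-one.thumbprint"},
		{"*.example.com", "token-two.thumbprint"},
	}
	for _, p := range pairs {
		name, value := challengeRecord(p[0], p[1])
		fmt.Printf("%-15s -> %s TXT %q\n", p[0], name, value)
	}
}
```

The two identifiers yield the same record name but different values, which is why two TXT records appear under one name during a combined request.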

Azq2 commented 7 months ago

> > I think the main issue is that LEGO adds two records at the same time: one for DOMAIN.com and a second for *.DOMAIN.com
>
> This is expected: having 2 TXT records for the same domain is not a problem. The problem is the propagation of your TXT records to other DNS servers (the 2 records are not available when LE does the validation).

But anyway, this is a bug in LEGO. I don't see any problem with this case on acme.sh or certbot.

Maybe, to fix this issue, LEGO should work like this:

  1. Get challenge for DOMAIN.com and set to the DNS
  2. Validate challenge for DOMAIN.com and clean DNS records
  3. Get challenge for *.DOMAIN.com and set to the DNS
  4. Validate challenge for *.DOMAIN.com and clean DNS records
  5. Request certificate

The current behavior is:

  1. Get challenge for DOMAIN.com and set to the DNS
  2. Get challenge for *.DOMAIN.com and set to DNS
  3. Validate challenge for DOMAIN.com and clean DNS records
  4. Validate challenge for *.DOMAIN.com and clean DNS records
  5. Request certificate

But it fails at step 3 or 4.

Azq2 commented 7 months ago

> The problem is the propagation of your TXT records to other DNS servers (the 2 records are not available when LE does the validation).

Hmm, yes, it seems to work with --dns-timeout 120. Thanks.

Azq2 commented 7 months ago

Oh no, still not working :(

$ rm -rf ~/.lego
$ CLOUDFLARE_API_KEY=secret CLOUDFLARE_EMAIL='my@email' lego --domains 'DOMAIN.com,*.DOMAIN.com' --accept-tos --email 'my@email' --dns cloudflare --server 'https://acme-staging-v02.api.letsencrypt.org/directory' --dns-timeout 300 run
2023/12/08 15:36:25 No key found for account my@email. Generating a P256 key.
2023/12/08 15:36:25 Saved key to /home/USER/.acme/accounts/acme-staging-v02.api.letsencrypt.org/my@email/keys/my@email.key
2023/12/08 15:36:26 [INFO] acme: Registering account for my@email
!!!! HEADS UP !!!!

Your account credentials have been saved in your Let's Encrypt
configuration directory at "/home/USER/.acme/accounts".

You should make a secure backup of this folder now. This
configuration directory will also contain certificates and
private keys obtained from Let's Encrypt so making regular
backups of this folder is ideal.
2023/12/08 15:36:26 [INFO] [DOMAIN.com, *.DOMAIN.com] acme: Obtaining bundled SAN certificate
2023/12/08 15:36:28 [INFO] [*.DOMAIN.com] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/............
2023/12/08 15:36:28 [INFO] [DOMAIN.com] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/............
2023/12/08 15:36:28 [INFO] [*.DOMAIN.com] acme: use dns-01 solver
2023/12/08 15:36:28 [INFO] [DOMAIN.com] acme: Could not find solver for: tls-alpn-01
2023/12/08 15:36:28 [INFO] [DOMAIN.com] acme: Could not find solver for: http-01
2023/12/08 15:36:28 [INFO] [DOMAIN.com] acme: use dns-01 solver
2023/12/08 15:36:28 [INFO] [*.DOMAIN.com] acme: Preparing to solve DNS-01
2023/12/08 15:36:30 [INFO] cloudflare: new record for DOMAIN.com, ID 93d6510283268....................
2023/12/08 15:36:30 [INFO] [DOMAIN.com] acme: Preparing to solve DNS-01
2023/12/08 15:36:31 [INFO] cloudflare: new record for DOMAIN.com, ID 8c333c92afff2....................
2023/12/08 15:36:31 [INFO] [*.DOMAIN.com] acme: Trying to solve DNS-01
2023/12/08 15:36:31 [INFO] [*.DOMAIN.com] acme: Checking DNS record propagation using [127.0.0.53:53]
2023/12/08 15:36:33 [INFO] Wait for propagation [timeout: 2m0s, interval: 2s]
2023/12/08 15:36:37 [INFO] [*.DOMAIN.com] The server validated our request
2023/12/08 15:36:37 [INFO] [DOMAIN.com] acme: Trying to solve DNS-01
2023/12/08 15:36:37 [INFO] [DOMAIN.com] acme: Checking DNS record propagation using [127.0.0.53:53]
2023/12/08 15:36:39 [INFO] Wait for propagation [timeout: 2m0s, interval: 2s]
2023/12/08 15:36:39 [INFO] [DOMAIN.com] acme: Waiting for DNS record propagation.
2023/12/08 15:36:41 [INFO] [DOMAIN.com] acme: Waiting for DNS record propagation.
2023/12/08 15:36:44 [INFO] [DOMAIN.com] acme: Waiting for DNS record propagation.
2023/12/08 15:36:46 [INFO] [DOMAIN.com] acme: Waiting for DNS record propagation.
2023/12/08 15:36:48 [INFO] [DOMAIN.com] acme: Waiting for DNS record propagation.
2023/12/08 15:36:50 [INFO] [DOMAIN.com] acme: Waiting for DNS record propagation.
2023/12/08 15:36:53 [INFO] [*.DOMAIN.com] acme: Cleaning DNS-01 challenge
2023/12/08 15:36:53 [INFO] [DOMAIN.com] acme: Cleaning DNS-01 challenge
2023/12/08 15:36:55 [INFO] Skipping deactivating of valid auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/............
2023/12/08 15:36:55 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/............
2023/12/08 15:36:55 Could not obtain certificates:
    error: one or more domains had a problem:
[DOMAIN.com] acme: error: 403 :: urn:ietf:params:acme:error:unauthorized :: Incorrect TXT record "0QLcNVIyiETrzTkK....." found at _acme-challenge.DOMAIN.com

ldez commented 7 months ago

Having multiple TXT records is not a problem; this is why lego handles the DNS challenge for Cloudflare in "parallel" rather than sequentially.

This provider has been widely used, for a long time, without any problems with wildcards. I use it myself.

The problem is a propagation issue. I don't know the cause yet and will try to find more information, but it may be related to your zone or your geographical region.
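The difference between "several TXT records under one name" and "a resolver that has not seen the newest record yet" can be shown with a toy model. The Go sketch below is purely illustrative: `zone`, `present`, `cleanup`, and `validate` are hypothetical names, and it does not reproduce lego's or Let's Encrypt's actual code. It assumes the validator accepts a challenge when the expected value is among the TXT values it can see.

```go
package main

import "fmt"

// zone is a toy in-memory view of DNS: record name -> TXT values.
type zone map[string][]string

// present adds a TXT value without touching values already at that name.
func (z zone) present(name, value string) {
	z[name] = append(z[name], value)
}

// cleanup removes only the given value and leaves the others in place.
func (z zone) cleanup(name, value string) {
	var kept []string
	for _, v := range z[name] {
		if v != value {
			kept = append(kept, v)
		}
	}
	if len(kept) == 0 {
		delete(z, name)
		return
	}
	z[name] = kept
}

// validate reports whether want is among the TXT values visible in view.
// Other values under the same name are simply ignored.
func validate(view zone, name, want string) bool {
	for _, v := range view[name] {
		if v == want {
			return true
		}
	}
	return false
}

func main() {
	const name = "_acme-challenge.example.com."

	authoritative := zone{}
	authoritative.present(name, "value-for-apex")

	// A resolver that cached the record set before the second value existed.
	stale := zone{name: append([]string(nil), authoritative[name]...)}

	authoritative.present(name, "value-for-wildcard")

	fmt.Println(validate(authoritative, name, "value-for-apex"))
	fmt.Println(validate(authoritative, name, "value-for-wildcard"))
	fmt.Println(validate(stale, name, "value-for-wildcard"))
}
```

In this model both values validate against the up-to-date view, while the stale view only sees the first value and rejects the second challenge. That would match an "Incorrect TXT record ... found" error, where a record is found but it is not the expected one.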

Azq2 commented 7 months ago

I can reproduce this bug in different geographical zones, with different domain zones (.com and .in), and on different servers (Hetzner vs OVH). This is not an environment bug.

It seems that LEGO uses the wrong TXT record when more than one TXT record is present at the same time (because both *.DOMAIN.com and DOMAIN.com add a TXT record to DOMAIN.com).

Maybe it is impossible to distinguish which TXT record belongs to which domain in this case. But that is just my guess.

Also, requesting certificates only for DOMAIN.com or only for *.DOMAIN.com (separate certificates) works fine. But I want a single certificate with DOMAIN.com and *.DOMAIN.com.

2023/12/08 15:36:30 [INFO] cloudflare: new record for DOMAIN.com, ID 93d6510283268....................
...
2023/12/08 15:36:31 [INFO] cloudflare: new record for DOMAIN.com, ID 8c333c92afff2....................

Please, see my logs. Or you can check it yourself.

ldez commented 7 months ago

> It seems that LEGO uses the wrong TXT record when more than one TXT record is present at the same time (because both *.DOMAIN.com and DOMAIN.com add a TXT record to DOMAIN.com).

lego uses and adds the right TXT records; the validation is not done by lego but by Let's Encrypt.

> Maybe it is impossible to distinguish which TXT record belongs to which domain in this case.

That's not how it works. lego uses a "parallel" approach (several TXT records for the same domain) on 90% of the DNS providers without any issues. The other 10% are DNS providers that don't support multiple TXT records for a domain.

It's a propagation issue. There are several possibilities:

Azq2 commented 7 months ago

Okay, I see you are right. Sorry for the misunderstanding.

I don't see the issue with this modification:

diff --git a/providers/dns/cloudflare/cloudflare.go b/providers/dns/cloudflare/cloudflare.go
index 2d91fe4b..11709870 100644
--- a/providers/dns/cloudflare/cloudflare.go
+++ b/providers/dns/cloudflare/cloudflare.go
@@ -151,6 +151,9 @@ func (d *DNSProvider) Present(domain, token, keyAuth string) error {
        d.recordIDsMu.Unlock()

        log.Infof("cloudflare: new record for %s, ID %s", domain, response.ID)
+       log.Infof("SLEEPING 60 SECONDS")
+
+       time.Sleep(60 * time.Second)

        return nil
 }

Does LEGO have an equivalent of --dnssleep?

--dns-timeout does not solve this problem.

ldez commented 7 months ago

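The time.Sleep(60 * time.Second) is not the right solution:

  • waiting 1 minute between 2 challenges is extremely slow, this will create a huge regression when a user has to handle thousands of domains.
  • you just slow down the challenge requests, it's not related to DNS, so it will be flaky.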

Azq2 commented 7 months ago

> The time.Sleep(60 * time.Second) is not the right solution:
>
>   • waiting 1 minute between 2 challenges is extremely slow, this will create a huge regression when a user has to handle thousands of domains.
>   • you just slow down the challenge requests, it's not related to DNS, so it will be flaky.

This is not a suggested fix; it is just how I checked the problem, nothing more.

But I think an option like --dnssleep (in acme.sh: a minimum sleep between adding the DNS records and validation by LE) would be useful for cases like this. I can wait a few seconds if it helps avoid DNS issues on the LE side.

I didn't find a similar option in LEGO.

Azq2 commented 7 months ago

Hm, I found CLOUDFLARE_POLLING_INTERVAL=30, and it works.

Azq2 commented 7 months ago

Thanks for the help.

AleXoundOS commented 6 months ago

I am experiencing exactly the same problem with Cloudflare. @Azq2, how reliably does the CLOUDFLARE_POLLING_INTERVAL=30 solution work for you? And how did you choose the interval value?

Still, I think the issue is not resolved, since none of the developers has offered an official solution or fix.

ldez commented 6 months ago

> Still, I think the issue is not resolved, since none of the developers has offered an official solution or fix.

FYI, I'm the main maintainer of lego.

The solution found by Azq2 is in the same direction as my suggestions and fixes his problem, so it becomes the "official" solution.

AleXoundOS commented 4 months ago

@ldez, it seems that for deSEC, DESEC_POLLING_INTERVAL=30 does not help.

ldez commented 4 months ago

This issue was related to Cloudflare. If you have an issue with deSEC, can you open a dedicated issue?

AleXoundOS commented 3 months ago

I'm not sure if all the options below are relevant, but this combination works reliably:

DESEC_POLLING_INTERVAL=30
DESEC_PROPAGATION_TIMEOUT=180
DESEC_TTL=3600

@ldez, currently, I'm not ready to investigate the issue further, unless it stops working.