caddyserver / caddy

Fast and extensible multi-platform HTTP/1-2-3 web server with automatic HTTPS
https://caddyserver.com
Apache License 2.0
58.12k stars 4.03k forks source link

DNS-01 incomplete URL (in DigitalOcean DNS challenge) (xcaddy) #3664

Closed scoobybejesus closed 4 years ago

scoobybejesus commented 4 years ago

1. Environment

1a. Operating system and version

FreeBSD.  `freebsd-version` yields `12.1-RELEASE-p2`.

In a 12.1-RELEASE jail.  Also `12.1-RELEASE-p2`

1b. Caddy version (run caddy version or paste commit SHA)

v2.1.1 h1:X9k1+ehZPYYrSqBvf/ocUgdLSRIuiNiMo7CvyGUQKeA=

1c. Go version (if building Caddy from source; run go version)

go version go1.14.7 freebsd/amd64

2. Description

2a. What happens (briefly explain what is wrong)

In short, the DNS-01 validation failed (yesterday) because an extra . is on the end of the FQDN and apparently DigitalOcean's API can't intelligently drop it.

Before today, in caddy.log, [mydomain.tld] [mydomain.tld] acme: error presenting token: POST https://api.digitalocean.com/v2/domains/mydomain.tld./records: 404 The resource you were accessing could not be found.

Today, after the fixes, [mydomain.tld] [mydomain.tld] acme: error presenting token: POST https://api.digitalocean.com/v2/domains/tld/records: 404 The resource you were accessing could not be found.

After the fixes, the failure now is that mydomain. was dropped.

In case it's relevant, after commit f11d70f (libdns/digitalocean), but before commit 6517b24 (caddy-dns/digitalocean), I ran a fresh ./xcaddy build --with github.com/caddy-dns/digitalocean, and the URL went ...v2/domains/tld./records (still a trailing . but missing mydomain.).

2b. Why it's a bug (if it's not obvious)

The URL sent to DigitalOcean is invalid according to the DigitalOcean API, so you get a 404 error and the DNS challenge fails.

2c. Log output

# ./caddy start --config /usr/local/etc/caddy/Caddyfile
2020/08/17 23:42:01.588 INFO    using provided configuration    {"config_file": "/usr/local/etc/caddy/Caddyfile", "config_adapter": ""}
2020/08/17 23:42:01.590 INFO    admin   admin endpoint started  {"address": "tcp/localhost:2019", "enforce_origin": false, "origins": ["localhost:2019", "[::1]:2019", "127.0.0.1:2019"]}
2020/08/17 23:42:01 [INFO][cache:0xc0006e4360] Started certificate maintenance routine
2020/08/17 23:42:01.591 INFO    http    server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS {"server_name": "srv0", "https_port": 443}
2020/08/17 23:42:01.591 INFO    http    enabling automatic HTTP->HTTPS redirects        {"server_name": "srv0"}
2020/08/17 23:42:01.592 INFO    tls     cleaned up storage units
2020/08/17 23:42:01.592 INFO    http    enabling automatic TLS certificate management   {"domains": ["mydomain.tld", "www.mydomain.tld"]}
2020/08/17 23:42:01.592 INFO    autosaved config        {"file": "/root/.config/caddy/autosave.json"}
2020/08/17 23:42:01.592 INFO    serving initial configuration
Successfully started Caddy (pid=56931) - Caddy is running in the background
2020/08/17 23:42:01 [INFO][www.mydomain.tld] Obtain certificate; acquiring lock...
2020/08/17 23:42:01 [INFO][mydomain.tld] Obtain certificate; acquiring lock...
2020/08/17 23:42:01 [INFO][www.mydomain.tld] Obtain: Lock acquired; proceeding...
2020/08/17 23:42:01 [INFO][mydomain.tld] Obtain: Lock acquired; proceeding...
2020/08/17 23:42:01 [INFO] acme: Registering account for myemail@example.com
2020/08/17 23:42:02 [INFO][www.mydomain.tld] Waiting on rate limiter...
2020/08/17 23:42:02 [INFO][www.mydomain.tld] Done waiting
2020/08/17 23:42:02 [INFO] [www.mydomain.tld] acme: Obtaining bundled SAN certificate given a CSR
2020/08/17 23:42:02 [INFO][mydomain.tld] Waiting on rate limiter...
2020/08/17 23:42:02 [INFO][mydomain.tld] Done waiting
2020/08/17 23:42:02 [INFO] [mydomain.tld] acme: Obtaining bundled SAN certificate given a CSR
2020/08/17 23:42:02 [INFO] [www.mydomain.tld] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/95889430
2020/08/17 23:42:02 [INFO] [www.mydomain.tld] acme: Could not find solver for: tls-alpn-01
2020/08/17 23:42:02 [INFO] [www.mydomain.tld] acme: Could not find solver for: http-01
2020/08/17 23:42:02 [INFO] [www.mydomain.tld] acme: use dns-01 solver
2020/08/17 23:42:02 [INFO] [www.mydomain.tld] acme: Preparing to solve DNS-01
2020/08/17 23:42:02 [INFO] [mydomain.tld] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/95889432
2020/08/17 23:42:02 [INFO] [mydomain.tld] acme: Could not find solver for: tls-alpn-01
2020/08/17 23:42:02 [INFO] [mydomain.tld] acme: Could not find solver for: http-01
2020/08/17 23:42:02 [INFO] [mydomain.tld] acme: use dns-01 solver
2020/08/17 23:42:02 [INFO] [mydomain.tld] acme: Preparing to solve DNS-01
2020/08/17 23:42:02 [INFO] [www.mydomain.tld] acme: Cleaning DNS-01 challenge
2020/08/17 23:42:02 [WARN] [www.mydomain.tld] acme: cleaning up failed: no memory of presenting a DNS record for www.mydomain.tld
2020/08/17 23:42:02 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/95889430
2020/08/17 23:42:03 [ERROR] error: one or more domains had a problem:
[www.mydomain.tld] [www.mydomain.tld] acme: error presenting token: POST https://api.digitalocean.com/v2/domains/tld/records: 404 The resource you were accessing could not be found.
 (challenge=dns-01 remaining=[])
2020/08/17 23:42:03 [INFO] [mydomain.tld] acme: Cleaning DNS-01 challenge
2020/08/17 23:42:03 [WARN] [mydomain.tld] acme: cleaning up failed: no memory of presenting a DNS record for mydomain.tld
2020/08/17 23:42:03 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/95889432
2020/08/17 23:42:03 [ERROR] error: one or more domains had a problem:
[mydomain.tld] [mydomain.tld] acme: error presenting token: POST https://api.digitalocean.com/v2/domains/tld/records: 404 The resource you were accessing could not be found.
 (challenge=dns-01 remaining=[])
2020/08/17 23:42:05 [ERROR] attempt 1: [www.mydomain.tld] Obtain: [www.mydomain.tld] error: one or more domains had a problem:
[www.mydomain.tld] [www.mydomain.tld] acme: error presenting token: POST https://api.digitalocean.com/v2/domains/tld/records: 404 The resource you were accessing could not be found.
 - retrying in 1m0s (3.432954126s/720h0m0s elapsed)...
2020/08/17 23:42:05 [ERROR] attempt 1: [mydomain.tld] Obtain: [mydomain.tld] error: one or more domains had a problem:
[mydomain.tld] [mydomain.tld] acme: error presenting token: POST https://api.digitalocean.com/v2/domains/tld/records: 404 The resource you were accessing could not be found.
 - retrying in 1m0s (3.72559253s/720h0m0s elapsed)...

I sed'd it out, but it's sb-net.xyz, if that matters, even though I think it doesn't.

2d. Workaround(s)

None found.

2e. Relevant links

See https://github.com/libdns/digitalocean/issues/1 and https://github.com/caddy-dns/digitalocean/issues/1.

3. Tutorial (minimal steps to reproduce the bug)

In short,

  1. pkg install go-1.14.7
  2. mkdir for xcaddy and cd into it
  3. fetch https://github.com/caddyserver/xcaddy/releases/download/v0.1.5/xcaddy_0.1.5_freebsd_amd64.tar.gz
  4. tar -xzf it
  5. ./xcaddy build --with github.com/caddy-dns/digitalocean
  6. Run /.caddy start --config <file> w/ the Caddyfile below. (The behavior is the same if running as a service)
{
    email myemail@example.com
    acme_ca https://acme-staging-v02.api.letsencrypt.org/directory 
}

mydomain.tld, www.mydomain.tld {

    root * /usr/local/www/html/
    file_server browse
    tls  {
        dns digitalocean {env.DO_AUTH_TOKEN}
    }
}

.. though I pasted my actual token string instead of setting an env var.

Note that go-1.15 won't work due to:

panic: qtls.ClientSessionState not compatible with tls.ClientSessionState

goroutine 1 [running]:
github.com/lucas-clemente/quic-go/internal/handshake.init.2()
    github.com/lucas-clemente/quic-go@v0.17.1/internal/handshake/unsafe.go:26 +0x205
mholt commented 4 years ago

What is the actual domain name? We'll need that in order to investigate fully. This is DNS code, after all.

scoobybejesus commented 4 years ago

It's there hiding at the bottom of the log. "I sed'd it out, but it's sb-net.xyz, if that matters, even though I think it doesn't." 👍

mholt commented 4 years ago

Oh, thanks. I am on my phone, will look more closely when I'm back at my computer!

mholt commented 4 years ago

@scoobybejesus I think I got the "no memory of presenting a DNS record" errors fixed here, if you'd like to try that out for starters: https://github.com/caddyserver/caddy/issues/3474#issuecomment-675754379

I do want to clarify, since you seem to be confused here, the trailing dot and the dropping of the domain label are completely separate problems, and the trailing dot problem has been fixed, so let's not conflate those two issues.

Can you please provide the actual log output, rather than replacing parts of it? I need to see the full logs and error messages without modifications.

scoobybejesus commented 4 years ago

@mholt Sorry about the possible conflation. I thought I was being helpful. Before the two noted commits, the domain label was not being dropped. I guess it was merely coincidental.

Back to the logs. I did two things (in sister, identical jails).

In the first jail, here's the log after a plain ./xcaddy build --with github.com/caddy-dns/digitalocean which [hopefully] predictably looks the same (it still has the "no memory of presenting DNS record").

{"level":"info","ts":1597805865.635958,"msg":"using provided configuration","config_file":"/usr/local/etc/caddy/Caddyfile","config_adapter":"caddyfile"}
{"level":"info","ts":1597805865.637337,"logger":"admin","msg":"admin endpoint started","address":"tcp/localhost:2019","enforce_origin":false,"origins":["127.0.0.1:2019","localhost:2019","[::1]:2019"]}
{"level":"info","ts":1597805865.6377313,"logger":"http","msg":"server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS","server_name":"srv0","https_port":443}
{"level":"info","ts":1597805865.637779,"logger":"http","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv0"}
2020/08/19 02:57:45 [INFO][cache:0xc0008765a0] Started certificate maintenance routine
{"level":"info","ts":1597805865.6397943,"logger":"tls","msg":"cleaned up storage units"}
{"level":"info","ts":1597805865.639865,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["sb-net.xyz","www.sb-net.xyz"]}
{"level":"info","ts":1597805865.640006,"msg":"autosaved config","file":"/.config/caddy/autosave.json"}
{"level":"info","ts":1597805865.640018,"msg":"serving initial configuration"}
Successfully started Caddy (pid=91838) - Caddy is running in the background
2020/08/19 02:57:45 [INFO][www.sb-net.xyz] Obtain certificate; acquiring lock...
2020/08/19 02:57:45 [INFO][sb-net.xyz] Obtain certificate; acquiring lock...
2020/08/19 02:57:45 [INFO][www.sb-net.xyz] Obtain: Lock acquired; proceeding...
2020/08/19 02:57:45 [INFO][sb-net.xyz] Obtain: Lock acquired; proceeding...
2020/08/19 02:57:46 [INFO][sb-net.xyz] Waiting on rate limiter...
2020/08/19 02:57:46 [INFO][sb-net.xyz] Done waiting
2020/08/19 02:57:46 [INFO] [sb-net.xyz] acme: Obtaining bundled SAN certificate given a CSR
2020/08/19 02:57:46 [INFO][www.sb-net.xyz] Waiting on rate limiter...
2020/08/19 02:57:46 [INFO][www.sb-net.xyz] Done waiting
2020/08/19 02:57:46 [INFO] [www.sb-net.xyz] acme: Obtaining bundled SAN certificate given a CSR
2020/08/19 02:57:46 [INFO] [sb-net.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/96480919
2020/08/19 02:57:46 [INFO] [sb-net.xyz] acme: Could not find solver for: tls-alpn-01
2020/08/19 02:57:46 [INFO] [sb-net.xyz] acme: Could not find solver for: http-01
2020/08/19 02:57:46 [INFO] [sb-net.xyz] acme: use dns-01 solver
2020/08/19 02:57:46 [INFO] [sb-net.xyz] acme: Preparing to solve DNS-01
2020/08/19 02:57:46 [INFO] [www.sb-net.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/96480920
2020/08/19 02:57:46 [INFO] [www.sb-net.xyz] acme: Could not find solver for: tls-alpn-01
2020/08/19 02:57:46 [INFO] [www.sb-net.xyz] acme: Could not find solver for: http-01
2020/08/19 02:57:46 [INFO] [www.sb-net.xyz] acme: use dns-01 solver
2020/08/19 02:57:46 [INFO] [www.sb-net.xyz] acme: Preparing to solve DNS-01
2020/08/19 02:57:47 [INFO] [sb-net.xyz] acme: Cleaning DNS-01 challenge
2020/08/19 02:57:47 [WARN] [sb-net.xyz] acme: cleaning up failed: no memory of presenting a DNS record for sb-net.xyz
2020/08/19 02:57:47 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/96480919
2020/08/19 02:57:47 [ERROR] error: one or more domains had a problem:
[sb-net.xyz] [sb-net.xyz] acme: error presenting token: POST https://api.digitalocean.com/v2/domains/xyz/records: 404 The resource you were accessing could not be found.
 (challenge=dns-01 remaining=[])
2020/08/19 02:57:47 [INFO] [www.sb-net.xyz] acme: Cleaning DNS-01 challenge
2020/08/19 02:57:47 [WARN] [www.sb-net.xyz] acme: cleaning up failed: no memory of presenting a DNS record for www.sb-net.xyz
2020/08/19 02:57:47 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/96480920
2020/08/19 02:57:48 [ERROR] error: one or more domains had a problem:
[www.sb-net.xyz] [www.sb-net.xyz] acme: error presenting token: POST https://api.digitalocean.com/v2/domains/xyz/records: 404 The resource you were accessing could not be found.
 (challenge=dns-01 remaining=[])
2020/08/19 02:57:49 [ERROR] attempt 1: [sb-net.xyz] Obtain: [sb-net.xyz] error: one or more domains had a problem:
[sb-net.xyz] [sb-net.xyz] acme: error presenting token: POST https://api.digitalocean.com/v2/domains/xyz/records: 404 The resource you were accessing could not be found.
 - retrying in 1m0s (4.041527479s/720h0m0s elapsed)...
2020/08/19 02:57:50 [ERROR] attempt 1: [www.sb-net.xyz] Obtain: [www.sb-net.xyz] error: one or more domains had a problem:
[www.sb-net.xyz] [www.sb-net.xyz] acme: error presenting token: POST https://api.digitalocean.com/v2/domains/xyz/records: 404 The resource you were accessing could not be found.
 - retrying in 1m0s (4.386535654s/720h0m0s elapsed)...
{"level":"info","ts":1597805880.964957,"msg":"shutting down apps then terminating","signal":"SIGTERM"}
2020/08/19 02:58:00 [INFO][cache:0xc0008765a0] Stopped certificate maintenance routine
2020/08/19 02:58:00 [INFO][sb-net.xyz] Obtain: Releasing lock
2020/08/19 02:58:00 [INFO][www.sb-net.xyz] Obtain: Releasing lock
2020/08/19 02:58:00 [ERROR] www.sb-net.xyz: obtaining certificate: context canceled
2020/08/19 02:58:00 [ERROR] sb-net.xyz: obtaining certificate: context canceled
{"level":"info","ts":1597805880.9660518,"logger":"admin","msg":"stopped previous server"}
{"level":"info","ts":1597805880.966107,"msg":"shutdown done","signal":"SIGTERM"}

In the second jail, here's the log after after ./xcaddy build c94f5bb7dd3c98d6573c44f06d99c7252911a9fa --with github.com/caddyserver/certmagic@c2f851df75f0c52144a91e00ac53110034865faf --with github.com/caddy-dns/digitalocean (per 3474's comment). Progress!

{"level":"info","ts":1597804773.5798905,"msg":"using provided configuration","config_file":"/usr/local/etc/caddy/Caddyfile","config_adapter":"caddyfile"}
{"level":"info","ts":1597804773.5879881,"logger":"admin","msg":"admin endpoint started","address":"tcp/localhost:2019","enforce_origin":false,"origins":["localhost:2019","[::1]:2019","127.0.0.1:2019"]}
{"level":"info","ts":1597804773.5885437,"logger":"tls.cache.maintenance","msg":"started background certificate maintenance","cache":"0xc0002e41c0"}
{"level":"info","ts":1597804773.589483,"logger":"http","msg":"server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS","server_name":"srv0","https_port":443}
{"level":"info","ts":1597804773.5895877,"logger":"http","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv0"}
{"level":"info","ts":1597804773.5935764,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["sb-net.xyz","www.sb-net.xyz"]}
{"level":"info","ts":1597804773.593999,"logger":"tls","msg":"cleaned up storage units"}
{"level":"info","ts":1597804773.5944622,"msg":"autosaved config","file":"/.config/caddy/autosave.json"}
{"level":"info","ts":1597804773.5945394,"msg":"serving initial configuration"}
Successfully started Caddy (pid=81807) - Caddy is running in the background
{"level":"info","ts":1597804773.595494,"logger":"tls.obtain","msg":"acquiring lock","identifier":"www.sb-net.xyz"}
{"level":"info","ts":1597804773.5955389,"logger":"tls.obtain","msg":"acquiring lock","identifier":"sb-net.xyz"}
{"level":"info","ts":1597804773.5975811,"logger":"tls.obtain","msg":"lock acquired","identifier":"sb-net.xyz"}
{"level":"info","ts":1597804773.597668,"logger":"tls.obtain","msg":"lock acquired","identifier":"www.sb-net.xyz"}
{"level":"info","ts":1597804773.6732662,"logger":"tls.issuance.acme","msg":"waiting on internal rate limiter","identifiers":["www.sb-net.xyz"]}
{"level":"info","ts":1597804773.67342,"logger":"tls.issuance.acme","msg":"done waiting on internal rate limiter","identifiers":["www.sb-net.xyz"]}
{"level":"info","ts":1597804773.6762342,"logger":"tls.issuance.acme","msg":"waiting on internal rate limiter","identifiers":["sb-net.xyz"]}
{"level":"info","ts":1597804773.6763036,"logger":"tls.issuance.acme","msg":"done waiting on internal rate limiter","identifiers":["sb-net.xyz"]}
{"level":"info","ts":1597804774.3198066,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"www.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"info","ts":1597804774.5442126,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1597804774.6439176,"logger":"tls.issuance.acme.acme_client","msg":"cleaning up solver","identifier":"www.sb-net.xyz","challenge_type":"dns-01","error":"no memory of presenting a DNS record for www.sb-net.xyz (probably OK if presenting failed)"}
{"level":"error","ts":1597804774.7463481,"logger":"tls.obtain","msg":"will retry","error":"[www.sb-net.xyz] Obtain: [www.sb-net.xyz] solving challenges: presenting for challenge: adding temporary record for zone xyz.: POST https://api.digitalocean.com/v2/domains/xyz/records: 404 The resource you were accessing could not be found. (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15206064/133858119) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":1,"retrying_in":60,"elapsed":1.14853073,"max_duration":2592000}
{"level":"error","ts":1597804774.9239054,"logger":"tls.issuance.acme.acme_client","msg":"cleaning up solver","identifier":"sb-net.xyz","challenge_type":"dns-01","error":"no memory of presenting a DNS record for sb-net.xyz (probably OK if presenting failed)"}
{"level":"error","ts":1597804775.0358083,"logger":"tls.obtain","msg":"will retry","error":"[sb-net.xyz] Obtain: [sb-net.xyz] solving challenges: presenting for challenge: adding temporary record for zone xyz.: POST https://api.digitalocean.com/v2/domains/xyz/records: 404 The resource you were accessing could not be found. (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15206064/133858122) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":1,"retrying_in":60,"elapsed":1.43814337,"max_duration":2592000}
{"level":"info","ts":1597804835.3345578,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"www.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"info","ts":1597804835.6195183,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1597804835.7044969,"logger":"tls.issuance.acme.acme_client","msg":"cleaning up solver","identifier":"www.sb-net.xyz","challenge_type":"dns-01","error":"no memory of presenting a DNS record for www.sb-net.xyz (probably OK if presenting failed)"}
{"level":"error","ts":1597804835.8210442,"logger":"tls.obtain","msg":"will retry","error":"[www.sb-net.xyz] Obtain: [www.sb-net.xyz] solving challenges: presenting for challenge: adding temporary record for zone xyz.: POST https://api.digitalocean.com/v2/domains/xyz/records: 404 The resource you were accessing could not be found. (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15206064/133858439) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":2,"retrying_in":120,"elapsed":62.223225214,"max_duration":2592000}
{"level":"error","ts":1597804835.9815218,"logger":"tls.issuance.acme.acme_client","msg":"cleaning up solver","identifier":"sb-net.xyz","challenge_type":"dns-01","error":"no memory of presenting a DNS record for sb-net.xyz (probably OK if presenting failed)"}
{"level":"error","ts":1597804836.0951936,"logger":"tls.obtain","msg":"will retry","error":"[sb-net.xyz] Obtain: [sb-net.xyz] solving challenges: presenting for challenge: adding temporary record for zone xyz.: POST https://api.digitalocean.com/v2/domains/xyz/records: 404 The resource you were accessing could not be found. (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15206064/133858440) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":2,"retrying_in":120,"elapsed":62.497527013,"max_duration":2592000}
{"level":"info","ts":1597804845.961294,"msg":"shutting down apps then terminating","signal":"SIGTERM"}
{"level":"info","ts":1597804845.9618795,"logger":"tls.obtain","msg":"releasing lock","identifier":"sb-net.xyz"}
{"level":"info","ts":1597804845.961898,"logger":"tls.cache.maintenance","msg":"stopped background certificate maintenance","cache":"0xc0002e41c0"}
{"level":"info","ts":1597804845.9619083,"logger":"tls.obtain","msg":"releasing lock","identifier":"www.sb-net.xyz"}
{"level":"error","ts":1597804845.9633055,"logger":"tls","msg":"job failed","error":"sb-net.xyz: obtaining certificate: context canceled"}
{"level":"error","ts":1597804845.9635413,"logger":"tls","msg":"job failed","error":"www.sb-net.xyz: obtaining certificate: context canceled"}
{"level":"info","ts":1597804845.9636319,"logger":"admin","msg":"stopped previous server"}
{"level":"info","ts":1597804845.963849,"msg":"shutdown done","signal":"SIGTERM"}
mholt commented 4 years ago

Thanks for the added info and for your patience with this!

It's getting late here and I will look at this more this week. Do you have split horizon DNS configured, perchance?

scoobybejesus commented 4 years ago

Take your time! I'm very happy to have your attention, and I only hope I'm able to give you the information you need.

To answer your question, (assuming we're talking about the same thing) yes! Good catch! I use pfSense as my firewall and router, and I set a DNS Resolver host override to point sb-net.xyz (and www.sb-net.xyz) at the internal IP of a little FreeBSD box inside my LAN. I want to host local services on it over https. I would prefer not to expose the FreeBSD box to the internet via port forwarding for LE certs. It seemed like a perfect use case for the DNS-01 challenge!

For fun, I just deleted the split DNS host override. And then I re-ran caddy. Closer! (No more 404 from a missing domain, but now an io timeout.)

{"level":"info","ts":1597810018.8182454,"msg":"using provided configuration","config_file":"/usr/local/etc/caddy/Caddyfile","config_adapter":"caddyfile"}
{"level":"info","ts":1597810018.8229609,"logger":"admin","msg":"admin endpoint started","address":"tcp/localhost:2019","enforce_origin":false,"origins":["localhost:2019","[::1]:2019","127.0.0.1:2019"]}
{"level":"info","ts":1597810018.8234234,"logger":"tls.cache.maintenance","msg":"started background certificate maintenance","cache":"0xc0006ff6c0"}
{"level":"info","ts":1597810018.824056,"logger":"http","msg":"server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS","server_name":"srv0","https_port":443}
{"level":"info","ts":1597810018.8242283,"logger":"http","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv0"}
{"level":"info","ts":1597810018.828135,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["sb-net.xyz","www.sb-net.xyz"]}
{"level":"info","ts":1597810018.8288124,"logger":"tls","msg":"cleaned up storage units"}
{"level":"info","ts":1597810018.828958,"msg":"autosaved config","file":"/.config/caddy/autosave.json"}
{"level":"info","ts":1597810018.8291764,"msg":"serving initial configuration"}
Successfully started Caddy (pid=34808) - Caddy is running in the background
{"level":"info","ts":1597810018.8305645,"logger":"tls.obtain","msg":"acquiring lock","identifier":"www.sb-net.xyz"}
{"level":"info","ts":1597810018.830473,"logger":"tls.obtain","msg":"acquiring lock","identifier":"sb-net.xyz"}
{"level":"info","ts":1597810018.832194,"logger":"tls.obtain","msg":"lock acquired","identifier":"www.sb-net.xyz"}
{"level":"info","ts":1597810018.8322148,"logger":"tls.obtain","msg":"lock acquired","identifier":"sb-net.xyz"}
{"level":"info","ts":1597810018.897929,"logger":"tls.issuance.acme","msg":"waiting on internal rate limiter","identifiers":["sb-net.xyz"]}
{"level":"info","ts":1597810018.8980858,"logger":"tls.issuance.acme","msg":"done waiting on internal rate limiter","identifiers":["sb-net.xyz"]}
{"level":"info","ts":1597810018.8992944,"logger":"tls.issuance.acme","msg":"waiting on internal rate limiter","identifiers":["www.sb-net.xyz"]}
{"level":"info","ts":1597810018.8994381,"logger":"tls.issuance.acme","msg":"done waiting on internal rate limiter","identifiers":["www.sb-net.xyz"]}
{"level":"info","ts":1597810019.696299,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"info","ts":1597810019.9328573,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"www.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1597810043.586254,"logger":"tls.obtain","msg":"will retry","error":"[sb-net.xyz] Obtain: [sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: checking DNS propagation of _acme-challenge.sb-net.xyz: dial tcp 173.245.58.51:53: i/o timeout (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15206064/133888811) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":1,"retrying_in":60,"elapsed":24.753719902,"max_duration":2592000}
{"level":"error","ts":1597810043.9102976,"logger":"tls.obtain","msg":"will retry","error":"[www.sb-net.xyz] Obtain: [www.sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: checking DNS propagation of _acme-challenge.www.sb-net.xyz: dial tcp 173.245.58.51:53: i/o timeout (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15206064/133888812) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":1,"retrying_in":60,"elapsed":25.077942833,"max_duration":2592000}
{"level":"info","ts":1597810104.134622,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"info","ts":1597810104.4734704,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"www.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1597810126.9134212,"logger":"tls.obtain","msg":"will retry","error":"[sb-net.xyz] Obtain: [sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: checking DNS propagation of _acme-challenge.sb-net.xyz: dial tcp 173.245.58.51:53: i/o timeout (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15206064/133889336) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":2,"retrying_in":120,"elapsed":108.08088734,"max_duration":2592000}
{"level":"error","ts":1597810127.2575293,"logger":"tls.obtain","msg":"will retry","error":"[www.sb-net.xyz] Obtain: [www.sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: checking DNS propagation of _acme-challenge.www.sb-net.xyz: dial tcp 173.245.58.51:53: i/o timeout (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15206064/133889340) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":2,"retrying_in":120,"elapsed":108.425175204,"max_duration":2592000}
{"level":"info","ts":1597810140.6976151,"msg":"shutting down apps then terminating","signal":"SIGTERM"}
{"level":"info","ts":1597810140.6981497,"logger":"tls.cache.maintenance","msg":"stopped background certificate maintenance","cache":"0xc0006ff6c0"}
{"level":"info","ts":1597810140.6981559,"logger":"tls.obtain","msg":"releasing lock","identifier":"sb-net.xyz"}
{"level":"info","ts":1597810140.6981661,"logger":"tls.obtain","msg":"releasing lock","identifier":"www.sb-net.xyz"}
2020/08/19 04:09:00 [ERROR] Unable to clean up lock: remove /.local/share/caddy/locks/cert_acme_sb-net.xyz_acme-staging-v02.api.letsencrypt.org-directory.lock: no such file or directory (lock=cert_acme_sb-net.xyz_acme-staging-v02.api.letsencrypt.org-directory storage=FileStorage:/.local/share/caddy)
{"level":"error","ts":1597810140.7002604,"logger":"tls","msg":"job failed","error":"www.sb-net.xyz: obtaining certificate: context canceled"}
{"level":"error","ts":1597810140.7003047,"logger":"tls","msg":"job failed","error":"sb-net.xyz: obtaining certificate: context canceled"}
{"level":"info","ts":1597810140.7005022,"logger":"admin","msg":"stopped previous server"}
{"level":"info","ts":1597810140.7005885,"msg":"shutdown done","signal":"SIGTERM"}

I'm happy to provide more info. Thanks!

mholt commented 4 years ago

Your network setup seems a little crazy. What would be blocking dial tcp 173.245.58.51:53: i/o timeout? (Note that actually dial udp is tried first, as is the norm for DNS, but it falls back to tcp thanks to a recent PR by @Gansgar).

mholt commented 4 years ago

PS. I think adding the ability to configure a custom DNS resolver for the ACME challenges would probably solve the problem. What do you think? Also see https://github.com/go-acme/lego/issues/461

scoobybejesus commented 4 years ago

Okay, more progress.

(TL;DR - (a) DNS validation still doesn't fully complete for me, but (b) this issue #3664 may be ready to be closed.)

First, I goofed. I missed the :53 at the end of that IP. I had a rule to only allow port 53 to the pfSense interface (I should reserve that rule for my IOT networks only, I think). I should have caught that before replying, and we'd be a step closer by now. That's on me. Ugh. Sorry...

Second, this next failure has to do with the caddy jail being NAT'd (I think/assume). The jail doesn't have its own LAN IP. Rather, it is on a network inside the FreeBSD host (cloned loopback device, not that this detail should matter). The FreeBSD host is at 192.168.50.99. The jail's IP is 10.0.0.100, and it's only reachable through the host via port forwarding. I have host pf rules to (a) NAT out traffic from the jail network, and (b) port forward incoming 80 and 443 traffic to the jail.

Here's the log reflecting what happened after I disabled the port 53 pfSense rules:

{"level":"info","ts":1597859952.9849763,"msg":"using provided configuration","config_file":"/usr/local/etc/caddy/Caddyfile","config_adapter":"caddyfile"}
{"level":"info","ts":1597859952.9878924,"logger":"admin","msg":"admin endpoint started","address":"tcp/localhost:2019","enforce_origin":false,"origins":["localhost:2019","[::1]:2019","127.0.0.1:2019"]}
2020/08/19 17:59:12 [INFO][cache:0xc00063c120] Started certificate maintenance routine
{"level":"info","ts":1597859952.9884107,"logger":"http","msg":"server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS","server_name":"srv0","https_port":443}
{"level":"info","ts":1597859952.9884527,"logger":"http","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv0"}
{"level":"info","ts":1597859952.9907737,"logger":"tls","msg":"cleaned up storage units"}
{"level":"info","ts":1597859952.990867,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["sb-net.xyz","www.sb-net.xyz"]}
{"level":"info","ts":1597859952.991112,"msg":"autosaved config","file":"/.config/caddy/autosave.json"}
{"level":"info","ts":1597859952.9911325,"msg":"serving initial configuration"}
Successfully started Caddy (pid=52153) - Caddy is running in the background
2020/08/19 17:59:12 [INFO][www.sb-net.xyz] Obtain certificate; acquiring lock...
2020/08/19 17:59:12 [INFO][sb-net.xyz] Obtain certificate; acquiring lock...
2020/08/19 17:59:12 [INFO][www.sb-net.xyz] Obtain: Lock acquired; proceeding...
2020/08/19 17:59:12 [INFO][sb-net.xyz] Obtain: Lock acquired; proceeding...
2020/08/19 17:59:33 [ERROR] Making new ACME client: get directory at 'https://acme-staging-v02.api.letsencrypt.org/directory': Get "https://acme-staging-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-staging-v02.api.letsencrypt.org on 192.168.50.1:53: read udp 10.0.0.100:51693->192.168.50.1:53: i/o timeout (attempt 1/2)
2020/08/19 17:59:53 [ERROR] Making new ACME client: get directory at 'https://acme-staging-v02.api.letsencrypt.org/directory': Get "https://acme-staging-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-staging-v02.api.letsencrypt.org on 192.168.50.1:53: read udp 10.0.0.100:32484->192.168.50.1:53: i/o timeout (attempt 1/2)
{"level":"info","ts":1597860000.5756843,"msg":"shutting down apps then terminating","signal":"SIGTERM"}
2020/08/19 18:00:00 [INFO][cache:0xc00063c120] Stopped certificate maintenance routine
{"level":"info","ts":1597860000.576581,"logger":"admin","msg":"stopped previous server"}
{"level":"info","ts":1597860000.5766163,"msg":"shutdown done","signal":"SIGTERM"}

Third, seeing as how the FreeBSD box (192.168.50.99) isn't listening on port 53, I figured I would try using pf to port forward 53 back to the caddy jail. [I suspect DNS validation isn't yet designed to handle this double-NAT/jail network setup, and I don't understand the nature of the jail's request to port 53 on the host (which, by the way, my jails can ping the host just fine), but] in the interest of researching it, here's that log (spoiler, it's identical, AFAICT):

{"level":"info","ts":1597862603.0907114,"msg":"using provided configuration","config_file":"/usr/local/etc/caddy/Caddyfile","config_adapter":"caddyfile"}
{"level":"info","ts":1597862603.094848,"logger":"admin","msg":"admin endpoint started","address":"tcp/localhost:2019","enforce_origin":false,"origins":["localhost:2019","[::1]:2019","127.0.0.1:2019"]}
2020/08/19 18:43:23 [INFO][cache:0xc000099620] Started certificate maintenance routine
{"level":"info","ts":1597862603.0959191,"logger":"http","msg":"server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS","server_name":"srv0","https_port":443}
{"level":"info","ts":1597862603.0960157,"logger":"http","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv0"}
{"level":"info","ts":1597862603.103234,"logger":"tls","msg":"cleaned up storage units"}
{"level":"info","ts":1597862603.1034467,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["sb-net.xyz","www.sb-net.xyz"]}
{"level":"info","ts":1597862603.104059,"msg":"autosaved config","file":"/.config/caddy/autosave.json"}
{"level":"info","ts":1597862603.1041048,"msg":"serving initial configuration"}
Successfully started Caddy (pid=73154) - Caddy is running in the background
2020/08/19 18:43:23 [INFO][www.sb-net.xyz] Obtain certificate; acquiring lock...
2020/08/19 18:43:23 [INFO][sb-net.xyz] Obtain certificate; acquiring lock...
2020/08/19 18:43:23 [INFO][www.sb-net.xyz] Obtain: Lock acquired; proceeding...
2020/08/19 18:43:23 [INFO][sb-net.xyz] Obtain: Lock acquired; proceeding...
2020/08/19 18:43:43 [ERROR] Making new ACME client: get directory at 'https://acme-staging-v02.api.letsencrypt.org/directory': Get "https://acme-staging-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-staging-v02.api.letsencrypt.org on 192.168.50.1:53: read udp 10.0.0.100:41787->192.168.50.1:53: i/o timeout (attempt 1/2)
2020/08/19 18:44:03 [ERROR] Making new ACME client: get directory at 'https://acme-staging-v02.api.letsencrypt.org/directory': Get "https://acme-staging-v02.api.letsencrypt.org/directory": dial tcp: lookup acme-staging-v02.api.letsencrypt.org on 192.168.50.1:53: read udp 10.0.0.100:12536->192.168.50.1:53: i/o timeout (attempt 1/2)
{"level":"info","ts":1597862652.6844192,"msg":"shutting down apps then terminating","signal":"SIGTERM"}
2020/08/19 18:44:12 [INFO][cache:0xc000099620] Stopped certificate maintenance routine
{"level":"info","ts":1597862652.6867392,"logger":"admin","msg":"stopped previous server"}
{"level":"info","ts":1597862652.6868255,"msg":"shutdown done","signal":"SIGTERM"}

Fourth, I tried to do extra homework this time. I got to thinking, "is the jail reaching out to its gateway thinking (from the code's point of view) that it would be outside my LAN at that point?" The jail's gateway is the FreeBSD host (obviously still inside the LAN). Well, I happen to have a little test box running FreeBSD, and I decided I was okay with muddying up the host system in the interest of providing you more information.

Side note: at this point, I triple checked my pfSense rules. I had a rule preventing communication with other subnets. Well, that's gone now. My only rule is the most permissive "allow any to any." One fewer variable to deal with.

So, on my little test box, still on FreeBSD 12.1-p2, on the host I downloaded the same xcaddy, built caddy according to your #3474 comment, and ran caddy from the host. It almost works! In fact, it successfully writes the TXT records to DigitalOcean (I logged into DO, and they now exist). Here's that log:

{"level":"info","ts":1597892547.29507,"msg":"using provided configuration","config_file":"/usr/local/etc/caddy/Caddyfile","config_adapter":"caddyfile"}
{"level":"info","ts":1597892547.3001564,"logger":"admin","msg":"admin endpoint started","address":"tcp/localhost:2019","enforce_origin":false,"origins":["[::1]:2019","127.0.0.1:2019","localhost:2019"]}
{"level":"info","ts":1597892547.3028007,"logger":"http","msg":"server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS","server_name":"srv0","https_port":443}
{"level":"info","ts":1597892547.3030362,"logger":"http","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv0"}
{"level":"info","ts":1597892547.304845,"logger":"tls.cache.maintenance","msg":"started background certificate maintenance","cache":"0xc000382000"}
{"level":"info","ts":1597892547.3089988,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["sb-net.xyz","www.sb-net.xyz"]}
{"level":"info","ts":1597892547.309516,"logger":"tls","msg":"cleaned up storage units"}
{"level":"info","ts":1597892547.3099682,"msg":"autosaved config","file":"/.config/caddy/autosave.json"}
{"level":"info","ts":1597892547.310129,"msg":"serving initial configuration"}
{"level":"info","ts":1597892547.310941,"logger":"tls.obtain","msg":"acquiring lock","identifier":"www.sb-net.xyz"}
Successfully started Caddy (pid=28981) - Caddy is running in the background
{"level":"info","ts":1597892547.311012,"logger":"tls.obtain","msg":"acquiring lock","identifier":"sb-net.xyz"}
{"level":"info","ts":1597892547.312147,"logger":"tls.obtain","msg":"lock acquired","identifier":"www.sb-net.xyz"}
{"level":"info","ts":1597892547.3122058,"logger":"tls.obtain","msg":"lock acquired","identifier":"sb-net.xyz"}
{"level":"info","ts":1597892547.3573296,"logger":"tls.issuance.acme","msg":"waiting on internal rate limiter","identifiers":["www.sb-net.xyz"]}
{"level":"info","ts":1597892547.357354,"logger":"tls.issuance.acme","msg":"waiting on internal rate limiter","identifiers":["sb-net.xyz"]}
{"level":"info","ts":1597892547.3576071,"logger":"tls.issuance.acme","msg":"done waiting on internal rate limiter","identifiers":["sb-net.xyz"]}
{"level":"info","ts":1597892547.3574789,"logger":"tls.issuance.acme","msg":"done waiting on internal rate limiter","identifiers":["www.sb-net.xyz"]}
{"level":"info","ts":1597892548.1039498,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"info","ts":1597892548.3333673,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"www.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1597892671.1043699,"logger":"tls.obtain","msg":"will retry","error":"[www.sb-net.xyz] Obtain: [www.sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15234174/134267215) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":1,"retrying_in":60,"elapsed":123.792111632,"max_duration":2592000}
{"level":"error","ts":1597892671.4837039,"logger":"tls.obtain","msg":"will retry","error":"[sb-net.xyz] Obtain: [sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15234174/134267213) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":1,"retrying_in":60,"elapsed":124.171262732,"max_duration":2592000}
{"level":"info","ts":1597892732.9170802,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"info","ts":1597892733.5099452,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"www.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1597892855.3341348,"logger":"tls.obtain","msg":"will retry","error":"[sb-net.xyz] Obtain: [sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15234174/134268344) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":2,"retrying_in":120,"elapsed":308.021694856,"max_duration":2592000}
{"level":"error","ts":1597892855.6598656,"logger":"tls.obtain","msg":"will retry","error":"[www.sb-net.xyz] Obtain: [www.sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15234174/134268351) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":2,"retrying_in":120,"elapsed":308.347608007,"max_duration":2592000}
{"level":"info","ts":1597892867.0336976,"msg":"shutting down apps then terminating","signal":"SIGTERM"}
{"level":"info","ts":1597892867.0343637,"logger":"tls.cache.maintenance","msg":"stopped background certificate maintenance","cache":"0xc000382000"}
{"level":"info","ts":1597892867.0344317,"logger":"tls.obtain","msg":"releasing lock","identifier":"sb-net.xyz"}
{"level":"info","ts":1597892867.0344238,"logger":"tls.obtain","msg":"releasing lock","identifier":"www.sb-net.xyz"}
{"level":"error","ts":1597892867.0362594,"logger":"tls","msg":"job failed","error":"www.sb-net.xyz: obtaining certificate: context canceled"}
{"level":"info","ts":1597892867.0367157,"logger":"admin","msg":"stopped previous server"}
{"level":"info","ts":1597892867.036855,"msg":"shutdown done","signal":"SIGTERM"}

I tried this another time and let it go longer ("retrying_in {60, 120, 120, 300, 600, 1200}") and it never proceeded past that point. So then I figured, well, maybe it'll never satisfy/succeed if I'm using the 'staging' server. So I commented out the staging server in the Caddyfile and tried again on 'production'. Log:

{"level":"info","ts":1597894772.2518406,"msg":"using provided configuration","config_file":"/usr/local/etc/caddy/Caddyfile","config_adapter":"caddyfile"}
{"level":"info","ts":1597894772.2568228,"logger":"admin","msg":"admin endpoint started","address":"tcp/localhost:2019","enforce_origin":false,"origins":["localhost:2019","[::1]:2019","127.0.0.1:2019"]}
{"level":"info","ts":1597894772.2574024,"logger":"tls.cache.maintenance","msg":"started background certificate maintenance","cache":"0xc0004d9960"}
{"level":"info","ts":1597894772.2579455,"logger":"http","msg":"server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS","server_name":"srv0","https_port":443}
{"level":"info","ts":1597894772.258102,"logger":"http","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv0"}
{"level":"info","ts":1597894772.2638984,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["sb-net.xyz","www.sb-net.xyz"]}
{"level":"info","ts":1597894772.2642107,"logger":"tls","msg":"cleaned up storage units"}
{"level":"info","ts":1597894772.2653003,"msg":"autosaved config","file":"/.config/caddy/autosave.json"}
{"level":"info","ts":1597894772.2655163,"msg":"serving initial configuration"}
{"level":"info","ts":1597894772.267524,"logger":"tls.obtain","msg":"acquiring lock","identifier":"sb-net.xyz"}
Successfully started Caddy (pid=29143) - Caddy is running in the background
{"level":"info","ts":1597894772.2690847,"logger":"tls.obtain","msg":"lock acquired","identifier":"sb-net.xyz"}
{"level":"info","ts":1597894772.26974,"logger":"tls.obtain","msg":"acquiring lock","identifier":"www.sb-net.xyz"}
{"level":"info","ts":1597894772.270708,"logger":"tls.obtain","msg":"lock acquired","identifier":"www.sb-net.xyz"}
{"level":"info","ts":1597894773.3435748,"logger":"tls.issuance.acme","msg":"waiting on internal rate limiter","identifiers":["sb-net.xyz"]}
{"level":"info","ts":1597894773.3436756,"logger":"tls.issuance.acme","msg":"done waiting on internal rate limiter","identifiers":["sb-net.xyz"]}
{"level":"info","ts":1597894773.7485409,"logger":"tls.issuance.acme","msg":"waiting on internal rate limiter","identifiers":["www.sb-net.xyz"]}
{"level":"info","ts":1597894773.7486722,"logger":"tls.issuance.acme","msg":"done waiting on internal rate limiter","identifiers":["www.sb-net.xyz"]}
{"level":"info","ts":1597894774.3542128,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-v02.api.letsencrypt.org/directory"}
{"level":"info","ts":1597894774.8023036,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"www.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1597894899.2716472,"logger":"tls.obtain","msg":"will retry","error":"[www.sb-net.xyz] Obtain: [www.sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-v02.api.letsencrypt.org/acme/order/94385876/4768734626) (ca=https://acme-v02.api.letsencrypt.org/directory)","attempt":1,"retrying_in":60,"elapsed":127.000836093,"max_duration":2592000}
{"level":"error","ts":1597894899.5184891,"logger":"tls.obtain","msg":"will retry","error":"[sb-net.xyz] Obtain: [sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-v02.api.letsencrypt.org/acme/order/94385875/4768734525) (ca=https://acme-v02.api.letsencrypt.org/directory)","attempt":1,"retrying_in":60,"elapsed":127.249095625,"max_duration":2592000}
{"level":"info","ts":1597894959.9020786,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"www.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"info","ts":1597894960.1495438,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1597895081.3062508,"logger":"tls.obtain","msg":"will retry","error":"[sb-net.xyz] Obtain: [sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15234174/134282939) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":2,"retrying_in":120,"elapsed":309.036858182,"max_duration":2592000}
{"level":"error","ts":1597895081.6052024,"logger":"tls.obtain","msg":"will retry","error":"[www.sb-net.xyz] Obtain: [www.sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15234174/134282934) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":2,"retrying_in":120,"elapsed":309.334391691,"max_duration":2592000}
{"level":"info","ts":1597895201.8336673,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"info","ts":1597895202.0996675,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"www.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1597895224.2974553,"logger":"tls.obtain","msg":"will retry","error":"[www.sb-net.xyz] Obtain: [www.sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: checking DNS propagation of _acme-challenge.www.sb-net.xyz: NS ns1.digitalocean.com. returned SERVFAIL for _acme-challenge.www.sb-net.xyz. (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15234174/134283962) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":3,"retrying_in":120,"elapsed":452.02664432,"max_duration":2592000}
{"level":"error","ts":1597895324.7707915,"logger":"tls.obtain","msg":"will retry","error":"[sb-net.xyz] Obtain: [sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15234174/134283961) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":3,"retrying_in":120,"elapsed":552.501398383,"max_duration":2592000}
{"level":"info","ts":1597895344.887175,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"www.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"info","ts":1597895416.9334388,"msg":"shutting down apps then terminating","signal":"SIGTERM"}
{"level":"info","ts":1597895416.9347827,"logger":"tls.cache.maintenance","msg":"stopped background certificate maintenance","cache":"0xc0004d9960"}
{"level":"info","ts":1597895416.9347944,"logger":"tls.obtain","msg":"releasing lock","identifier":"sb-net.xyz"}
{"level":"error","ts":1597895416.9356732,"logger":"tls.issuance.acme.acme_client","msg":"cleaning up solver","identifier":"www.sb-net.xyz","challenge_type":"dns-01","error":"deleting temporary record for zone sb-net.xyz.: Delete \"https://api.digitalocean.com/v2/domains/sb-net.xyz/records/110786482\": context canceled"}
{"level":"warn","ts":1597895416.9378443,"logger":"tls.issuance.acme.acme_client","msg":"HTTP request failed; retrying","url":"https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/96904632","error":"performing request: Post \"https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/96904632\": context canceled"}
{"level":"error","ts":1597895416.9381714,"logger":"tls.issuance.acme.acme_client","msg":"deactivating authorization","identifier":"www.sb-net.xyz","authz":"https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/96904632","error":"request to https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/96904632 failed after 1 attempts: context canceled"}
{"level":"info","ts":1597895416.9386084,"logger":"tls.obtain","msg":"releasing lock","identifier":"www.sb-net.xyz"}
{"level":"error","ts":1597895416.939551,"logger":"tls","msg":"job failed","error":"sb-net.xyz: obtaining certificate: context canceled"}
{"level":"info","ts":1597895416.9400692,"logger":"admin","msg":"stopped previous server"}
{"level":"info","ts":1597895416.9403057,"msg":"shutdown done","signal":"SIGTERM"}

I looked right away and it replaced the TXT records, of course. Then I waited around until I saw the NS ns1.digitalocean.com. returned SERVFAIL for _acme-challenge.www.sb-net.xyz.. I double checked on the TXT records, and one was now missing and the other contained a different string. I ran grep SERVFAIL /var/log/caddy.log in case this happened elsewhere, but no. Perhaps the SERVFAIL is because I tried on a 'production' LE server. One way or another, close, but no cigar.



I also run FreeBSD in a VPS, and I run caddy in a jail, and it reverse proxies to several other jails. It's beautiful. Ports 80 and 443 are exposed, so it doesn't have to do the DNS challenge. Everything is gravy. I freaking love it. I wish I could get this working at home w/ the DNS challenge, even if I had to abandon the jail idea and just run it on the host - anything to get the same workflow. I could even live with no DNS split horizon; I'll just deal /etc/hosts. However, if the custom DNS resolver for ACME challenges would solve that, all the better! I'm happy to test anything, whether here or in another issue.

I feel bad taking up your time. I know this isn't a support forum. I hope my logs illuminate potential issues, though.

Anyway, at this point, this issue is technically solved ("incomplete URL" is solved by building caddy accordingly and/or waiting for it to be upstreamed to the appropriate spots). Feel free to close this either now or once any upstream dependencies are updated.

mholt commented 4 years ago

Thanks for the detailed update!

I'll read it in more detail as soon as I have a chance. But just wanted to say that now you can specify your own custom DNS resolvers when conducting the DNS challenge: 744d04c2585d50f64cf3d43d139c310a18e78f73

That might be useful in your situation... then CertMagic will know the right zone to operate in when it manipulates DNS records.

scoobybejesus commented 4 years ago

I decided I'd find a Raspberry Pi to try this on Linux. I built caddy the same as before (command below) and try again. I noticed something interesting....

The TXT records it creates in DigitalOcean are _acme-challenge.www.sb-net.xyz.sb-net.xyz and _acme-challenge.sb-net.xyz.sb-net.xyz.

They should just be _acme-challenge.www.sb-net.xyz and _acme-challenge.sb-net.xyz, right?

I thought the comma separated list may be an issue, so I hopped into the Caddyfile and added another subdomain.

bs.sb-net.xyz {
tls {
    dns digitalocean <my_token>
    }
}

Then in DO I see this: _acme-challenge.bs.sb-net.xyz.sb-net.xyz. It should be _acme-challenge.bs.sb-net.xyz, right?

Looking through certmagic and acme, I'm not finding what may doing this. I'm trying to get my bearings with who does what. Not a whole lot seems to happen at the libdns level, but I'm not a programmer by trade, so I dunno if I'll be able to solve this one... but I'll keep fiddling!


xcaddy build c94f5bb7dd3c98d6573c44f06d99c7252911a9fa --with github.com/caddyserver/certmagic@c2f851df75f0c52144a91e00ac53110034865faf --with github.com/caddy-dns/digitalocean

Raspbian Buster go version go1.14.7 linux/arm caddy version v2.2.0-rc.1.0.20200817221716-c94f5bb7dd3c h1:HTM6oIPHBM4gb2Kj4jBN6MxIYRh5CJPwNlaxqAFIYCk= xcaddy_0.1.5_linux_armv7


Aug 22 02:25:45 pi3b systemd[1]: Started Caddy.
Aug 22 02:25:45 pi3b caddy[9228]: caddy.HomeDir=/var/lib/caddy
Aug 22 02:25:45 pi3b caddy[9228]: caddy.AppDataDir=/var/lib/caddy/.local/share/caddy
Aug 22 02:25:45 pi3b caddy[9228]: caddy.AppConfigDir=/var/lib/caddy/.config/caddy
Aug 22 02:25:45 pi3b caddy[9228]: caddy.ConfigAutosavePath=/var/lib/caddy/.config/caddy/autosave.json
Aug 22 02:25:45 pi3b caddy[9228]: caddy.Version=v2.2.0-rc.1.0.20200817221716-c94f5bb7dd3c
Aug 22 02:25:45 pi3b caddy[9228]: runtime.GOOS=linux
Aug 22 02:25:45 pi3b caddy[9228]: runtime.GOARCH=arm
Aug 22 02:25:45 pi3b caddy[9228]: runtime.Compiler=gc
Aug 22 02:25:45 pi3b caddy[9228]: runtime.NumCPU=4
Aug 22 02:25:45 pi3b caddy[9228]: runtime.GOMAXPROCS=4
Aug 22 02:25:45 pi3b caddy[9228]: runtime.Version=go1.14.7
Aug 22 02:25:45 pi3b caddy[9228]: os.Getwd=/
Aug 22 02:25:45 pi3b caddy[9228]: LANG=en_US.UTF-8
Aug 22 02:25:45 pi3b caddy[9228]: PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Aug 22 02:25:45 pi3b caddy[9228]: HOME=/var/lib/caddy
Aug 22 02:25:45 pi3b caddy[9228]: LOGNAME=caddy
Aug 22 02:25:45 pi3b caddy[9228]: USER=caddy
Aug 22 02:25:45 pi3b caddy[9228]: INVOCATION_ID=17bd8503e99c441284a7955eeb35e99a
Aug 22 02:25:45 pi3b caddy[9228]: JOURNAL_STREAM=8:84230
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.599316,"msg":"using provided configuration","config_file":"/usr/local/etc/caddy/Caddyfile","config_adapter":""}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6138568,"logger":"admin","msg":"admin endpoint started","address":"tcp/localhost:2019","enforce_origin":false,"origins":["localhost:2019","[::1]:2019","127.0.0.1:2019"]}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6161613,"logger":"http","msg":"server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS","server_name":"srv0","https_port":443}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6162777,"logger":"http","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv0"}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6174314,"logger":"tls.cache.maintenance","msg":"started background certificate maintenance","cache":"0x410d360"}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6245954,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["sb-net.xyz","www.sb-net.xyz","bs.sb-net.xyz"]}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6248546,"logger":"tls","msg":"cleaned up storage units"}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.62597,"msg":"autosaved config","file":"/var/lib/caddy/.config/caddy/autosave.json"}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6260643,"msg":"serving initial configuration"}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6268609,"logger":"tls.obtain","msg":"acquiring lock","identifier":"sb-net.xyz"}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6279984,"logger":"tls.obtain","msg":"lock acquired","identifier":"sb-net.xyz"}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6283364,"logger":"tls.obtain","msg":"acquiring lock","identifier":"bs.sb-net.xyz"}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6288576,"logger":"tls.obtain","msg":"lock acquired","identifier":"bs.sb-net.xyz"}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6268609,"logger":"tls.obtain","msg":"acquiring lock","identifier":"www.sb-net.xyz"}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6304908,"logger":"tls.obtain","msg":"lock acquired","identifier":"www.sb-net.xyz"}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6395602,"logger":"tls.issuance.acme","msg":"waiting on internal rate limiter","identifiers":["sb-net.xyz"]}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.639628,"logger":"tls.issuance.acme","msg":"waiting on internal rate limiter","identifiers":["bs.sb-net.xyz"]}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6396573,"logger":"tls.issuance.acme","msg":"done waiting on internal rate limiter","identifiers":["sb-net.xyz"]}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6397495,"logger":"tls.issuance.acme","msg":"done waiting on internal rate limiter","identifiers":["bs.sb-net.xyz"]}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.641867,"logger":"tls.issuance.acme","msg":"waiting on internal rate limiter","identifiers":["www.sb-net.xyz"]}
Aug 22 02:25:45 pi3b caddy[9228]: {"level":"info","ts":1598059545.6419601,"logger":"tls.issuance.acme","msg":"done waiting on internal rate limiter","identifiers":["www.sb-net.xyz"]}
Aug 22 02:25:46 pi3b caddy[9228]: {"level":"info","ts":1598059546.6429644,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
Aug 22 02:25:46 pi3b caddy[9228]: {"level":"info","ts":1598059546.8980398,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"bs.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
Aug 22 02:25:46 pi3b caddy[9228]: {"level":"info","ts":1598059546.9499397,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"www.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
Aug 22 02:27:47 pi3b caddy[9228]: {"level":"error","ts":1598059667.847907,"logger":"tls.obtain","msg":"will retry","error":"[sb-net.xyz] Obtain: [sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15273270/135380327) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":1,"retrying_in":60,"elapsed":122.219834152,"max_duration":2592000}
Aug 22 02:27:48 pi3b caddy[9228]: {"level":"error","ts":1598059668.1990032,"logger":"tls.obtain","msg":"will retry","error":"[bs.sb-net.xyz] Obtain: [bs.sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15273270/135380329) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":1,"retrying_in":60,"elapsed":122.570096542,"max_duration":2592000}
Aug 22 02:27:48 pi3b caddy[9228]: {"level":"error","ts":1598059668.8707266,"logger":"tls.obtain","msg":"will retry","error":"[www.sb-net.xyz] Obtain: [www.sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15273270/135380330) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":1,"retrying_in":60,"elapsed":123.240177297,"max_duration":2592000}
Aug 22 02:28:48 pi3b caddy[9228]: {"level":"info","ts":1598059728.4080448,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
Aug 22 02:28:48 pi3b caddy[9228]: {"level":"info","ts":1598059728.7618463,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"bs.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
Aug 22 02:28:49 pi3b caddy[9228]: {"level":"info","ts":1598059729.4292448,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"www.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
Aug 22 02:30:49 pi3b caddy[9228]: {"level":"error","ts":1598059849.6159885,"logger":"tls.obtain","msg":"will retry","error":"[sb-net.xyz] Obtain: [sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15273270/135381158) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":2,"retrying_in":120,"elapsed":303.987915397,"max_duration":2592000}
Aug 22 02:30:51 pi3b caddy[9228]: {"level":"error","ts":1598059851.4210622,"logger":"tls.obtain","msg":"will retry","error":"[bs.sb-net.xyz] Obtain: [bs.sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15273270/135381162) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":2,"retrying_in":120,"elapsed":305.792153901,"max_duration":2592000}
Aug 22 02:30:52 pi3b caddy[9228]: {"level":"error","ts":1598059852.0089388,"logger":"tls.obtain","msg":"will retry","error":"[www.sb-net.xyz] Obtain: [www.sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15273270/135381167) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":2,"retrying_in":120,"elapsed":306.378390254,"max_duration":2592000}

I would have let that go longer, but if the TXT records are wrong, then it seems pointless. (Also, before adding the bs subdomain, I let it run for another 15-20 min w/ no luck.)

mholt commented 4 years ago

Thanks so much for looking into it; I'm still pretty swamped for another week, so I appreciate it!

And yes, you're right, the subdomains are wrong... seems like the zone lookups are still weird... or maybe something wrong with how the DO API is being used.

francislavoie commented 4 years ago

Can you make sure you're building with the latest commit of the libdns/digitalocean dependency? https://github.com/libdns/digitalocean/commit/f11d70f2506c1a0a9dad574e42d3767e14cb4a39 the fix in the latest commit seems relevant to your issue with the doubled-up domain.

Nevermind I don't think that's the issue, noticed the comments on the issue linked to that commit 👍

mholt commented 4 years ago

@scoobybejesus It was recently brought to my attention that the DigitalOcean DNS provider implementation is errant; and a patch was proposed here: https://github.com/libdns/digitalocean/pull/3

We're in the process of looking into that from the libdns side, but could you please try with that patch and see if the error still occurs? I think there's a high likelihood it could fix it.

scoobybejesus commented 4 years ago

TL;DR - It works!

TL;DR 2 - Something's maybe a little funky. Perhaps some weird caching thing causing staging and prod LE validation to sometimes fail? Or maybe the weird behavior is solely related to being rate limited? I'll get back to this...

I wanted to reply last night, but I kept "getting almost" there and hitting roadblocks. Hopefully the reply is helpful, if verbose.


I cloned delthas' libdns/digitalocean repo, checked out his branch, and built with:

./xcaddy build master --with github.com/caddy-dns/digitalocean --with github.com/libdns/digitalocean=/root/go/src/github.com/digitalocean

My Caddyfile is like so:

{
        email myemail@example.com
}

sb-net.xyz, www.sb-net.xyz {

        root * /usr/local/www/html/
        file_server { index index.html }
        tls  {
                issuer acme {
                        dir https://acme-staging-v02.api.letsencrypt.org/directory
                        resolvers 208.67.222.222:53
                        dns digitalocean MYTOKEN
                }
        }
}

I first built it on a 12.1-release FreeBSD host (not in a jail). Since the custom resolver feature (tls/issuer subdirective, as shown in the Caddyfile above) is now available, I tried running caddy after putting the split DNS override back in my pfSense DNS Resolver. Score! It works! (Both the resolvers and, more importantly, the validation through DO.)


So now it's time to build in a jail. Different box, same version of FreeBSD, and basically the same Caddyfile (but I added another two sites, bs and st). It initially appeared to cooperate similarly. Oddly, www.sb-net.xyz and plain sb-net.xyx worked right away. There was some problem with bs.sb-net.xyz. I didn't see anything special in the log.

st.sb-net.xyz worked right away (cert received, etc). While tweaking the config for st.sb-net.xyz (because the reverse_proxy didn't seem to be cooperating), I decided to start and stop caddy a few times (I thought the Caddyfile would hot reload, but it acted like it didn't, so I stopped and started a few times), and after 15 or 20 minutes the DNS validation for bs.sb-net.xyz succeeded for some reason. Aside from the hiccup, so far so good.


Up til now, I'd been using the acme staging site. I might as well take the training wheels off.

Still in the jail... in the Caddyfile, I commented out the issuer dir for staging (in every tls block, I commented the dir https://acme-staging-v02.api.letsencrypt.org/directory line), and I added a final site (bw), and when firing up caddy again it worked perfectly... almost.

sb-net, www.sb-net, and st.sb-net worked right away. It took a minute or two for bs.sb-net to work, but it did. But then here I am over an hour later still seeing the following in my log for the final local site, bw. (This log is after starting caddy fresh, around an hour later.)

{"level":"info","ts":1600404559.9481888,"msg":"using provided configuration","config_file":"/usr/local/etc/caddy/Caddyfile","config_adapter":"caddyfile"}
{"level":"info","ts":1600404559.9541116,"logger":"admin","msg":"admin endpoint started","address":"tcp/localhost:2019","enforce_origin":false,"origins":["[::1]:2019","127.0.0.1:2019","localhost:2019"]}
{"level":"info","ts":1600404559.9543822,"logger":"tls.cache.maintenance","msg":"started background certificate maintenance","cache":"0xc0005104d0"}
{"level":"info","ts":1600404559.9546177,"logger":"http","msg":"server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS","server_name":"srv0","https_port":443}
{"level":"info","ts":1600404559.9546623,"logger":"http","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv0"}
{"level":"info","ts":1600404559.9555206,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["sb-net.xyz","www.sb-net.xyz","bs.sb-net.xyz","st.sb-net.xyz","bw.sb-net.xyz"]}
{"level":"info","ts":1600404559.9609635,"logger":"tls","msg":"cleaned up storage units"}
{"level":"info","ts":1600404559.9741375,"msg":"autosaved config","file":"/.config/caddy/autosave.json"}
{"level":"info","ts":1600404559.9741611,"msg":"serving initial configuration"}
Successfully started Caddy (pid=40564) - Caddy is running in the background
{"level":"info","ts":1600404559.9848945,"logger":"tls.obtain","msg":"acquiring lock","identifier":"bw.sb-net.xyz"}
{"level":"info","ts":1600404559.9852104,"logger":"tls.obtain","msg":"lock acquired","identifier":"bw.sb-net.xyz"}
{"level":"info","ts":1600404559.9963574,"logger":"tls.issuance.acme","msg":"waiting on internal rate limiter","identifiers":["bw.sb-net.xyz"]}
{"level":"info","ts":1600404559.996411,"logger":"tls.issuance.acme","msg":"done waiting on internal rate limiter","identifiers":["bw.sb-net.xyz"]}
{"level":"info","ts":1600404560.6946976,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"bw.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1600404683.695733,"logger":"tls.obtain","msg":"will retry","error":"[bw.sb-net.xyz] Obtain: [bw.sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-v02.api.letsencrypt.org/acme/order/94058005/5242844224) (ca=https://acme-v02.api.letsencrypt.org/directory)","attempt":1,"retrying_in":60,"elapsed":123.710497714,"max_duration":2592000}
{"level":"info","ts":1600404744.3282156,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"bw.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1600404865.5694358,"logger":"tls.obtain","msg":"will retry","error":"[bw.sb-net.xyz] Obtain: [bw.sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15177295/151206275) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":2,"retrying_in":120,"elapsed":305.584200513,"max_duration":2592000}
{"level":"info","ts":1600404986.0740097,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"bw.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1600405108.5031276,"logger":"tls.obtain","msg":"will retry","error":"[bw.sb-net.xyz] Obtain: [bw.sb-net.xyz] solving challenges: waiting for solver *certmagic.DNS01Solver to be ready: timed out waiting for record to fully propagate; verify DNS provider configuration is correct - last error: <nil> (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15177295/151207337) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":3,"retrying_in":120,"elapsed":548.517891763,"max_duration":2592000}
{"level":"info","ts":1600405229.097423,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"bw.sb-net.xyz","challenge_type":"dns-01","ca":"https://acme-staging-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1600405339.832697,"logger":"tls.issuance.acme.acme_client","msg":"challenge failed","identifier":"bw.sb-net.xyz","challenge_type":"dns-01","status_code":403,"problem_type":"urn:ietf:params:acme:error:unauthorized","error":"Incorrect TXT record \"TzTRQMhUpexnr_eKUUfSmXEppDCEccQiJZQ_mZe2TSg\" found at _acme-challenge.bw.sb-net.xyz"}
{"level":"error","ts":1600405339.8327549,"logger":"tls.issuance.acme.acme_client","msg":"validating authorization","identifier":"bw.sb-net.xyz","error":"authorization failed: HTTP 403 urn:ietf:params:acme:error:unauthorized - Incorrect TXT record \"TzTRQMhUpexnr_eKUUfSmXEppDCEccQiJZQ_mZe2TSg\" found at _acme-challenge.bw.sb-net.xyz","order":"https://acme-staging-v02.api.letsencrypt.org/acme/order/15177295/151208418","attempt":1,"max_attempts":3}
{"level":"error","ts":1600405341.1280332,"logger":"tls.obtain","msg":"will retry","error":"[bw.sb-net.xyz] Obtain: [bw.sb-net.xyz] solving challenges: bw.sb-net.xyz: no solvers available for remaining challenges (configured=[dns-01] offered=[http-01 dns-01 tls-alpn-01] remaining=[http-01 tls-alpn-01]) (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/15177295/151208942) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)","attempt":4,"retrying_in":300,"elapsed":781.142798041,"max_duration":2592000}

Note that the first try is through the regular acme server, and the subsequent tries are through staging.

I must put this down. It's bedtime.
/me goes to sleep


I just tried again this morning. I started caddy, refreshed my DO console, and caddy had written a TXT record that returns "vwisep..." and then I get an error in the log with the different TXT record value:

{"level":"error","ts":1600433436.855171,"logger":"tls.issuance.acme.acme_client","msg":"challenge failed","identifier":"bw.sb-net.xyz","challenge_type":"dns-01","status_code":403,"problem_type":"urn:ietf:params:acme:error:unauthorized","error":"Incorrect TXT record \"XTW8UZqEmX81t6zSXzn_IC8YaYoj5A9FS-PQtMWQhy0\" found at _acme-challenge.bw.sb-net.xyz"}
{"level":"error","ts":1600433436.855368,"logger":"tls.issuance.acme.acme_client","msg":"validating authorization","identifier":"bw.sb-net.xyz","error":"authorization failed: HTTP 403 urn:ietf:params:acme:error:unauthorized - Incorrect TXT record \"XTW8UZqEmX81t6zSXzn_IC8YaYoj5A9FS-PQtMWQhy0\" found at _acme-challenge.bw.sb-net.xyz","order":"https://acme-v02.api.letsencrypt.org/acme/order/94058005/5248329879","attempt":1,"max_attempts":3}

Why would the TXT record not match? It literally wrote the record, did a check on it, and the check returned a different value than it expected? That seems odd. Is it supposed to try once with normal acme, and then revert to staging if it fails on the first try? Do you know how I can see why it might be failing? The tls {...} portions of the Caddyfile are all identical.

I wish I could have just said "it works perfectly!"

/me goes downstairs to work


/me takes a break from work

Trying again later, when I change the Caddyfile back to staging (i.e., when I uncomment the dir https://acme-staging-v02.api.letsencrypt.org/directory line for bw, then the validation works fine, and I've got the fake LE cert successfully. But then I comment that line out again and it fails again.

/me goes back to work


More info. I just noticed that my bs site has a "real" LE cert, but the st, www, and plain sb-net.xyz all have fake certs. Okay, new plan. Go into /.local/share/caddy/certificates and delete everything inside the acme-staging{...} directory. Re-run caddy. Results?

sb-net.xyz, www.sb-net.xyz, st.sb-net.xyz, and bs.sb-net.xyz validated right away and have real LE certs, but bw.sb-net.xyz failed to validate again. I stopped caddy and left for 20 min. I came back and ran it again and it worked (bw.sb-net.xyz validated just fine).

So we're all good!

Perhaps some of my issues were regarding being rate-limited. Perhaps there was some caching along the way by caddy that made the behaviour confusing. I don't know. And I wish I could give you better data than this. But long story short, this patch appears to be exactly what I needed, because I'm up and running!