Open Jonher937 opened 4 days ago
Hello,
you can already do that:
--dns.resolvers allows defining the DNS used to check the propagation
--dns.disable-cp allows to skip the "classic" authoritative NSs.

We have tried the disable-cp option: By setting this flag to true, disables the need to await propagation of the TXT record to all authoritative name servers.
Our problem is the reverse: the authoritative name servers reply correctly and do so straight away, but the servers the PKI uses for validation are slow to serve the record.
What we need to do is either:
Use the dns.propagation-wait option and hope that it has propagated to the DNS-service used by the PKI to do the TXT lookup

The --dns.resolvers option documents it's only used to:
Set the resolvers to use for performing (recursive) CNAME resolving and apex domain determination. For DNS-01 challenge verification, the authoritative DNS server is queried directly. (source for the quote)
I tried using the options you suggested but it does not wait for any propagation at all, unless of course I use --dns.propagation-wait in addition to those values.
We need to verify that the propagation has occurred on the non-authoritative DNS-service before telling the PKI it's ready to do the DNS validation. If we don't, the PKI will check with a DNS-service which does not yet know about this TXT value, and it will fail the issuing of the certificate.
Today we looked into why cert-manager has had more success and it looks like cert-manager has this option: dns01-recursive-nameservers-only documented here
--dns01-recursive-nameservers-only Forces cert-manager to only use the recursive nameservers for verification. Enabling this option could cause the DNS01 self check to take longer due to caching performed by the recursive nameservers.
With this flag set, s.Context.DNS01CheckAuthoritative would be false, which means util.PreCheckDNS will use the user-specified nameservers instead of the authoritative ones, just like we would want with LEGO.
It makes sense to me to actually verify the TXT record on the provided recursive nameserver, to minimize errors and load on the PKI. It feels like a neater solution than the recently implemented --dns.propagation-wait flag.
I guess it's not needed in a regular setup (even with split-horizon DNS) if the PKI queries the authoritative nameserver, but otherwise there's a high risk that the PKI would fail the challenge because the zone has not refreshed on the recursor when the ACME client claims it's ready for challenge verification.
The --dns.resolvers option documents it's only used to:
Set the resolvers to use for performing (recursive) CNAME resolving and apex domain determination. For DNS-01 challenge verification, the authoritative DNS server is queried directly. (source for the quote)
This option is not only used for zone detection and CNAME resolving, it's also used during propagation checks.
The NSs from --dns.resolvers are used first, before the authoritative NSs, but there is a difference with the authoritative NSs: lego browses all the resolvers and continues the process if at least one resolver returns a successful answer.
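The difference between the two checks can be sketched as "any" versus "all". This is an illustration only, not lego's implementation: `hasRecord`, `anyServes` and `allServe` are hypothetical names, and the addresses are placeholders.

```go
package main

import "fmt"

// hasRecord reports whether one nameserver currently serves the expected TXT value.
type hasRecord func(nameserver string) bool

// anyServes mirrors the behaviour described for the resolvers:
// a single successful answer is enough to continue.
func anyServes(nameservers []string, check hasRecord) bool {
	for _, ns := range nameservers {
		if check(ns) {
			return true
		}
	}
	return false
}

// allServe mirrors the stricter check: every listed nameserver must answer
// correctly. Note that it returns true for an empty list.
func allServe(nameservers []string, check hasRecord) bool {
	for _, ns := range nameservers {
		if !check(ns) {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical state: only one of the two recursive resolvers has refreshed.
	refreshed := map[string]bool{"10.0.0.53:53": true, "10.0.1.53:53": false}
	check := func(ns string) bool { return refreshed[ns] }
	resolvers := []string{"10.0.0.53:53", "10.0.1.53:53"}

	fmt.Println("any:", anyServes(resolvers, check))
	fmt.Println("all:", allServe(resolvers, check))
}
```

With the "any" rule, the process continues even though the second resolver is stale, which is exactly the case where a PKI querying that resolver would still fail the challenge.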
This is why I said that --dns.resolvers + --dns.disable-cp will do the same thing as your proposal.
But as I also said, lego will not check all the --dns.resolvers during the propagation check.
I thought about this issue, and I found 2 solutions:
The second option is better because it allows checking all the recursive NSs and all the authoritative NSs.
With this option lego can have several interesting combinations:
dns.disable-propagation-ans => check only one recursive NSs
dns.propagation-rns => check all the recursive NSs and all the authoritative NSs
dns.propagation-rns + dns.disable-propagation-ans (aka dns.disable-cp) => check only recursive NSs

This can be changed in the future (inside a major version) to dns.disable-propagation-ans and dns.disable-propagation-rns and by default checking all the recursive NSs and all the authoritative NSs.
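The combinations above can be written down as a small decision table. This is a sketch of the proposal as I read it, not the code of the PR: `plan` and `propagationPlan` are hypothetical names, "check only recursive NSs" is interpreted as all of them, and the case with neither flag set is an assumption based on the current behaviour described earlier (one successful recursive answer, then all the authoritative NSs).

```go
package main

import "fmt"

// plan describes which propagation checks would run.
type plan struct {
	AllRecursive  bool // true: every recursive NS must answer; false: one success is enough
	Authoritative bool // true: all the authoritative NSs are checked as well
}

// propagationPlan maps the two proposed flags to a plan.
func propagationPlan(propagationRNS, disablePropagationANS bool) plan {
	return plan{
		AllRecursive:  propagationRNS,
		Authoritative: !disablePropagationANS,
	}
}

func main() {
	for _, rns := range []bool{false, true} {
		for _, disableANS := range []bool{false, true} {
			p := propagationPlan(rns, disableANS)
			fmt.Printf("propagation-rns=%t disable-propagation-ans=%t => all recursive: %t, authoritative: %t\n",
				rns, disableANS, p.AllRecursive, p.Authoritative)
		}
	}
}
```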
IMHO, the migration path will be easier with this approach.
I opened PR #2284
Welcome
How do you use lego?
Binary and Traefik.
Detailed Description
Hi, we've hit an issue using the DNS-01 challenge with EAB and CNAME delegation, with EJBCA as the CA. This deployment scenario is probably going to become more common as ACME is being adopted in enterprise environments.
I have included a diagram to try and explain our interpretation of the current flow, and where we see an issue.
We set --dns.resolvers so that it points to the same DNS-server as the PKI will query for the challenge. These servers are used by lego to find the authoritative servers only, as specified here. It finds the authoritative servers and can successfully query them to verify they serve the TXT record with the correct value.

So now to the problem:
Essentially we'd want something like this (behind a flag such as --dns.propagation-servers of []string type) to specify what additional servers we'd want to append to the checkAuthoritativeNss function.
https://github.com/Jonher937/lego/blob/a152249a1a02146604936099d3bf1a9d13999280/challenge/dns01/precheck.go#L76-L88
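As a rough illustration of the idea (not the linked commit), the extra servers could simply be merged into the list that gets checked. `serversToCheck` and `withDefaultPort` are hypothetical helpers, the flag itself does not exist in lego, and the addresses are placeholders.

```go
package main

import (
	"fmt"
	"net"
)

// withDefaultPort appends port 53 when the server has no port yet.
func withDefaultPort(server string) string {
	if _, _, err := net.SplitHostPort(server); err == nil {
		return server
	}
	return net.JoinHostPort(server, "53")
}

// serversToCheck returns the authoritative nameservers followed by the
// user-supplied extra servers, normalised to host:port and de-duplicated.
func serversToCheck(authoritative, extra []string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, group := range [][]string{authoritative, extra} {
		for _, s := range group {
			addr := withDefaultPort(s)
			if !seen[addr] {
				seen[addr] = true
				out = append(out, addr)
			}
		}
	}
	return out
}

func main() {
	authoritative := []string{"ns1.example.com.", "ns2.example.com."}
	// Values a user might pass through the proposed --dns.propagation-servers flag.
	extra := []string{"10.0.0.53", "10.0.1.53:5353"}

	for _, addr := range serversToCheck(authoritative, extra) {
		fmt.Println(addr)
	}
}
```

Every address in the resulting list would then have to serve the TXT record before the client tells the PKI that the challenge is ready.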
Full commit can be found here and acts as an example. I have not found a good way to pass down a flag this deep into the process, but this issue is created as a topic for discussing how and if this could be implemented.
In the end our issue comes down to the DNS topology/implementation and slow (60+ seconds) propagation to the company DNS-service which the PKI verifies against.
We have also tried the newly added --dns.propagation-wait option and have successfully managed to obtain certificates if we set it high enough for the propagation to happen, but this might randomly fail if the zone update has not yet made its way to the company DNS-service.