Open burdakovd opened 6 years ago
How frequently is the server migrated to another machine? Even if you solve the ACME-DNS problem, you may start running into Let's Encrypt's rate limits if the migration happens frequently and you're creating a new certificate every time.
Personally I'd consider including the acme-dns credentials (both from the acme.sh client and ACME-DNS database) as part of your server's base configuration. Those values are static and don't need to change, correct?
Also, are you using wildcard certs? If not, it might be better to use the HTTP validation method instead; that would likely be much simpler for this use case.
Yeah, I'm using wildcard certificates. I.e. one nginx instance is handling the following addresses, so I'm generating a multi-certificate that includes all of them:
Migration normally happens very rarely; however, during the initial development stages it may happen more often (e.g. several times a week).
The reason I don't want to migrate data is that I am trying to keep machines autonomous and transparent: I start an AWS instance from a known AMI with small bootstrap user data (which is made public), and then it runs on its own without giving a human SSH access, and with very limited ways of changing its state from outside. If I need to change the code, I provision a new machine that can be audited as well and cannot be tampered with after that. During the initial development iterations, reprovisioning may happen more often.
If I start migrating some data between machines, they won't be starting from a clean slate anymore, and it may be a bit harder to audit their state and to keep the authentication data secret while maintaining visibility into what has been deployed.
I have run into some Let's Encrypt limits recently, but I think that will get resolved once the provisioning script stabilizes and I no longer need to re-run it that often.
I may consider setting up some protocol for supplying a machine with an initial certificate without risking it leaking, or for letting the old machine transfer it to the new one, but that may require some redesign.
In the meantime, I ended up writing a dumb Python DNS server that just responds to all DNS requests with TXT records from a supplied file.
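For readers wondering what such a server involves, here is a minimal sketch. It is not the code from this comment, just one way to do it with the standard library, assuming one TXT value per line in the file, UDP only, a single question per query, and no check of the query type. The names `build_txt_response` and `serve` are made up for illustration.

```python
import socketserver
import struct


def question_section(query: bytes) -> bytes:
    """Return the raw question section of a single-question DNS query."""
    offset = 12  # the DNS header is always 12 bytes
    while query[offset] != 0:  # walk the length-prefixed labels of QNAME
        offset += query[offset] + 1
    offset += 1 + 4  # terminating zero byte, then QTYPE and QCLASS
    return query[12:offset]


def build_txt_response(query: bytes, txt_values: list[str]) -> bytes:
    """Answer any query with every TXT value we hold, whatever was asked."""
    question = question_section(query)
    header = struct.pack(
        "!HHHHHH",
        struct.unpack("!H", query[:2])[0],  # copy the transaction ID
        0x8400,  # QR=1 (response), AA=1 (authoritative)
        1,  # one question echoed back
        len(txt_values),  # one answer per stored value
        0,
        0,
    )
    answers = b""
    for value in txt_values:
        data = value.encode("ascii")
        rdata = bytes([len(data)]) + data  # character-string: length byte + text
        # 0xC00C points back at the name in the question; 16 = TXT, 1 = IN, TTL 1
        answers += struct.pack("!HHHIH", 0xC00C, 16, 1, 1, len(rdata)) + rdata
    return header + question + answers


def serve(txt_path: str, host: str = "0.0.0.0", port: int = 53) -> None:
    """Serve the TXT values in txt_path over UDP, re-reading the file per query."""

    class Handler(socketserver.BaseRequestHandler):
        def handle(self):
            data, sock = self.request  # UDP handlers receive (payload, socket)
            with open(txt_path) as f:
                values = [line.strip() for line in f if line.strip()]
            sock.sendto(build_txt_response(data, values), self.client_address)

    with socketserver.UDPServer((host, port), Handler) as server:
        server.serve_forever()
```

Calling `serve("challenges.txt")` (a hypothetical file name) would start it; binding port 53 normally requires elevated privileges. Because the file is re-read on every query, the ACME client only has to append the challenge values before asking for validation.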
The acme-dns credentials don't need to be secret though, right? You're only allowing access to the API through localhost, so couldn't those credentials be part of the bootstrap data?
@Ajedi32 I see what you are saying.
That would mean I need to commit them to the repository, which is OK I think. However, it would also mean I'd need to somehow prepare (and store in the repository) a version of the acme-dns database that has that account registered. (Otherwise, if I store just the account data from acme.sh, acme-dns won't recognize that account after redeployment.)
Yeah, it's a somewhat awkward solution but it should work.
A config option in acme-dns to disable authentication might be cleaner.
@burdakovd What did you end up doing?
Hi, Please be aware -- the LIMIT 2 in db.go broke for my employer also with renewals. Our fix was to remove the LIMIT 2 and then our renewal process worked again.
Kind regards, rbastic
I got bit by this issue as well while attempting to generate several certificates through cert-manager. I ended up just deleting all of my Certificate resources and applying them back to my Kubernetes cluster one at a time so that the 2 txt record response limit didn't prevent cert-manager from retrieving the correct corresponding value for each certificate's challenge.
Is there a technical or dns spec reason for only having two txt record responses at a time?
This is so cursed IMHO:
https://github.com/joohoi/acme-dns/blob/68bb6ab654b6fb1fe375e08807688c55621513a2/db.go#L168-L169
https://github.com/joohoi/acme-dns/blob/68bb6ab654b6fb1fe375e08807688c55621513a2/db.go#L256
Consider at least using a config variable for this instead of hardcoded limits.
While the implementation might well be cursed, the feature is intentional.
It is there to ensure the first priority and design goal of this project: security and limiting the impact of a box getting compromised.
The limit would be one, but there's a legitimate use case for needing two records to share credentials, and that is having a wildcard name in addition to the "root" name. This is because the ACME validation record name for the wildcard is the same as for the root domain itself.
So, why not just add a setting to the config? Sometimes it would be very helpful, because we have domains like "*.sub.domain.tld", "*.domain.tld", "domain.tld".
If anyone is interested, I made a config option for this: https://github.com/krigga/acme-dns/tree/config-txt-number. Please note, however, that I never ran the code I added, so use it at your own risk.
I would still argue that the correct behavior would be to keep the limit and to use a unique acme-dns credential set for every unique (sub)domain in the SAN list. The client implementation should support this.
The main point is to make it hard for users to end up with an insecure (or suboptimal) setup. Another point is to avoid ending up in really hard to debug situations where for example the DNS provider or service is unable to serve a large amount of TXT records for a single request.
I think Let's Encrypt supports up to 100 domain names in the SAN list of a single certificate, and the server would need to respond with all 100 of those TXT records, 100 times over. UDP DNS responses have size limits after which the packets get fragmented, and I would assume this would have to be remedied by an arbitrary limit again.
The limit is in place to keep things simple at large scale.
For those who feel they need multi-value TXT records: is the motivation for stuffing TXT records primarily to avoid multiple acme-dns registrations, or is it to avoid re-registration after a migration? Anything else?
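To put rough numbers on the size concern, here is a back-of-the-envelope sketch. It assumes every TXT value is a 43-character dns-01 token (a base64url-encoded SHA-256 digest), that each answer refers to the question name with a 2-byte compression pointer, and that no EDNS record is counted; `txt_response_size` is a made-up helper name.

```python
HEADER_BYTES = 12  # fixed DNS header
TOKEN_CHARS = 43  # base64url-encoded SHA-256 digest, no padding


def txt_response_size(qname: str, record_count: int) -> int:
    """Estimate the wire size of a response carrying record_count TXT answers."""
    # Encoded name = one length byte per label plus the root byte,
    # which works out to len(name) + 2; then QTYPE and QCLASS (2 bytes each).
    question = len(qname.rstrip(".")) + 2 + 4
    # Per answer: name pointer, TYPE, CLASS, TTL, RDLENGTH, length byte, token.
    per_record = 2 + 2 + 2 + 4 + 2 + 1 + TOKEN_CHARS
    return HEADER_BYTES + question + record_count * per_record


print(txt_response_size("_acme-challenge.example.com", 2))    # prints 157
print(txt_response_size("_acme-challenge.example.com", 100))  # prints 5645
```

Two values fit comfortably within the classic 512-byte UDP message limit from RFC 1035. A hundred values do not, and they also exceed the EDNS buffer sizes resolvers commonly advertise (such as 1232 or 4096 bytes), so the answer would be truncated and retried over TCP.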
A couple of techniques you can use:
Host your acme-dns database in a separate/cloud instance so that the registration data survives any migration.
CNAME your _acme-challenge records to an intermediate zone that supports a scripting API (e.g. Google Cloud DNS, Route53, etc.), which in turn points to your final acme-dns TXT records. That way you can always update the intermediate zone without affecting your original _acme-challenge CNAMEs (some of which may be in customer domains or ones that are more difficult to update).
If the issue is more around the annoyance/frequency of the initial registration per domain, you could adapt your acme-dns client to auto-create the initial CNAME in your domains using a script. This is for larger scale deployments that can afford the time to develop automations. Obviously, if you can already script DNS updates you could just skip acme-dns, but it still offers a least-privilege/least-responsibility approach for frequent TXT record updates. acme-dns is simple enough in terms of API that you can also build a custom implementation fairly quickly (i.e. one developer for 1-2 weeks) to support your specific use case. Again, this starts to make sense if you are scaling a larger system. I built a custom implementation last week using Cloudflare Workers and Google DNS for an intermediate CNAME zone that in turn can point to the TXT records (hosted in any zone or service).
It's probably bad practice, but my original intent was to issue root cert and wildcard cert for two different domains that my webserver was going to be handling.
Because I now know there is a limitation of two TXT records per cert, I have found another way of doing what I want (generating a new cert for each virtualhost that needs two subdomains).
Perhaps if you're not willing to make it a configuration option, at least put it in the documentation that this limit exists so others don't end up here by googling why it's not working. :)
@colwellkr A (Let's Encrypt) cert can have up to 100 domains, including wildcards etc. You then need to prove that you control each domain before they will issue the certificate. To do this with acme-dns you need to register once with the acme-dns service for each domain and create the required CNAME in DNS. Then, subsequent updates set the TXT record (per domain) on the acme-dns service, and Let's Encrypt can follow each _acme-challenge CNAME and see that you have completed the challenge (via acme-dns).
There is no upper limit to how many domains acme-dns can handle; it doesn't care. For you it's just that you need to set up an initial CNAME for each domain, and after that there's no more work to do.
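As a rough illustration of the per-domain flow, the sketch below builds (but does not send) the update calls for a certificate covering two domains, each with its own registration. The `/update` path, the `X-Api-User` and `X-Api-Key` headers, and the `subdomain`/`txt` body fields follow the acme-dns API; the base URL, the account values, and the helper names are placeholders invented for the example.

```python
import json
import urllib.request


def update_request(api_base: str, account: dict, txt: str) -> urllib.request.Request:
    """Build the POST /update call that sets one TXT value for one registration."""
    body = json.dumps({"subdomain": account["subdomain"], "txt": txt}).encode()
    return urllib.request.Request(
        f"{api_base}/update",
        data=body,
        method="POST",
        headers={"X-Api-User": account["username"], "X-Api-Key": account["password"]},
    )


def requests_for_order(api_base, accounts, challenges):
    """Pair each domain's challenge value with that domain's own registration."""
    return [update_request(api_base, accounts[domain], txt) for domain, txt in challenges]


# Hypothetical registrations, one per domain, as returned by POST /register.
accounts = {
    "example.com": {"username": "user-a", "password": "key-a", "subdomain": "sub-a"},
    "foo.bar": {"username": "user-b", "password": "key-b", "subdomain": "sub-b"},
}
# One challenge value per name on the certificate (placeholder tokens).
challenges = [("example.com", "token-1"), ("foo.bar", "token-2")]

for req in requests_for_order("https://auth.example.org", accounts, challenges):
    print(req.get_method(), req.full_url, req.data.decode())
```

Each request could then be sent with `urllib.request.urlopen(req)`. Since every domain writes to its own subdomain, no registration ever holds more than the values for its own name.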
The trick with wildcard vs domain (*.example.com vs example.com) is just that ACME says they both need to validate using the same TXT record (which I consider to be a defect in the original dns-01 idea, but it doesn't matter too much). So that's the only reason acme-dns needs to support 2 values for TXT records.
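A small sketch of that naming rule, in case it helps: for dns-01 the wildcard prefix is dropped before `_acme-challenge.` is prepended, so the wildcard and the bare domain land on the same record. The function names are made up for illustration.

```python
from collections import Counter


def challenge_record_name(identifier: str) -> str:
    """Map a certificate identifier to the dns-01 TXT record name that validates it."""
    base = identifier[2:] if identifier.startswith("*.") else identifier
    return "_acme-challenge." + base.rstrip(".")


def values_needed(identifiers: list[str]) -> Counter:
    """Count how many TXT values each validation record must hold at once."""
    return Counter(challenge_record_name(name) for name in identifiers)


needed = values_needed(["example.com", "*.example.com", "foo.example.com"])
for record, count in needed.items():
    print(record, count)
# prints:
# _acme-challenge.example.com 2
# _acme-challenge.foo.example.com 1
```

With one registration per record name, no record ever needs more than two values at the same time.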
Arguments regarding the number of TXT record values sometimes seem to be attempting to stuff a TXT record with all the possible challenge responses you'll ever need for any of your domains, with the aim of skipping the per-domain registration step. However, even if you do that, you'd still need to set up an _acme-challenge CNAME on every domain/subdomain pointing to a TXT record anyway.
I had all of the CNAMES set up correctly, the problem was the TXT records. I use the acme.sh client most of the time, so the command I was running was:
acme.sh --issue --dns dns_acmedns -d example.com -d *.example.com -d foo.bar -d *.foo.bar
And acme.sh would set the TXT record for example.com, then set for *.example.com, and finally for *.foo.bar. The problem is that only the last two domains entered would validate because the only TXT records the server was holding were the ones for the last two.
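The effect can be reproduced with a toy model. This is not acme-dns's code, only a simplified stand-in that assumes a registration keeps the two most recently set values and that the client sets one value per name in command-line order.

```python
from collections import deque


class RecentTxtStore:
    """Toy model of a TXT store that keeps only the most recent `slots` values."""

    def __init__(self, slots: int = 2) -> None:
        self._values = deque(maxlen=slots)  # the oldest value is dropped on overflow

    def update(self, txt: str) -> None:
        self._values.append(txt)

    def answer(self) -> list[str]:
        return list(self._values)


def issue(domains: list[str], store: RecentTxtStore) -> list[str]:
    """Set one challenge value per domain, then report which domains still validate."""
    tokens = {domain: f"token-{i}" for i, domain in enumerate(domains)}
    for token in tokens.values():
        store.update(token)
    served = set(store.answer())
    return [domain for domain, token in tokens.items() if token in served]


# All four names share one registration: only the last two survive.
print(issue(["example.com", "*.example.com", "foo.bar", "*.foo.bar"], RecentTxtStore()))
# prints ['foo.bar', '*.foo.bar']

# One registration per base domain: both names validate.
print(issue(["example.com", "*.example.com"], RecentTxtStore()))
# prints ['example.com', '*.example.com']
```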
So my solution is just going to be to issue a cert for every virtual host where I need two domains:
i.e. -d vhost.example.com -d vhost.foo.bar
@colwellkr ah that's a limitation of acme.sh, it gets you to set your registration (which is domain specific) as an environment variable and I think it can only do one at a time. Some acme clients (such as my own https://certifytheweb.com) support multiple authorization configs, so you can mix the config settings as required in order to compile a cert with multiple validation methods. You can probably do it with certbot as well but I don't know the specific command.
Even if acme.sh supported multiple authorization configs, I would still want to use only a single one. This is just an unnecessary limitation. Keep the default at 2 but make it configurable. If security is the only concern, then hardcoding it to 2 won't solve anything. If someone gains access to modify the config, acme-dns will be a minor problem, and they could just start up another DNS server.
Anyway: Thanks for your work. It's awesome despite that limitation.
In my scenario acme-dns is hosted on the same machine as the http server that requests certificate, so it can renew certificates automatically forever (with acme credentials stored on local disk).
However, whenever the whole server is migrated to another machine, subdomain changes unless I migrate the local auth data that those two services established between each other on old machine.
Since the subdomain changes, I need to manually update _acme-challenge to the newly generated subdomain so that acme-dns on the new machine can receive DNS requests. Since I don't want to update many records, I reuse the same acme-dns subdomain for multiple domains in a certificate. I.e. I point _acme-challenge of multiple domains (e.g. foo.example.com, bar.example.com, example.com, *.example.com) to a single acme-challenge.example.com, and then acme-challenge.example.com to long-acme-generated-domain.acme-dns.example.com. In that case, when a machine is migrated and the generated subdomain changes, I only need to update one mapping (from acme-challenge.example.com to long-acme-generated-domain.acme-dns.example.com), not each individual subdomain that the certificates are generated for.
However, it appears that in that case the ACME challenge does not succeed reliably, as new records override old ones, and when I want to verify, say, 5 domains, only two get valid records.
So I guess several things could be done to make it better:
acme.sh) to update all txt records, and then verify all domains.
I guess in my case (again, local mode), I'd be happy with even simpler case:
It seems in my case the complexity of acme-dns (which was designed to be deployed publicly and used by many clients) gets in the way.