nodecraft / acme-dns-01-cloudflare

Cloudflare DNS for Let's Encrypt / ACME dns-01 challenges with Greenlock.js and ACME.js
MIT License
16 stars 5 forks source link

Fails to resolve challenge when requesting wildcard of subdomain #1

Closed RevolvingDCON closed 5 years ago

RevolvingDCON commented 5 years ago

Hi guys! When requesting the format *.sub.example.com the challenge appears to fail with the following exception:

{ Error: queryTxt ENOTFOUND _acme-challenge.dev.mydomain.io
    at QueryReqWrap.onresolve [as oncomplete] (dns.js:196:19)
  errno: 'ENOTFOUND',
  code: 'ENOTFOUND',
  syscall: 'queryTxt',
  hostname: '_acme-challenge.dev.mydomain.io' }
Waiting for 10000 ms before attempting propagation verification retry 1 / 30.

Error: Could not verify DNS for _acme-challenge.dev.mydomain.io
    at Function.verifyPropagation (C:\DStack\apps\mydomain\ssl\development\node_modules\acme-dns-01-cloudflare\index.js:121:12)
Waiting for 10000 ms before attempting propagation verification retry 2 / 30.

[acme-v2] handled(?) rejection as errback:
Error: [acme-v2] (E_STATE_INVALID) challenge state for '*.dev.mydomain.io': 'invalid'
    at C:\DStack\apps\mydomain\ssl\development\node_modules\acme-v2\node.js:518:29
    at process._tickCallback (internal/process/next_tick.js:68:7)

Reporting here due to errback stating an issue with the 'challenge state'

Edit: Only appears to fail when requesting; dev.mydomain.io,*.dev.mydomain.io

Appears to get stuck adding acme-challenge.dev.mydomain.io

coolaj86 commented 5 years ago

I can't say for certain because I haven't tested this module yet, but I think that this is due to the TXT record being set once, cached, set again, and then being pulled from cache rather than getting the correct record.

If that's the problem I do have a fix in the works, but I probably won't be able to get to it until next week.

This week I'm working on updating the dns-01 test harness so that things go more smoothly and get caught sooner.

RevolvingDCON commented 5 years ago

Just done some research, it appears your assumption is correct, once the acme-challenge is set it pulls the existing (incorrect) record from the cache and assumes the test has passed. Potential fix could be checking the current record with the expected one?

Also the dns-01 test harness looks awesome! Will this project be deprecated in favor of the new one?

coolaj86 commented 5 years ago

No, it’s just an order-of-operations bug for the specific use case of bare domain + wildcard, since they use the records of the same name.

Instead of

It needs to be

I have this fixed in another branch of the acme.js library, but basically there’s a just nasty merge conflict I have to sort through.

RevolvingDCON commented 5 years ago

Oh interesting. Glad you found the source of the bug!

Cherry commented 5 years ago

Hi all, thanks so much for investigating this!

@solderjs I'll be sure to update this to support https://git.rootprojects.org/root/acme-dns-01-test.js instead of https://git.rootprojects.org/root/acme-challenge-test.js soon. 👍

coolaj86 commented 5 years ago

@Cherry Doing that will be great, but you shouldn't have to change anything. I need to get a nasty manual merge done in acme.js and I think this will be resolved.

I did update the tests with significant improvements, but the repo name change is just because the documentation for the http-01 tests vs the dns-01 tests was diverging more and more, becoming confusing.

Also, the tests are just slightly ahead of ACME.js. They require an optimization which isn't yet handled in ACME.js. I should have the new version published tonight.

I may start on nasty merge tomorrow, but it probably won't be done tomorrow.

Cherry commented 5 years ago

@solderjs Ah great - I added the new zones function in the latest release anyway, which seems to be the only necessary addition for tests.

I'll await your notice for when that nasty merge is done and we can go ahead and verify this is resolved and close it out. Thanks again for taking a look at this.

coolaj86 commented 5 years ago

@Cherry Great. Once I update ACME.js you'll get dnsZone and dnsPrefix, which supplant dnsHost. With that you'll be able to remove the extraneous fetches.

Then the list of zones will be fetched only once per certificate rather than 2 * num-domains times.

Cherry commented 5 years ago

@solderjs Fantastic. I'll keep an eye out for that update. 👍

coolaj86 commented 5 years ago

I've published the new versions after testing various combinations with acme-dns-01-digitalocean. I fixed a regression as well.

If in approveDomains you set opts.challenges = { 'dns-01': whateverDns01 };, now it will actually overwrite the defaults and work as expected.

• https://git.coolaj86.com/coolaj86/acme-v2.js/pulls/25 • https://git.coolaj86.com/coolaj86/greenlock.js/pulls/40

These don't fix the out-of-order cache bug when simultaneously setting a wildcard and bare domain, but that's coming up next on my list real soon.

coolaj86 commented 5 years ago

Actually...

Screen Shot 2019-06-14 at 3 49 10 AM

It is working for me with the latest versions with Digital Ocean DNS. I expect it should be working for CloudFlare as well. I'd be interested to hear your experiences.

Cherry commented 5 years ago

Nice!

Unless I'm missing something though, I still appear to be seeing an out-of-order bug, where I'm seeing the correct data for a split-second, before it's being overridden and set to a different value, preventing the challenge from ever validating. This seems to be preventing acme-dns-01-test from passing, but I'm able to generate certs for non just the bare domain and subdomains without issue.

coolaj86 commented 5 years ago

Okay, I don't read BabelScript very well, but it looks like we've got two easy-to-spot problems:

async zones(args, callback)

We're doing a mix and match with the Promise API and the callback API. Probably not a good idea.

We should have just zones(args) and return a value or a Promise. I don't know what behavior to expect when using both, but it wasn't designed to do that.

// update existing record

https://github.com/nodecraft/acme-dns-01-cloudflare/blob/master/index.js#L43

It looks like we're fetching the list of TXT record and setting over the same one. That's definitely going to cause problems because *.example.com and example.com must use the same TXT record host, but must have a separate record for each.

Cherry commented 5 years ago

Thanks for the quick review.

I definitely do plan to have everything return promises rather than mix promises and callbacks. I was modelling code from other libraries and haven't gone back and tidied everything up yet. It doesn't appear to cause issues, but this is definitely on my to-do list.

That's definitely going to cause problems because *.example.com and example.com must use the same TXT record host, but must have a separate record for each.

Oh, I didn't realise that. I thought there'd never be multiple records by the same name, and always be 2 records, _4c-acme-challenge-03.*.example.com and _4c-acme-challenge-03.example.com for example?

coolaj86 commented 5 years ago
_4c-acme-challenge-03.*.example.com

Is not a valid TXT record because * is a special character.

They could have chosen something like _acme-challenge.<rand>.example.com, but instead they chose to do _acme-challenge.example.com for wildcard records.

Cherry commented 5 years ago

Right, okay, that makes sense. I'll spend some time testing and seeing what I can do to get this working. Thanks!

coolaj86 commented 5 years ago

I was modeling code from other libraries

Using cb is valid and will continue to be around in the future.

I use cb in things like the CLI example because it's dealing with stdin stream, which is annoying to promisify - and I find new Promise() to be error prone, such as when accidentally returning another promise rather than calling resolve(), for example.

So cb is a good thing to use, but only when it is, y'know?

Here's another implementation, for reference:

Cherry commented 5 years ago

Thanks for all the help @solderjs.

I've just committed some work towards a 1.0.0 release, including the fixes discussed here. It seems to almost complete, but then fails in acme-challenge-test at:

TypeError: Cannot read property 'dnsAuthorization' of null
    at F:\GitHub\acme-dns-01-cloudflare\node_modules\acme-challenge-test\index.js:175:31

Looking at https://git.rootprojects.org/root/acme-challenge-test.js/src/branch/master/index.js#L175, it appears that opts.challenge (ch) is somehow null here?

coolaj86 commented 5 years ago

Off-by-one error.

Perhaps you put a console.log() in your local copy or perhaps the code in the repo is now one line longer than the copy from npm.

              secret = secret.dnsAuthorization || secret;

secret is null.

Cherry commented 5 years ago

Aha, sorry. I think I've found the issue on that one, and everything is passing now! I've released 1.0.0 😃

@RevolvingDCON Could you go ahead and try again using v1.0.0 when you get a chance?

RevolvingDCON commented 5 years ago

Been following this thread all day. I just ran my tests with multiple configurations / domain types. it failed once out of the 50 tests and that error was due to requesting too many certificates.

Really great work guys! Will be excited to use this in production. This issue is resolved with the expected results.

Cherry commented 5 years ago

Nice, I'm glad to hear it's working well for you. 👍

Let's Encrypt definitely have some rate limits in place, and a max of 300 new orders per account per 3 hours. With DNS, there's also always a chance that things won't propagate in time, but I think it's safe to call this resolved now!

Thanks for your patience @RevolvingDCON and feel free to re-open or create a new issue if you run into anything else!