mailcow / mailcow-dockerized

mailcow: dockerized - 🐮 + 🐋 = 💕
https://mailcow.email
GNU General Public License v3.0

ACME fails to generate certificate whenever any of the domains is improperly configured #1308

Closed alexandernst closed 6 years ago

alexandernst commented 6 years ago

This shouldn't happen, as one misconfigured domain shouldn't affect the rest of the domains.

andryyy commented 6 years ago

It checks A records; it does not protect against AAAA misconfigurations yet.

This is a duplicate of another issue where we discussed it.

alexandernst commented 6 years ago

It actually fails trying to do the ACME challenge. The domain is properly configured at DNS level, but improperly configured at server level.

Anyway, the issue still stands. Mailcow should be able to generate certificates on a per-domain basis, even if one of the domains is not configured. I think this issue is actually blocking the next issue that I reported (#1309).

Also, what is the issue that you're referring to?

mkuron commented 6 years ago

What do you mean when you say "improperly configured at server level"?

andryyy commented 6 years ago

You should check your IPv6 configuration. That’s the most common mistake. I don’t know what you mean by "server level". Please also provide logs.

We do check for A records and it will skip those domains.
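To illustrate the kind of pre-check being discussed, here is a minimal Python sketch. It is not mailcow's actual code. It treats a name as safe to include only if every address it resolves to, IPv4 or IPv6, belongs to this server, which would also catch a stray AAAA record. The names `lookup_addresses` and `resolves_only_to`, the `resolver` parameter, and the sample addresses are assumptions made up for the example.

```python
import socket


def lookup_addresses(hostname):
    """Return the set of IPv4 and IPv6 addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, 80, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def resolves_only_to(hostname, own_addresses, resolver=lookup_addresses):
    """True if hostname resolves and every resolved address is one of ours.

    `resolver` is injectable so the check can be exercised without real DNS.
    """
    try:
        found = resolver(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return bool(found) and found <= set(own_addresses)


if __name__ == "__main__":
    # Hypothetical server addresses taken from the documentation ranges.
    ours = {"192.0.2.10", "2001:db8::1"}
    for name in ["mail.example.org", "autoconfig.example.org"]:
        print(name, resolves_only_to(name, ours))
```

A name that fails this check would be left out of the certificate request instead of making the whole request fail.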

alexandernst commented 6 years ago

@andryyy This is the last part of the log (the part when ACME fails):

challenge
acme-client: /var/www/acme/ZDfJY9Wq2IJNicpm p3suNdtIEe6mBuv cyeMwMPg0I: created
acme-client: https://acme-v01.api.letsencrypt.org/acme/challenge/iUOuLDIqIJd7nzr31nt6f2 q0 tZ0mX2dlWyDlZDPcE/4330849481: challenge
acme-client: /var/www/acme/AWJONuIVx94yGF4V8iDSPNQ56di2aO1iBXgqyW6tBvw: created
acme-client: https://acme-v01.api.letsencrypt.org/acme/challenge/PPnMPlFJ yICdQK1BodOMlJmteY-ijOdXsYEqrqPbao/4330849504: challenge
acme-client: /var/www/acme/c33iIgRXigBtId-oBtSADbVziM TAEGusjC2I2c9Nv8: created
acme-client: https://acme-v01.api.letsencrypt.org/acme/challenge/4L-X40tFzIUrWZCy-UmkbLqMVgZ29qDpmcLfeSHFBlY/4330849544: challenge
acme-client: /var/www/acme/pte8ocOrPJsgh4ohcFDvHKmJE 7wfUPlCn1J q50Ok: created
acme-client: https://acme-v01.api.letsencrypt.org/acme/challenge/He18Io9mbpPBzDZnQamW-e0pLaATFMeONfmFcbBUGv4/4330849589: challenge
acme-client: /var/www/acme/9H5zLijmIh8M4VsqsHk5bWYFXq3OPVEX2G24c41m t8: created
acme-client: https://acme-v01.api.letsencrypt.org/acme/challenge/un6IOsCvV7kLQa0ImrfjsTcEGOR8HNtDU3dupJikNpQ/4330849647: challenge
acme-client: https://acme-v01.api.letsencrypt.org/acme/challenge/xK9c-Nno8QfMq83mGbWXFV0wQSgrlRtP-jxrCT7SIR4/4330846818: status
acme-client: https://acme-v01.api.letsencrypt.org/acme/challenge/hhWTTTSuRKb31GKV9BE3Tp6ji9iFTtzuO794cy 9erI/4339179787: status
acme-client: https://acme-v01.api.letsencrypt.org/acme/challenge/hhWTTTSuRKb31GKV9BE3Tp6ji9iFTtzuO794cy 9erI/4339179787: bad response
acme-client: transfer buffer: type : http-01 , status : invalid , error : type : urn:acme:error:unauthorized , detail : Invalid response from http://autoconfig.activacionmuscular.com/.well-known/acme-challenge/erCdI2Iqhb-j-tFp9-gjX59XTmdyttTv-J3ojQtKt4s 151.80.35.149 : 503 , status : 403 }, uri : https://acme-v01.api.letsencrypt.org/acme/challenge/hhWTTTSuRKb31GKV9BE3Tp6ji9iFTtzuO794cy 9erI/4339179787 , token : erCdI2Iqhb-j-tFp9-gjX59XTmdyttTv-J3ojQtKt4s , keyAuthorization : erCdI2Iqhb-j-tFp9-gjX59XTmdyttTv-J3ojQtKt4s.x-rroXeBTwvpNPAUb8FBwfd41T0Sy7pOuNPX12V7B1Q , validationRecord : url : http://autoconfig.activacionmuscular.com/.well-known/acme-challenge/erCdI2Iqhb-j-tFp9-gjX59XTmdyttTv-J3ojQtKt4s , hostname : autoconfig.activacionmuscular.com , port : 80 , addressesResolved : 151.80.35.149 , addressUsed : 151.80.35.149 } } (930 bytes)
acme-client: bad exit: netproc(3662): 1

If I understand correctly, the process fails because the HAProxy in my server isn't configured properly to handle autoconfig.activacionmuscular.com (it returns a 503).
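A failure like this one could in principle be detected before contacting the CA by requesting the challenge URL locally. The following is only a sketch under assumptions: it presumes a test token file has already been placed in the ACME webroot, and the names `fetch_status` and `challenge_path_reachable` are invented for the example. A local probe also cannot guarantee that the CA sees the same response from outside the network.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a GET of url, including error statuses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses arrive as exceptions; report their status code.
        return err.code


def challenge_path_reachable(hostname, token, fetch=fetch_status):
    """True if the http-01 challenge URL for hostname answers with 200.

    `fetch` is injectable so the logic can be tested without network access.
    """
    url = f"http://{hostname}/.well-known/acme-challenge/{token}"
    try:
        return fetch(url) == 200
    except OSError:  # DNS failures, refused connections, timeouts
        return False
```

With the reverse proxy answering 503 for the autoconfig name, `challenge_path_reachable` would return False for it, and that name could be dropped from the request.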

mkuron commented 6 years ago

If I understand correctly, the process fails because the HAProxy in my server isn't configured properly to handle autoconfig.activacionmuscular.com (it returns a 503).

Yes, that makes sense. But there is no way for Mailcow to know or check how the reverse proxy is configured. And as long as the acme client and protocol don't support skipping failed domains, there is nothing we can do.

alexandernst commented 6 years ago

@mkuron What you can do is what I suggested in the previous issue I opened. You already run 16 (literally, 16) containers (rspamd, acme, nginx:alpine, fail2ban, phpfpm, memcached:alpine, postfix, clamd, ipv6nat, unbound, redis:alpine, dovecot, dockerapi, mariadb, watchdog, sogo). Why can't you run one more and just handle the entire cert creation domain by domain? If you did so, even if one of the domains failed, you'd still successfully generate the rest. And you'd remove the limit of 49 domains supported by the cert bot.
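The domain-by-domain idea can be sketched in a few lines of Python. This is an illustration of the proposal only, not something mailcow implements: `issue` is a hypothetical callback that would wrap whatever ACME client is in use and raise on failure. The sketch also leaves out the rate-limit and nginx virtual host questions raised in the replies.

```python
def issue_per_domain(domains, issue):
    """Call issue(domain) for each domain; one failure does not stop the rest.

    Returns (issued, failed), where failed maps each domain to its error text.
    """
    issued, failed = [], {}
    for domain in domains:
        try:
            issue(domain)
        except Exception as err:  # keep going whatever the client raised
            failed[domain] = str(err)
        else:
            issued.append(domain)
    return issued, failed


if __name__ == "__main__":
    def fake_issue(domain):
        # Stand-in for a real ACME client call; fails for one name only.
        if domain.startswith("autoconfig."):
            raise RuntimeError("http-01 challenge returned 503")

    print(issue_per_domain(["mail.example.org", "autoconfig.example.org"], fake_issue))
```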

andryyy commented 6 years ago

You still don't understand that these are applications and not virtual machines.

If you want to include it, just be fucking nice and create a PR for this functionality.

mkuron commented 6 years ago

Let's Encrypt has per-IP rate limits. You likely wouldn't even be able to request individual certificates for all your domains.

Also, you would need to create virtual hosts in nginx for each certificate, and I'm not sure whether that can easily be automated.

Besides, this might solve your specific problem, but I'm sure there are other whitelabeling scenarios where it doesn't help. Since you are using a reverse proxy already, I don't see why you don't just generate the certificates there. You don't have to use the acme container.

Feel free to submit a pull request to solve the issue. You don't need a separate container for that, just some changes to some scripts in the existing acme and nginx ones. We can't guarantee it will be merged, but we can certainly discuss it.

alexandernst commented 6 years ago

@andryyy Where have I said they are virtual machines? They are containers. I know what a container is. And I know how it works.

And you keep saying that mailcow can't handle per-domain certs.

And my question is: is there any actual technical reason that prevents mailcow from handling this, or is it just a lack of man-hours? In other words: will one more container (with the adequate code running inside it) fix the problem, or is there a technical reason why this can't be done from "inside" mailcow?

andryyy commented 6 years ago

The IP rate-limit is one of the problems, yes. It is easiest to use reverse proxies and even use multiple source IPs, I guess. Caddy might be a very easy option to handle autodiscover.* names. We could add an option to skip creation of those names completely.

@mkuron Not sure if we should drop autoconfig, what do you think? And thanks for all your great help, this is reeeeally appreciated!!

alexandernst commented 6 years ago

Well, the rate limit is not actually a problem. There is a limit of 20 requests per second on some API endpoints.

But even making a single request for a single (different) domain each minute, the life of each certificate is 90 days, which means that by the time the first domain in my list reaches expiration, I could have requested 60 * 24 * 90 = 129600 certificates for 129600 different domains.
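The figure can be checked with a few lines of Python. This only reproduces the arithmetic of one request per minute over a 90-day certificate lifetime. It says nothing about which rate limits would actually apply.

```python
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24
CERT_LIFETIME_DAYS = 90  # certificate lifetime as stated in the comment

# One request per minute for the whole lifetime of the first certificate.
requests_before_first_expiry = MINUTES_PER_HOUR * HOURS_PER_DAY * CERT_LIFETIME_DAYS
print(requests_before_first_expiry)  # 129600
```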

Read carefully how the rate limit actually applies: https://letsencrypt.org/docs/rate-limits/.

andryyy commented 6 years ago

No, you should recheck that.

On 23.04.2018 at 14:32, Alexander Nestorov notifications@github.com wrote:

Well, the rate limit is not actually a problem. There is a limit of 20 requests per second on some API endpoints.

But even making a single request for a single (different) domain each minute, the life of each certificate is 90 days, which means that by the time the first domain in my list reaches expiration, I could have requested 60 * 24 * 90 = 129600 certificates for 129600 different domains.

Read carefully how the rate limit actually applies: https://letsencrypt.org/docs/rate-limits/.


alexandernst commented 6 years ago

The main limit is Certificates per Registered Domain, (20 per week). A registered domain is, generally speaking, the part of the domain you purchased from your domain name registrar.

20 certificates per domain, per week.

There is no limit on the number of certificates you can request for different domains (apart from, as I said, the limit of 20 requests per second on some of the API endpoints).
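To make the per-registered-domain counting concrete, here is a rough Python sketch. It assumes one certificate per hostname and approximates the registered domain as the last two labels, which is wrong for suffixes such as co.uk; a real tool would consult the Public Suffix List. The function names are invented for the example, and the limit of 20 is the figure quoted above. Other limits are not modelled.

```python
from collections import Counter

CERTS_PER_REGISTERED_DOMAIN_PER_WEEK = 20  # figure quoted in the comment


def naive_registered_domain(hostname):
    """Approximate the registered domain as the last two labels (an assumption)."""
    return ".".join(hostname.lower().rstrip(".").split(".")[-2:])


def domains_over_weekly_limit(hostnames, limit=CERTS_PER_REGISTERED_DOMAIN_PER_WEEK):
    """Map each registered domain to its certificate count if it exceeds limit.

    Assumes one certificate would be requested per hostname within one week.
    """
    counts = Counter(naive_registered_domain(name) for name in hostnames)
    return {domain: n for domain, n in counts.items() if n > limit}
```

Under these assumptions, 21 hostnames under a single registered domain would exceed the weekly figure, while any number of hostnames spread over distinct registered domains would not.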

andryyy commented 6 years ago

You should really check the other issues and the IP/account limits. We discussed it. Using multiple accounts is a way to bypass some problems, but we would still need different IPs per account then.

...and read carefully how actually the rate limit applies. It is not like we can handle 129600 certificates for 129600 domains magically by using a reverse proxy. And please be nice...

mkuron commented 6 years ago

@andryyy, we should keep autoconfig.*. It's used by Thunderbird and a few other clients.

andryyy commented 6 years ago

Hm, but also in the certificate?

mkuron commented 6 years ago

Good point, I never checked whether the requests are made via HTTPS. But since we redirect every HTTP request to HTTPS, we should just stick with it for consistency.