caddyserver / caddy

Fast and extensible multi-platform HTTP/1-2-3 web server with automatic HTTPS
https://caddyserver.com
Apache License 2.0

Option to disable usage of "ACME Challenge TEMP" certificate to avoid client errors #2622

Closed bdr99 closed 5 years ago

bdr99 commented 5 years ago

1. What would you like to have changed?

I would like to have an option to stop Caddy from using the "ACME Challenge TEMP" certificate that it seems to be using during the process of initialization.

2. Why is this feature a useful, necessary, and/or important addition to this project?

I use Caddy as a reverse proxy to a Nextcloud instance. The Nextcloud desktop client installed on my PC communicates with my Nextcloud server instance through Caddy.

Whenever I restart Caddy, the Nextcloud desktop client gives me the error shown below. I assume this is because Caddy is using a temporary certificate while it is initializing, and replacing it with the actual certificate when it is obtained. It is undesirable to have this error message appear on all linked Nextcloud clients whenever Caddy or the server running Caddy is restarted. Why is Caddy serving an invalid certificate at all? I would appreciate it if Caddy had an option to disable the use of this temporary certificate, in order to prevent error messages at the client side. Thanks!

[Image: caddy2]

3. What alternatives are there, or what are you doing in the meantime to work around the lack of this feature?

The only workaround is to dismiss the Nextcloud certificate error when it appears.

4. Please link to any relevant issues, pull requests, or other discussions.

N/A

mholt commented 5 years ago

This comes from https://github.com/go-acme/lego/blob/3edb75872df36444f95c25f88c6df325de89be11/certcrypto/crypto.go#L244.

It is used to solve the ACME TLS-ALPN challenge, but it should only be served up when the client advertises the ACME-TLS/1 protocol in ALPN. If you have an HTTP request that we can use to reproduce this behavior, that would be helpful.

You can disable this challenge method if you know that the HTTP challenge will work. Or you can configure the DNS challenge.

bdr99 commented 5 years ago

Thanks for your reply. So I suppose it would be an issue with the ALPN performed by the Nextcloud desktop client then. Would it help if I did a packet capture with a tool like Wireshark?

mholt commented 5 years ago

Yeah, if you can capture the HTTP request that the client is making, that will help us know if it's a bug in Caddy or a very very odd behavior in Nextcloud.

bdr99 commented 5 years ago

OK thanks, I'll give that a try and let you know.

bdr99 commented 5 years ago

Hey @mholt, I managed to get a packet capture of Caddy serving the "ACME Challenge TEMP" certificate in response to a request from the Nextcloud Windows client. Hope this helps! The IP addresses have been changed for anonymity, but everything else should be intact. Here it is:

acme_temp.zip

mholt commented 5 years ago

@bdr99 That's very interesting, thanks. The Nextcloud client is not sending any ALPN extension in its ClientHello.

This appears to be happening because when Caddy starts, it defers to lego to obtain certificates before it starts serving. During that time, only lego has the listener open and the only certificate it serves is the ACME challenge cert for the TLS-ALPN challenge. Apparently it does not honor the client's ALPN extension and so it serves up that certificate to everyone. Even if it did, the alternative is to not serve any certificate at all.

So, I think this is all working as designed. Caddy can't serve the site over HTTPS until it gets a certificate to do so.

bdr99 commented 5 years ago

Thanks for looking into this! It's unfortunate that this behavior exists, since it causes this undesirable error message whenever Caddy has to renew the certificate. Is it really the correct behavior for Caddy to serve a non-verifiable certificate to a client that did not have the ALPN extension in its ClientHello? Do you think this is a bug in lego that needs to be fixed on their side?

whitestrake commented 5 years ago

I've been running Nextcloud behind Caddy for quite some time, never actually encountered this issue. Probably by luck and virtue of it only needing to occur for a few seconds every 60+ days. You say it happens every time you restart Caddy - are you preserving your certificates correctly, or are you getting a brand new certificate every time you start the server?

One workaround, if this is causing a lot of issues for you, is to use the -disable-tls-alpn-challenge flag on startup. This causes Caddy to use the HTTP challenge instead.

https://caddyserver.com/docs/cli

mholt commented 5 years ago

> Is it really the correct behavior for Caddy to serve a non-verifiable certificate to a client that did not have the ALPN extension in its ClientHello?

It's lego doing it, not Caddy (once Caddy is running, Caddy takes over the ACME challenges and honors the client's ALPN). Arguably it's not correct behavior, but what should it do instead? Serve no certificate at all?

> Do you think this is a bug in lego that needs to be fixed on their side?

I agree lego should require a proper ALPN value; feel free to open that issue upstream.

> It's unfortunate that this behavior exists, since it causes this undesirable error message whenever Caddy has to renew the certificate.

Like @whitestrake said, this only happens while the server is not yet running and if the certificate does not exist or is expired/expiring. If the server is already running (e.g. SIGUSR1 to do a proper config reload or when gracefully renewing certs in the background), you won't see this behavior.

bdr99 commented 5 years ago

@whitestrake @mholt I should have mentioned this in my comment earlier, but you're correct, I have recently realized that it only happens if I restart Caddy when it has been a while since the certificate has been renewed. It doesn't happen if Caddy uses a cached certificate.

I'll open an issue upstream for lego. Thanks to both of you for the troubleshooting help!

By the way, is there a way to switch to the HTTP challenge only for certain endpoints? I am serving several sites with one Caddyfile and I would like to keep using the TLS challenge for all of them except Nextcloud.

whitestrake commented 5 years ago

> By the way, is there a way to switch to the HTTP challenge only for certain endpoints?

It's an instance-wide flag. There's no site-specific equivalent.

The HTTP challenge is quite serviceable - is there a need for the TLS-ALPN challenge to occur on the other sites?

bdr99 commented 5 years ago

@whitestrake Not really, I was just curious. I'll probably switch to the HTTP challenge for all endpoints if the issue doesn't get fixed in lego.

mholt commented 5 years ago

@bdr99 Even if lego's behavior changes, you'll still get TLS connection errors until the certificate is obtained.

The better way to solve this would be to let Caddy stay running and use signal USR1 to reload the config gracefully; this incurs zero downtime.

mholt commented 5 years ago

> By the way, is there a way to switch to the HTTP challenge only for certain endpoints?

Caddy 2 has this ability.