caddyserver / caddy

Fast and extensible multi-platform HTTP/1-2-3 web server with automatic HTTPS
https://caddyserver.com
Apache License 2.0
57.4k stars 4k forks source link

caddytls: tailscale cert manager not used as fallback for *.ts.net certs #6327

Closed willnorris closed 4 months ago

willnorris commented 4 months ago

It appears that the tailscale cert manager is only used when a ts.net domain is explicitly configured for a site:

samson.tailb575b.ts.net:443 {
  respond OK
}

This works fine, and shows debug output of:

2024/05/19 01:44:25.371 DEBUG   tls.handshake   no matching certificates and no custom selection logic  {"identifier": "samson.tailb575b.ts.net"}
2024/05/19 01:44:25.371 DEBUG   tls.handshake   no matching certificates and no custom selection logic  {"identifier": "*.tailb575b.ts.net"}
2024/05/19 01:44:25.372 DEBUG   tls.handshake   no matching certificates and no custom selection logic  {"identifier": "*.*.ts.net"}
2024/05/19 01:44:25.372 DEBUG   tls.handshake   no matching certificates and no custom selection logic  {"identifier": "*.*.*.net"}
2024/05/19 01:44:25.372 DEBUG   tls.handshake   no matching certificates and no custom selection logic  {"identifier": "*.*.*.*"}
2024/05/19 01:44:38.957 DEBUG   tls.handshake   using externally-managed certificate    {"remote_ip": "100.120.137.94", "remote_port": "53195", "sni": "samson.tailb575b.ts.net", "names": ["samson.tailb575b.ts.net"], "expiration": "2024/08/17 00:44:38.000"}

However, when configured as:

:443 {
  respond OK
}

I get debug logs:

2024/05/19 01:48:42.921 DEBUG   tls.handshake   no matching certificates and no custom selection logic  {"identifier": "samson.tailb575b.ts.net"}
2024/05/19 01:48:42.921 DEBUG   tls.handshake   no matching certificates and no custom selection logic  {"identifier": "*.tailb575b.ts.net"}
2024/05/19 01:48:42.921 DEBUG   tls.handshake   no matching certificates and no custom selection logic  {"identifier": "*.*.ts.net"}
2024/05/19 01:48:42.921 DEBUG   tls.handshake   no matching certificates and no custom selection logic  {"identifier": "*.*.*.net"}
2024/05/19 01:48:42.921 DEBUG   tls.handshake   no matching certificates and no custom selection logic  {"identifier": "*.*.*.*"}
2024/05/19 01:48:42.921 DEBUG   tls.handshake   no certificate matching TLS ClientHello {"remote_ip": "100.120.137.94", "remote_port": "53237", "server_name": "samson.tailb575b.ts.net", "remote": "100.120.137.94:53237", "identifier": "samson.tailb575b.ts.net", "cipher_suites": [4867, 4866, 4865, 52393, 52392, 52394, 49200, 49196, 49192, 49188, 49172, 49162, 159, 107, 57, 65413, 196, 136, 129, 157, 61, 53, 192, 132, 49199, 49195, 49191, 49187, 49171, 49161, 158, 103, 51, 190, 69, 156, 60, 47, 186, 65, 49169, 49159, 5, 4, 49170, 49160, 22, 10, 255], "cert_cache_fill": 0, "load_or_obtain_if_necessary": true, "on_demand": false}
2024/05/19 01:48:42.922 DEBUG   http.stdlib     http: TLS handshake error from 100.120.137.94:53237: no certificate available for 'samson.tailb575b.ts.net'

Since caddy is already treating ts.net domains special when setting up cert managers, I'm wondering if it could do the same in the fallback handler (or maybe I've got some terminology wrong here?). I'm already looking into this myself, but wanted to open the issue in case there was a reason it behaves this way, or if you have any pointers on where to hook in.

willnorris commented 4 months ago

Okay, I think I have found some clues. From autohttps.go

// tailscale names go in their own automation policies because // they require on-demand TLS to be enabled, which we obviously // can't enable for everything

So configuring on-demand access for the site does seem to work...

{
  debug
  on_demand_tls {
    ask http://localhost:8080/
  }
}

:443 {
  tls {
    on_demand
  }
  respond OK
}

:8080 {
  respond OK
}

Do I have that right? I'm not too worried about requiring a proper hostname for production use, I'm just trying to add to the caddy-tailscale examples that will work without additional config changes

francislavoie commented 4 months ago

I think you need to do this:

https:// {
    tls {
        get_certificate tailscale
    }
    respond OK
}

https://caddyserver.com/docs/caddyfile/directives/tls#certificate-managers

We could probably adjust the docs to mention that this pattern can be used if you need wildcard TS domains :thinking:

willnorris commented 4 months ago

But even that still requires the on_demand_tls global option, right? Otherwise, you get the error:

Error: loading initial config: loading new config: loading http app module: provision http: on-demand TLS cannot be enabled without a permission module to prevent abuse; please refer to documentation for details

So then I'm either specifying get_certificate tailscale or on_demand on each site, so it's roughly equivalent.

francislavoie commented 4 months ago

I'm pretty sure enabling get_certificate should obviate the need for that. Are you testing with latest on master? I thought we adjusted that already :thinking:

willnorris commented 4 months ago

yes, this is with the last master branch... still see that error.

francislavoie commented 4 months ago

Okay - PRs welcome if you want to patch that (to skip the permission module check if get_certificate is used).

Basically a get_certificate module is effectively a permission module that returns a certificate or nil instead of a boolean. So I don't think it makes sense to ever require a permission module when get_certificate is used.

mholt commented 4 months ago

@willnorris Is this patched up now? If so I'll close this and prepare to release 2.8 rc.1.

willnorris commented 4 months ago

Yes, I think so. I just tested again, and while we still have some pending changes on the Tailscale side to merge (for things like proper QUIC support), everything on the Caddy side is done.

mholt commented 4 months ago

Excellent!! Thanks. I'll tag rc1 today.