juanfont / headscale

An open source, self-hosted implementation of the Tailscale control server
BSD 3-Clause "New" or "Revised" License

[Feature] Support tailscale serve #1921

Open teleclimber opened 4 months ago

teleclimber commented 4 months ago

Use case

Tailscale serve is very useful for exposing a server in your tailnet. For those of us who use Tailscale to expose servers either privately with other users or globally using Funnel, this feature is borderline magical. I'd love to see Headscale support it.

Description

A complete description of tailscale serve is here: https://tailscale.com/kb/1242/tailscale-serve

Contribution

How can it be implemented?

Honestly I don't know how much is involved here, but I'm willing to try and have a look.

teleclimber commented 4 months ago

Some clarifications:

$ tailscale serve --bg --http 80 http://localhost:3003

Works as expected. It gives me an HTTP URL of the form http://<my-machine>.<my-username>.<my headscale-domain> that I can punch into my browser, and that gets me a response from the small server I have running locally on :3003.

Where it goes wrong is if I don't include the --http 80, it defaults to https, and that's where the tailscale CLI prints this error:

error enabling https feature: error 404 Not Found: 404 page not found:

So basically it's the https part that I want to try to enable. Any hints on where to start would be greatly appreciated.

teleclimber commented 4 months ago

I spent some time going over the Tailscale client code to see what needs to happen.

Since the serve feature already works for HTTP, the missing piece mostly involves getting and using a TLS certificate for the right domain.

It is clear from the docs and the code that Tailscale fully expects to be involved in provisioning a certificate for that node. See https://tailscale.com/kb/1153/enabling-https

Additional fact: DNS-01 is the only LetsEncrypt challenge that the tailscale client can solve. See this line.

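The following options are ruled out unless Tailscale make changes to their clients:

  • Using a wildcard certificate is not possible. There is currently no way to tell tailscale serve to use that cert, AFAIK.
  • Doing an HTTP-01 challenge, which would be easier to implement than DNS-01, is not possible unless that challenge is implemented on the client side too.

With that out of the way the only path forwards is to have Headscale implement DNS-01. I know of two approaches to this:

  • Make API calls to DNS name servers to set records as needed. Thanks to Caddy server there is precedent and plenty of Go Code for this.
  • Embed something like acme-dns into headscale.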

I'd be interested to know maintainer's thoughts on this at this point. Thanks.

Hypnotist1148 commented 3 months ago

This would also be the first step to have stuff like funnel working!

bentemple commented 1 month ago

I fully support this. Tried to set up a serve for a client yesterday, as I wanted to use a tailscale sidecar to expose a service, but I can't connect to it except through HTTP. It would be so cool if it worked with HTTPS, especially since I'm using a domain name with HSTS, so I have no choice but to connect via IP to use HTTP (I'd much prefer MagicDNS, of course).

ananthb commented 1 month ago

I can pitch in code for this as well @teleclimber. Would love to see this feature on Headscale.

pavanbuzz commented 1 month ago

> I spent some time going over the Tailscale client code to see what needs to happen.
>
> Since the serve feature already works for HTTP, the missing piece mostly involves getting and using a TLS certificate for the right domain.
>
> It is clear from the docs and the code that Tailscale fully expects to be involved in provisioning a certificate for that node. See https://tailscale.com/kb/1153/enabling-https
>
> Additional fact: DNS-01 is the only LetsEncrypt challenge that the tailscale client can solve. See this line.
>
> The following options are ruled out unless Tailscale make changes to their clients:
>
>   • Using a wildcard certificate is not possible. There is currently no way to tell tailscale serve to use that cert, AFAIK.
>   • Doing an HTTP-01 challenge, which would be easier to implement than DNS-01, is not possible unless that challenge is implemented on the client side too.
>
> With that out of the way the only path forwards is to have Headscale implement DNS-01. I know of two approaches to this:
>
>   • Make API calls to DNS name servers to set records as needed. Thanks to Caddy server there is precedent and plenty of Go Code for this.
>   • Embed something like acme-dns into headscale.
>
> I'd be interested to know maintainer's thoughts on this at this point. Thanks.

@teleclimber Tracing it further, tailscale already creates the acme challenge record (key,value). It then calls the control plane to SetDNS at #L472.

If we trace this call, noiseClient sends a POST request to the control server at api endpoint /machine/set-dns with NodeKey and Body SetDNSRequest.

So if we handle this endpoint in Headscale by adding a new TXT record at the corresponding provider (Cloudflare, DigitalOcean, etc.), then tailscale serve would be able to obtain a TLS certificate.

We need a way to let users configure their dns provider. Traefik uses environment variables/secret files to configure. Also they use a library like lego to handle DNS challenge. But in this case, we just need to add a new record to the provider.
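
To make the idea above more concrete, here is a minimal sketch of what handling such a request could look like. It is not Headscale code: the `setDNSRequest` struct only mirrors the fields this sketch assumes the client sends (the authoritative definition is `tailcfg.SetDNSRequest` in the Tailscale source), and `txtSetter`, `memorySetter`, `setDNSHandler` and `postSetDNS` are hypothetical names. Node authentication and the Noise transport are left out entirely.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// setDNSRequest is an assumed, simplified shape of the request body.
type setDNSRequest struct {
	Name  string // e.g. "_acme-challenge.myhost.hs.example.com"
	Type  string // expected to be "TXT" for DNS-01
	Value string // challenge value, already computed by the client
}

// txtSetter is a hypothetical abstraction over the configured DNS backend.
type txtSetter interface {
	SetTXT(name, value string) error
}

// memorySetter is an in-memory stand-in for a real DNS provider.
type memorySetter map[string]string

func (m memorySetter) SetTXT(name, value string) error {
	m[name] = value
	return nil
}

// setDNSHandler decodes the request, rejects anything that is not an ACME
// challenge TXT record, and forwards the rest to the backend.
func setDNSHandler(backend txtSetter) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		var req setDNSRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		if req.Type != "TXT" || !strings.HasPrefix(req.Name, "_acme-challenge.") {
			http.Error(w, "only ACME challenge TXT records are accepted", http.StatusForbidden)
			return
		}
		if err := backend.SetTXT(req.Name, req.Value); err != nil {
			http.Error(w, "dns backend error", http.StatusBadGateway)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}

// postSetDNS sends one request through the handler and returns the status code.
func postSetDNS(h http.Handler, body string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/machine/set-dns", strings.NewReader(body))
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	backend := memorySetter{}
	h := setDNSHandler(backend)
	code := postSetDNS(h, `{"Name":"_acme-challenge.myhost.hs.example.com","Type":"TXT","Value":"abc"}`)
	fmt.Println(code, backend["_acme-challenge.myhost.hs.example.com"]) // 200 abc
}
```

Restricting the handler to `_acme-challenge.` TXT names is a design choice of this sketch: it keeps a node from using the endpoint to write arbitrary records into the zone.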

ananthb commented 1 month ago

@pavanbuzz since the new beta release changes Node Magic DNS names to <node>.<your-domain> instead of <node>.<user>.<your-domain>, we could also solve HTTP-01 or TLS-ALPN-01 challenges.

Users can point *. to their headscale instance via DNS.

The advantage being that we don't need to support upstream DNS APIs.

pavanbuzz commented 1 month ago

> @pavanbuzz since the new beta release changes Node Magic DNS names to <node>.<your-domain> instead of <node>.<user>.<your-domain>, we could also solve HTTP-01 or TLS-ALPN-01 challenges.

@ananthb I understand. I think the recent beta change of Magic DNS names to <node>.<your-domain> makes them similar to what Tailscale offers, <node>.<ts-name>.ts.net. Headscale already handles HTTP-01 or TLS-ALPN-01 challenges to get a TLS certificate for the headscale instance itself.

> Users can point *. to their headscale instance via DNS.
>
> The advantage being that we don't need to support upstream DNS APIs.

It's definitely less maintenance. Does it mean that headscale then stores all the certificates/keys? How will the node that is requesting a TLS certificate get it? This approach is similar to caddy-tailscale (Tailscale plugin for caddy). Please correct me if I am mistaken.

The Tailscale CLI/API requests a TLS certificate on the node where serve/funnel is invoked and saves the certificate/key on that node. To get tailscale serve to work natively with TLS, Headscale should handle the /machine/set-dns endpoint and create a corresponding DNS entry in the authoritative DNS server for that domain. I believe this is how Tailscale accomplishes this feature.

This functionality can be extended to accomplish the Funnel feature, but it requires at least one node in the tailnet that can receive public traffic and do TCP routing based on Server Name Indication (SNI). This way TLS traffic is not decrypted by that public-facing node.

ananthb commented 1 month ago

@pavanbuzz I get it now. With the DNS challenge, the node requesting the cert can fetch it directly from an ACME issuer.

Letting the node handle its own secret material is infinitely better.

ananthb commented 1 month ago

As @teleclimber pointed out earlier, we could embed a DNS server inside the headscale server and make it authoritative for a domain.

pavanbuzz commented 1 month ago

> As @teleclimber pointed out earlier, we could embed a DNS server inside the headscale server and make it authoritative for a domain.

@ananthb That is a good idea!

I am leaning towards using another provider for DNS, for several reasons (including but not limited to),

I found that using lego's Challenge.PreSolve will be able to add a TXT record. Since the ts-node is going to check for propagation and obtaining a certificate, headscale server has to just insert the record and return a response.

Because lego has support for multiple dns providers, it allows users to choose any DNS servers (including custom DNS servers). Values (or file reference) can be passed-in via Environment variables. More info can be found in their docs. We can take a similar approach to Traefik's dnsChallenge (although, headscale will not obtain the certificate, but just add DNS record and clean it up later).

ananthb commented 1 month ago

Leaning on lego for challenge providers sounds promising.

pavanbuzz commented 1 month ago

> I found that using lego's Challenge.PreSolve will be able to add a TXT record. Since the ts-node is going to check for propagation and obtaining a certificate, headscale server has to just insert the record and return a response.

I did a bit more analysis and found it shouldn't be Challenge.PreSolve, but Provider.Present instead. However, there is an issue with using the Present method: in many providers' implementations, it hashes the value before adding the entry at the DNS provider. The Tailscale client already sends a hashed value (it has solved the challenge), so this won't work. There is an issue open in their repo to allow adding the value without hashing. Until that functionality is added, we cannot use lego to achieve this feature.
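
The double-hashing problem described above can be illustrated with a small sketch. It assumes the standard DNS-01 rule from RFC 8555, section 8.4 (the TXT value is the unpadded base64url encoding of the SHA-256 digest of the key authorization). The function name `dns01Value` and the sample key authorization are made up for illustration; nothing here is taken from lego or the Tailscale client.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// dns01Value returns the TXT record value for a DNS-01 challenge:
// base64url (no padding) of SHA-256(keyAuthorization).
func dns01Value(keyAuth string) string {
	sum := sha256.Sum256([]byte(keyAuth))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}

func main() {
	// Illustrative key authorization ("token.thumbprint"); not a real one.
	keyAuth := "example-token.example-thumbprint"

	// What the client is assumed to send: the value is already hashed.
	fromClient := dns01Value(keyAuth)

	// What ends up in DNS if a provider helper hashes its input again.
	hashedAgain := dns01Value(fromClient)

	fmt.Println(len(fromClient))           // 43
	fmt.Println(fromClient == hashedAgain) // false
}
```

Because the CA compares the TXT record against the single-hashed value, a record produced by hashing twice would fail validation, which is why a helper that always hashes its input cannot simply be handed the client's value.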

Libdns seems promising and can do what we require, but it is still WIP. Would it be okay to add that dependency to headscale?

@ananthb Your idea of a DNS server would be a better solution as well. Do you have any suggestions for an embedded DNS server? This would require detailed instructions on how to make headscale an authoritative server for a subdomain.

Any idea how we should proceed?

ananthb commented 1 month ago

It was @teleclimber's idea to embed an authoritative DNS server in headscale. They've even linked to one we can use.

But, the more I think about it, the less this sounds like a good idea. Headscale cannot be hosted in a high availability configuration, so any DNS server hosted by it will suffer from reliability issues.

That Lego issue is moving frustratingly slowly for what should be a straightforward change.

I haven't looked at libdns in depth, but it looks promising.

We should be able to write our own interface similar to the lego one that we are unable to use currently.

I'd strongly recommend against hosting our own DNS. There be dragons.

pavanbuzz commented 1 month ago

@teleclimber I think your idea to use libdns would be the way to go. I can help coding this feature. Kindly let me know if you have started working on this. Would be happy to help.

teleclimber commented 1 month ago

Hi everyone, I'm happy to see some enthusiasm for adding this feature to headscale. Thanks for all the comments.

One concern I have about setting records on a third-party DNS provider like Cloudflare is that many of them do not offer granular control over what the API key allows. On Porkbun (one that I have used) it's all or nothing. If I make an API key and allow it for the domain, that API key can be used to change the A records. Not great. Cloudflare appears to be the same, according to this: https://developers.cloudflare.com/fundamentals/api/reference/permissions/#zone-permissions

Some DNS providers don't even offer any API access to change DNS records. So going with 3rd party nameservers implies that Headscale users may have to change their domain's nameservers to use one that has an API. Hopefully that's not too high a burden.

Of course, if headscale is the authoritative nameserver, then all the same issues apply: a security issue in Headscale could allow someone to change your A record. And of course you have to change the nameserver to your headscale. So it's the same burden in the end.

Between the two options I think setting records on a third party DNS authoritative name server is easier to implement, easier to set up for the user, and likely more reliable. The risks in case of a security issue are about the same.

@pavanbuzz I haven't started working on this yet. If we agree that we should go with 3rd party and something like libdns, I would like to hear from maintainers whether they would accept libdns as a dependency. Also I need to dig into libdns and headscale code a lot more.

pavanbuzz commented 1 month ago

> Hi everyone, I'm happy to see some enthusiasm for adding this feature to headscale. Thanks for all the comments.
>
> One concern I have about setting records on a third party DNS provider like cloudflare etc. is that many of them do not offer granular control over what the API key allows. On Porkbun (one that I have used) it's all or nothing. If I make an API key and allow it for the domain, that API key can be used to change the A records. Not great. Cloudflare appears to be the same, according to this: https://developers.cloudflare.com/fundamentals/api/reference/permissions/#zone-permissions

@teleclimber - Thanks for this info. I didn't know that other providers don't offer granular control. Cloudflare, though, lets you create an API token scoped to a specific resource (domain). Such a token can edit DNS records only for that zone (step-6 gives instruction as to how to select this). I am using this setup currently.

But you are right, it's a security implication that needs to be carefully considered.

> Of course, if headscale is the authoritative nameserver, then all the same issues apply: a security issue in Headscale could allow someone to change your A record. And of course you have to change the nameserver to your headscale. So it's the same burden in the end.

I don't think it will be an issue. We could design a solution similar to joohoi/acme-dns, using a DNS server that is authoritative for a subdomain instead of the main domain. This way, only DNS requests for this subdomain and its own subdomains will be served. This is achieved by creating an NS record for the subdomain at the main DNS provider, along with an A record that points to the DNS server (see the dns-records section of joohoi/acme-dns).

Note - We might not be able to use joohoi/acme-dns. It requires a way to add a CNAME redirection (ACME magic) into the main DNS provider for each challenge (which goes back to the same security issue).

Let me try to explain the above logic with an example. If the main domain is example.com and hs.example.com is the subdomain, this new DNS server becomes authoritative for hs.example.com and *.hs.example.com. If this server is ever compromised, the impact is restricted to hs.example.com and *.hs.example.com.

Note - I think remediation could also be as easy as users logging into their main DNS provider and disabling the NS & A record for this subdomain.
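
The scoping argument above (a delegated server only ever answers for hs.example.com and names beneath it) can be sketched as a small check. This is an illustration only: `relativeToZone` is a hypothetical helper, and the zone and host names are the example values used in this comment.

```go
package main

import (
	"fmt"
	"strings"
)

// relativeToZone reports whether fqdn lies inside zone and, if so, returns
// the name relative to the zone ("@" stands for the zone apex).
func relativeToZone(fqdn, zone string) (string, bool) {
	fqdn = strings.ToLower(strings.TrimSuffix(fqdn, "."))
	zone = strings.ToLower(strings.TrimSuffix(zone, "."))
	if fqdn == zone {
		return "@", true
	}
	suffix := "." + zone
	if strings.HasSuffix(fqdn, suffix) {
		return strings.TrimSuffix(fqdn, suffix), true
	}
	return "", false
}

func main() {
	zone := "hs.example.com"

	// A challenge record for a node inside the delegated zone.
	fmt.Println(relativeToZone("_acme-challenge.myhost.hs.example.com", zone)) // _acme-challenge.myhost true

	// A name in the parent zone is out of scope for the delegated server.
	fmt.Println(relativeToZone("www.example.com", zone)) // prints an empty name and false

	// A look-alike name that merely ends in the same characters is rejected too.
	fmt.Println(relativeToZone("evilhs.example.com", zone)) // prints an empty name and false
}
```

Matching on "." plus the zone, rather than on the bare zone string, is what keeps look-alike names such as evilhs.example.com out of scope.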

There are two main hurdles for users that I can think of.

  1. Users have to change their dns_config.base_domain to a subdomain like hs.example.com. So all the hostnames for magic_dns will become myhost.hs.example.com.
  2. Create a NS & A record (one-time setup) in their main DNS provider.

There are a few things needed to implement this feature - we require a DNS server that provides a mechanism (API/RPC/etc.) to create/update/delete (TXT, CNAME, A, AAAA) records.

> Between the two options I think setting records on a third party DNS authoritative name server is easier to implement, easier to set up for the user, and likely more reliable. The risks in case of a security issue are about the same.

I tried libdns, and it's actually pretty easy. Though the custom DNS server is more secure, it also means more work.

@teleclimber / @ananthb - let me know what you think. Hope I didn't confuse anyone.

ananthb commented 1 month ago

As to the question of DNS zone security, the blast radius is the same whether headscale can manipulate a third-party hosted zone or whether it's hosting the zone.

Self-hosting reliable DNS means at least two servers for failover and a whole other can of worms besides.

My vote is resoundingly for third-party DNS server support.

teleclimber commented 1 month ago

> Let me try to explain the above logic with an example. If the main domain is example.com & hs.example.com is the subdomain. This new DNS server will become an authoritative server for hs.example.com & *.hs.example.com. If this server is ever compromised, impact is restricted only to the hs.example.com and *.hs.example.com.

Yes this is how I was imagining we would do things. I may have been too loose with terminology, using "domain" instead of subdomain and zone. Sorry for the confusion.

> My vote is resoundingly for third-party DNS server support.

Yes I think that's where I'm at as well.

Note that nothing prevents headscale from supporting other options down the line.

mitchellkellett commented 1 month ago

I've been quietly following this in the background. I've previously taken a look at jsiebens/ionscale, and I can see that they are using libdns for their implementation of Serve. Looks like that might be the way to go for now at least.

pavanbuzz commented 1 month ago

If we all agree on libdns for now, should we involve the maintainers? An embedded DNS server (with limited scope: ACME challenges and Funnel DNS responses) can be implemented later.

Sequence diagram with external DNS server using libdns

```mermaid
%%{init: {'sequence': {'rightAngles': true}} }%%
sequenceDiagram
    title TLS certificate flow with Cloudflare/Other DNS Provider
    participant node as ts-node
    participant hs as Headscale Server
    participant le as Let's Encrypt
    participant dns as Cloudflare DNS Server
    node->>+le: AuthorizeOrder with `DNS-01` Challenge
    le-->>-node: Challenge value
    node->>+hs: /machine/set-dns
    hs->>+dns: using libdns to save _acme-challenge.subdomain.example.com
    dns-->>-hs: Saved
    hs-->>-node: OK
    node->>+le: Challenge accepted
    le->>+dns: lookup _acme-challenge.subdomain.example.com
    dns-->>-le: TXT record
    le->>-le: validate response
    node->>+le: get status
    le-->>-node: challenge verified
    node->>+le: CreateOrderCert
    le->>-node: Certificate
```

Sequence diagram with embedded DNS server in Headscale (future implementation - if maintainers are okay)

```mermaid
%%{init: {'sequence': {'rightAngles': true}} }%%
sequenceDiagram
    title TLS certificate flow with Headscale as DNS server
    participant node as ts-node
    participant hs as Headscale Server
    participant le as Let's Encrypt
    node->>+le: AuthorizeOrder with `DNS-01` Challenge
    le-->>-node: Challenge value
    node->>+hs: /machine/set-dns
    hs->>hs: save _acme-challenge.subdomain.hs.example.com
    hs-->>-node: OK
    node->>+le: Challenge accepted
    critical DNS lookup on port 53/other port if redirected using rinetd
        le->>+hs: lookup _acme-challenge.subdomain.hs.example.com
        hs->>hs: fetch _acme-challenge.subdomain.hs.example.com from db
        hs-->>-le: TXT record
    end
    le->>-le: validate response
    node->>+le: get status
    le-->>-node: challenge verified
    node->>+le: CreateOrderCert
    le-->>-node: Certificate
```

teleclimber commented 1 month ago

Nice diagrams @pavanbuzz . You're well ahead of me on this, I haven't had much time to dive in. If you want to take the lead on this I wouldn't be offended.

pavanbuzz commented 1 month ago

> Nice diagrams @pavanbuzz . You're well ahead of me on this, I haven't had much time to dive in. If you want to take the lead on this I wouldn't be offended.

@teleclimber thanks! This is my first experience with Go, so it might take a bit longer, but I will get this up and running.

pavanbuzz commented 1 month ago

@juanfont / @kradalby - Our objective is to incorporate the tailscale serve feature into the Headscale server. To achieve this, @teleclimber has proposed two options detailed below:

> Since the serve feature already works for HTTP, the missing piece mostly involves getting and using a TLS certificate for the right domain.
>
> It is clear from the docs and the code that Tailscale fully expects to be involved in provisioning a certificate for that node. See https://tailscale.com/kb/1153/enabling-https
>
> Additional fact: DNS-01 is the only LetsEncrypt challenge that the tailscale client can solve. See this line.
>
> The following options are ruled out unless Tailscale make changes to their clients:
>
>   • Using a wildcard certificate is not possible. There is currently no way to tell tailscale serve to use that cert, AFAIK.
>   • Doing an HTTP-01 challenge, which would be easier to implement than DNS-01, is not possible unless that challenge is implemented on the client side too.
>
> With that out of the way the only path forwards is to have Headscale implement DNS-01. I know of two approaches to this:
>
>   • Make API calls to DNS name servers to set records as needed. Thanks to Caddy server there is precedent and plenty of Go Code for this.
>   • Embed something like acme-dns into headscale.
>
> I'd be interested to know maintainer's thoughts on this at this point. Thanks.

We believe that leveraging libdns would be the optimal approach, given its compatibility with various external DNS providers such as Cloudflare. This choice also sets the stage for the future integration of the Tailscale Funnel feature.

Corresponding sequence diagrams can be found here https://github.com/juanfont/headscale/issues/1921#issuecomment-2293382400 .

We would like to get your opinion so we can move forward with the implementation.

kradalby commented 1 month ago

I think serve is quite attainable, while funnel is less realistic, but happy for someone to work towards it.

I think the work should be split into dns+serve standalone, and then potentially funnel in the future.

My main concern with all user contributed code is outlined in our contribution guidelines.

I'm positive to someone contributing it, but we will not accept it if we find that it is likely going to cause us a large burden now that we have other things to do. We would eventually aim to get to this ourselves, but not sure when that would be.

Summarised, it needs to be:

pavanbuzz commented 1 month ago

> I think serve is quite attainable, while funnel is less realistic, but happy for someone to work towards it.

I believe I have an idea on how to achieve this, though it would require building a separate funnel ingress server, similar to a DERP server, that users self-host separately. Funnel can be dealt with later, once Serve is implemented.

> I think the work should be split into dns+serve standalone, and then potentially funnel in the future.

I don't understand this part. Do you mean separate PRs for dns+serve standalone?

> I'm positive to someone contributing it, but we will not accept it if we find that it is likely going to cause us a large burden now that we have other things to do. We would eventually aim to get to this ourselves, but not sure when that would be.

I understand the concern and will stick to the contribution guidelines.

> Summarised, it needs to be:
>
>   • Very well tested (integration mostly for this I would assume)

I am not sure how we can test the part where the DNS records are updated, but I think unit tests and integration tests for the other parts are doable.

>   • Any external dependencies need to be vetted.

@kradalby This is where we would like your opinion as well: would the PR be accepted if we use libdns for updating DNS records?

kradalby commented 1 month ago

> I don't understand this part. Do you mean separate PRs for dns+serve standalone?

Do what is needed for serve, and just don't start on funnel. I would be comfortable giving a thumbs up for serve, but not funnel.

> I am not sure how we can test the part where the DNS records are updated. But i think other unit tests & integration tests for other things are doable.

Yes, I think the logic of what and how it is set should be tested, but not necessarily the upstream.
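
One way to test "what and how it is set" without touching an upstream provider is to put the provider behind a small interface and swap in a recording fake. The sketch below is only an assumption about how that could look: `txtProvider`, `recordingProvider` and `publishTXT` are hypothetical names, not libdns or Headscale APIs.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// txtProvider is a hypothetical seam between the logic and the DNS upstream.
type txtProvider interface {
	AppendTXT(zone, name, value string) error
}

// recordingProvider is a test fake: it records calls instead of making them.
type recordingProvider struct {
	log []string
}

func (r *recordingProvider) AppendTXT(zone, name, value string) error {
	r.log = append(r.log, "append "+zone+" "+name+" "+value)
	return nil
}

// publishTXT is the logic under test: it only lets ACME challenge names through.
func publishTXT(p txtProvider, zone, name, value string) error {
	if !strings.HasPrefix(name, "_acme-challenge.") {
		return errors.New("refusing to set a non-challenge record: " + name)
	}
	return p.AppendTXT(zone, name, value)
}

func main() {
	p := &recordingProvider{}
	fmt.Println(publishTXT(p, "hs.example.com", "_acme-challenge.myhost", "v1")) // <nil>
	fmt.Println(publishTXT(p, "hs.example.com", "www", "v2") != nil)             // true
	fmt.Println(p.log)                                                           // [append hs.example.com _acme-challenge.myhost v1]
}
```

A test can then assert on the recorded calls (exactly one append, with the expected zone, name and value), which covers the decision logic while leaving the real provider out of the test run.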

> @kradalby This is where we would like your opinion as well , whether the PR would be accepted if we use libdns for updating DNS records.

libdns looks fine, I think it is the one I looked at last time this came up.

A nice exercise for using libdns would be to replace/add to the current configuration and logic to set up headscale itself with HTTPS, the config is old and yankee and could use some love and nicer configuration.

ananthb commented 1 month ago

Funnel definitely needs more from the community than I think we can ask of it/ourselves for now.

I'm also comfortable pitching in on serve. @pavanbuzz we can work together on this if that works for you.

pavanbuzz commented 1 month ago

> Funnel definitely needs more from the community than I think we can ask of it/ourselves for now.
>
> I'm also comfortable pitching in on serve. @pavanbuzz we can work together on this if that works for you.

That would be great @ananthb, let's have a chat to see how we can split the work and get started!

ananthb commented 1 month ago

My email and matrix links are on my GitHub profile.

imft-debug commented 20 hours ago

I would also like to contribute to this issue, as serving HTTPS servers through an open source headscale server is quite a good feature.

ananthb commented 19 hours ago

@pavanbuzz do you want to get started?