gitpod-io / gitpod

The developer platform for on-demand cloud development environments to create software faster and more securely.
https://www.gitpod.io
GNU Affero General Public License v3.0

Tailscale DNS does not work #13785

Closed nVitius closed 1 year ago

nVitius commented 1 year ago

Bug description

Following the documentation on https://www.gitpod.io/docs/integrations/tailscale leads to a Tailscale installation that does not work with Tailscale's DNS.

There's some more history on that problem in this issue:

Also an open issue on Tailscale to ignore failures w/ ipv6 setup: https://github.com/tailscale/tailscale/issues/3002

That issue was closed by this PR: https://github.com/gitpod-io/demo-tailscale-with-gitpod/pull/7, which adds some configuration to the Docker image so that ip6tables works. Note that the Tailscale integration docs on the website weren't updated to include this change.

I've added the fix from the PR to my custom Docker image, but the DNS issues seem to persist. This time, /etc/resolv.conf is updated, but any DNS requests (to internal or external services) time out. This error is seen in the tailscaled logs:

dns udp query: context deadline exceeded

Communication to the Tailnet seems to be intact. I can curl an internal service directly with a private IP address. Also, DNS lookups with dig or nslookup work when pointed at a public DNS server while Tailscale is still running (e.g. nslookup google.com 8.8.8.8).
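To compare the resolvers that were written to /etc/resolv.conf with the public server that does answer, it can help to list the configured nameservers first. The following is a minimal sketch, not part of the original report: the function name `parse_nameservers` and the sample addresses in the usage note are hypothetical, and the parser only looks at `nameserver` lines and skips lines that start with `#` or `;`.

```python
from typing import List


def parse_nameservers(resolv_conf_text: str) -> List[str]:
    """Return the addresses on 'nameserver' lines of resolv.conf-style text."""
    servers = []
    for raw_line in resolv_conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        # A nameserver entry has the keyword followed by one address.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers
```

Passing the contents of /etc/resolv.conf to `parse_nameservers` gives the list of resolvers to query one at a time with `nslookup <name> <server>`, the same way 8.8.8.8 was queried above.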

I also tried all the same steps but using the demo from https://github.com/gitpod-io/demo-tailscale-with-gitpod and got the same results.

@DentonGentry Pinging you as you were active on the other relevant issues. Thanks for your work on this so far.

Steps to reproduce

Workspace affected

No response

Expected behavior

Tailscale integration should support DNS.

Example repository

https://github.com/gitpod-io/demo-tailscale-with-gitpod

Anything else?

No response

nVitius commented 1 year ago

For what it's worth, I also tried updating to tailscale 1.30.2 and saw the same issue.

mrzarquon commented 1 year ago

@nVitius it appears this was a Tailscale issue. I ran into it as well, updated to 1.32 (released just after this issue was opened), and it appears to work again:

gitpod /workspace/testing (mrz/add_ie) $ tailscale --version
1.32.0
  tailscale commit: c729f53f8786675f8b32cc1026b990dafed6bb24
  other commit: 2240b50035b36798992a70441b3d1ab3c5b8c7f2
  go version: go1.19.2-ts3fd24dee31
gitpod /workspace/testing (mrz/add_ie) $ tailscale status | grep aws-us-east-2-1
100.77.94.130   aws-us-east-2-1      tagged-devices linux   active; direct 18.116.29.156:41641, tx 12712 rx 13848
gitpod /workspace/testing (mrz/add_ie) $ ping aws-us-east-2-1
PING aws-us-east-2-1.starling-tiyanki.ts.net (100.77.94.130) 56(84) bytes of data.
64 bytes from aws-us-east-2-1.starling-tiyanki.ts.net (100.77.94.130): icmp_seq=1 ttl=255 time=91.6 ms
gitpod /workspace/testing (mrz/add_ie) $ ssh ec2-user@aws-us-east-2-1
Last login: Fri Oct 14 14:03:32 from 100.102.224.231
[ec2-user@ip-10-4-102-17 ~]$ 

nVitius commented 1 year ago

@mrzarquon I tried with 1.32.0 but the DNS issues persist for me. Not sure about SSH, I haven't tried that use-case at all.

mrzarquon commented 1 year ago

@nVitius can you try launching this workspace and see if it works with your Tailscale account?

https://github.com/mrzarquon/gitpod-gpg

You may need to force a rebuild of the repo you're using. If you were using the above example repo, then this url would launch a new workspace after rebuilding the image:

https://gitpod.io/#imagebuild/https://github.com/mrzarquon/gitpod-gpg

nVitius commented 1 year ago

@mrzarquon I tried with a workspace from that repo but wasn't able to get it to work either.

I got some more info from our devops guy on how Tailscale is configured. It looks like we have MagicDNS disabled, but we have a couple of Global Nameservers set up and the "Override local DNS" flag turned on. The global nameservers configured are internal ones hosted on AWS with 10.0.*.* IP addresses.
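Since the configured nameservers sit in a private 10.0.*.* range, one thing worth checking is which kind of network each resolver address belongs to, because a private address is only reachable if the workspace has a route to it. The following is a minimal sketch under stated assumptions: the function name `classify_nameserver` and the example addresses are hypothetical, and 100.64.0.0/10 is used as the range from which Tailscale assigns tailnet addresses.

```python
import ipaddress

# Carrier-grade NAT range that Tailscale uses for tailnet addresses.
TAILSCALE_CGNAT = ipaddress.ip_network("100.64.0.0/10")


def classify_nameserver(address: str) -> str:
    """Label a nameserver address as 'tailnet', 'private', or 'public'."""
    ip = ipaddress.ip_address(address)
    # Check the tailnet range first; it is not treated as private by ipaddress.
    if ip in TAILSCALE_CGNAT:
        return "tailnet"
    # RFC 1918 ranges such as 10.0.0.0/8 need a route to the internal network.
    if ip.is_private:
        return "private"
    return "public"
```

For example, `classify_nameserver("10.0.0.2")` (a made-up address in the range described above) is labeled `private`, while `classify_nameserver("8.8.8.8")` is labeled `public`. This only labels addresses; it does not test whether a resolver actually answers.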

We also have an exit node set up, but I can't try that out on Gitpod. There's an outstanding issue for supporting Tailscale exit nodes: https://github.com/gitpod-io/gitpod/issues/8778

I'm starting to think this might be a Tailscale configuration issue. But then again, all other devices on the tailnet are able to query DNS with no problems (with or without an exit node).

kylos101 commented 1 year ago

:wave: @nVitius I just tried to reproduce this using the steps in this issue, but I was able to ping google.com (it resolved and I got replies).

gitpod /workspace/demo-tailscale-with-gitpod (main) $ hostname
gitpodsampl-demotailsca-5zsu424heig

gitpod /workspace/demo-tailscale-with-gitpod (main) $ tailscale status
100.81.7.42     gitpodsampl-demotailsca-5zsu424heig kyle@        linux   -
100.101.102.103 hello.ts.net         services@    linux   -
100.95.17.71    tailscale-germany    tagged-devices linux   idle; offers exit node
100.108.97.110  tailscale-india      tagged-devices linux   idle; offers exit node
100.82.242.38   tailscale-switzerland tagged-devices linux   idle; offers exit node
100.87.179.24   tailscale-uk         tagged-devices linux   idle; offers exit node
100.117.89.3    tailscale-usa        tagged-devices linux   idle; offers exit node

gitpod /workspace/demo-tailscale-with-gitpod (main) $ ping google.com
PING google.com (108.177.98.113) 56(84) bytes of data.
64 bytes from pj-in-f113.1e100.net (108.177.98.113): icmp_seq=1 ttl=113 time=0.721 ms
64 bytes from pj-in-f113.1e100.net (108.177.98.113): icmp_seq=2 ttl=113 time=0.424 ms

However, I did not have MagicDNS disabled, and I did not have a couple of Global Nameservers set up with the "Override local DNS" flag turned on.

I added those as steps to recreate in the issue description, and will queue this issue for our team to investigate. More updates to follow!

nVitius commented 1 year ago

Thanks, @kylos101

Let me know if there's anything else I can do to help out with the investigation.

I had to set aside the work I was doing with gitpod, but I hope to pick it up again this week. My intention is to play around with our tailscale configuration to see if I can pinpoint what is causing the issue.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.