anasfanani / Magisk-Tailscaled

Magisk/KernelSU module for running Tailscale on rooted Android devices. The easiest, most secure way to use WireGuard and 2FA.
https://t.me/systembinsh/158

Headscale - IPv6 / DNS Resolving Issues - Not Able to Authenticate at All #19

Open bioluks opened 6 months ago

bioluks commented 6 months ago

This is a repost of https://github.com/anasfanani/Magisk-Tailscaled/issues/14#issuecomment-2082351202, since I wasn't able to reopen that issue. I'm seeing IPv6 addresses everywhere in the logs. My current location has no IPv6 support, so I'm getting the same errors, with lines like:

2024/04/28 21:41:53 control: bootstrapDNS("derp8b.tailscale.com", "2a03:b0c0:1:d0::ec1:e001") for "headscale.example.com" error: Get "https://derp8b.tailscale.com/bootstrap-dns?q=headscale.example.com": dial tcp [2a03:b0c0:1:d0::ec1:e001]:443: connect: network is unreachable
2024/04/28 21:41:53 [RATELIMIT] format("control: bootstrapDNS(%q, %q) for %q error: %v")
2024/04/28 21:41:55 Received error: fetch control key: Get "https://headscale.example.com/key?v=90": failed to resolve "headscale.example.com": no DNS fallback candidates remain for "headscale.example.com"
2024/04/28 21:42:12 [RATELIMIT] format("monitor: ip rule deleted; failed to parse netlink message: %v") (33 dropped)
2024/04/28 21:42:12 monitor: ip rule deleted; failed to parse netlink message: invalid route message attributes: netlink: attribute 20 is not a uint8; length: 8
2024/04/28 21:42:12 monitor: ip rule deleted; failed to parse netlink message: invalid route message attributes: netlink: attribute 20 is not a uint8; length: 8
2024/04/28 21:42:12 monitor: ip rule deleted; failed to parse netlink message: invalid route message attributes: netlink: attribute 20 is not a uint8; length: 8
2024/04/28 21:42:12 monitor: ip rule deleted; failed to parse netlink message: invalid route message attributes: netlink: attribute 20 is not a uint8; length: 8
2024/04/28 21:42:15 monitor: ip rule deleted; failed to parse netlink message: invalid route message attributes: netlink: attribute 20 is not a uint8; length: 8
2024/04/28 21:42:15 [RATELIMIT] format("monitor: ip rule deleted; failed to parse netlink message: %v")
2024/04/28 21:42:20 control: LoginInteractive -> regen=true
2024/04/28 21:42:20 control: doLogin(regen=true, hasUrl=false)
2024/04/28 21:42:20 [RATELIMIT] format("control: trying bootstrapDNS(%q, %q) for %q ...") (9 dropped)
2024/04/28 21:42:21 control: trying bootstrapDNS("derp1d.tailscale.com", "2604:a880:800:10::7fe:f001") for "headscale.example.com" ...
2024/04/28 21:42:21 [RATELIMIT] format("control: bootstrapDNS(%q, %q) for %q error: %v") (3 dropped)
2024/04/28 21:43:27 control: bootstrapDNS("derp8c.tailscale.com", "2a03:b0c0:1:d0::e1f:4001") for "headscale.example.com" error: Get "https://derp8c.tailscale.com/bootstrap-dns?q=headscale.example.com": dial tcp [2a03:b0c0:1:d0::e1f:4001]:443: connect: network is unreachable
2024/04/28 21:43:27 control: trying bootstrapDNS("derp1d.tailscale.com", "165.22.33.71") for "headscale.example.com" ...
2024/04/28 21:43:27 [RATELIMIT] format("control: trying bootstrapDNS(%q, %q) for %q ...")

headscale.example.com is a placeholder used for privacy, of course. Even with a hosts file entry on Android, nothing seems to solve this issue. The DERP servers used in the 'trying bootstrapDNS' lines return IPv6 addresses first; the IPv4 addresses come later. A picture from the bootstrapDNS request:

[image: bootstrapDNS response]

At first I thought this was purely a Tailscale issue, but since it does not happen on desktop clients, I thought we could come up with a workaround for the Magisk/KSU module. Things that come to mind, which I still have to test:

  1. Reversing the bootstrap DNS results, so the IPv4 addresses would be on top of the list (?)
  2. Temporarily hardcoding the DERP servers with their IPs into the hosts file, systemlessly, until the device is registered as a node on Headscale.
  3. Telling Tailscale with a command-line switch (if one exists, of course) to prefer IPv4 over IPv6.
  4. Using HTTP_PROXY or HTTPS_PROXY until the device is registered as a node. Tailscale has issues from time to time recognizing these environment variables. See the related Reddit post describing the same issue I have.

These are probably not that logical or promising, but they are better than not being able to register the device at all. There is also a related GitHub issue on tailscale/tailscale.
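Idea 1 above can be illustrated with a small shell sketch. It only shows the ordering logic (IPv4 candidates first, IPv6 last) on a plain list of addresses. Nothing in this thread shows that tailscaled offers a hook for reordering its bootstrap DNS candidates, so `ipv4_first` is a hypothetical helper, not a working fix.

```shell
# Sketch only: reorder candidate addresses so IPv4 comes before IPv6.
# Reads one address per line on stdin; any address containing ":" is treated as IPv6.
ipv4_first() {
    awk '
        /:/ { v6[++n6] = $0; next }   # hold IPv6 candidates back
        NF  { print }                 # emit IPv4 candidates immediately
        END { for (i = 1; i <= n6; i++) print v6[i] }
    '
}

# Addresses taken from the log above; prints 165.22.33.71 first, then the IPv6 address.
printf '%s\n' "2604:a880:800:10::7fe:f001" "165.22.33.71" | ipv4_first
```

The relative order inside each address family is preserved, so only the IPv6 entries move to the end.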

anasfanani commented 6 months ago

Have you tried this solution?

https://github.com/anasfanani/Magisk-Tailscaled/issues/14#issuecomment-2030569785

Open Magisk -> Settings -> touch Systemless hosts (this will create the necessary directory structure and the hosts file).

Use an editor of your choice and edit the file /data/adb/modules/hosts/system/etc/hosts
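As an alternative to a manual edit, the entry could be appended from a root shell. This is a minimal sketch: `add_hosts_entry` is a hypothetical helper, and 203.0.113.10 is a documentation-range address standing in for the real IP of the Headscale server.

```shell
# Path created by Magisk's "Systemless hosts" option (as given in this thread).
HOSTS_FILE="${HOSTS_FILE:-/data/adb/modules/hosts/system/etc/hosts}"

# Append "IP HOSTNAME" to the hosts file unless that exact line is already present.
add_hosts_entry() {
    ip="$1"; host="$2"; file="${3:-$HOSTS_FILE}"
    grep -qxF "$ip $host" "$file" 2>/dev/null || printf '%s %s\n' "$ip" "$host" >> "$file"
}

# Example with placeholder values (run as root, then reboot):
# add_hosts_entry 203.0.113.10 headscale.example.com
```

The `grep -qxF` check keeps the helper idempotent, so running it twice does not duplicate the line.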

Also, I have already test-installed a Headscale server. With the default configuration it works perfectly without problems.

Can you tell me about any custom settings you use for Headscale, so I can test it myself?

Don't forget to join my Telegram group for discussion.

bioluks commented 6 months ago

Can confirm @ri-char's solution is the only way for me right now. If the proxy is a VPN one (not a local one, making external connections), only logging in is possible, but not connecting to other devices in the Tailscale network...

Sadly, I couldn't understand the CoreDNS solution provided here. Also, I don't have a resolv.conf on my system, and I need everything to be systemless. The documentation for that workaround could be better if someone else understood it.

EDIT: I hardcoded the IP address of the Headscale instance into the hosts file, and it seems to do the trick without needing the proxy (at first I didn't understand the workaround you quoted). But just like the proxy solution, this only lets me log in. Inbound connections to my phone work as well, but outbound ones do not. So my phone cannot access other peers, but it works the other way around? The phone can also ping IP addresses that are not present on the network.

Also, when running ip a in Termux I can't see my Tailscale subnet. Is that because we use userspace-networking?

There is no special Headscale config I'm using right now; I followed the official docs. The only difference is that I had to use the latest alpha (0.23.0-alpha9 to be specific) because of a private key creation error that some users hit in the latest stable container version.

Planning on trying out your pre-release and reporting back.

anasfanani commented 6 months ago

I tested Headscale with the default configuration. Connecting directly by IP solves the DNS error, but if you want to use a domain name instead of an IP, the solution was already given in https://github.com/anasfanani/Magisk-Tailscaled/issues/14#issuecomment-1987257234

Using /etc/hosts

Use the /etc/hosts file as described here: https://github.com/anasfanani/Magisk-Tailscaled/issues/14#issuecomment-2030569785

Using /etc/resolv.conf or CoreDNS

Add a DNS server to /etc/resolv.conf, or use CoreDNS.

The tailscale binary in this module is built for Linux. Because Android doesn't have /etc/resolv.conf by default, fill /etc/resolv.conf with nameserver 1.1.1.1

Alternatively, you can use the CoreDNS trick. In the future I will add an additional zip file for CoreDNS.

Download the release archive and extract it:

wget https://github.com/coredns/coredns/releases/download/v1.11.1/coredns_1.11.1_linux_arm64.tgz
tar -zxf coredns_1.11.1_linux_arm64.tgz
chmod +x coredns

Refer to https://coredns.io/manual/toc/#configuration. Create a file named Corefile with the following content:

. {
    forward . 1.1.1.1:53
}

Run it with ./coredns

This is only a temporary solution, by the way.

Update:

After testing the exit-node feature on Android, I found that tailscaled reads /etc/resolv.conf for DNS and does not use localhost:53, so the best solution right now is to fill /etc/resolv.conf with DNS servers.

The script below will add /etc/resolv.conf after a reboot.

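su
# Create the module's systemless overlay directory for /etc
mkdir -p /data/adb/modules/magisk-tailscaled/system/etc/
# Write the resolvers; ">" and the trailing newline keep the file clean if this is run twice
printf "nameserver 1.1.1.1\nnameserver 1.0.0.1\n" > /data/adb/modules/magisk-tailscaled/system/etc/resolv.conf

anasfanani commented 6 months ago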

I'm sorry I forgot to answer your question xD

> So my phone cannot access other peers, but it works the other way around? The phone can also ping IP addresses that are not present on the network.
>
> Also when doing ip a in Termux I can't see my Tailscale subnet, is that because we use userspace-networking?

Yes. Because we use userspace-networking, every connection to the Tailscale network must go through the local SOCKS5 proxy on port 1099.

You can test it with curl -x 0.0.0.0:1099 ip_tailscale

In my latest pre-release version I added a SOCKS5 tunnel, so traffic to 100.x.x.x goes through SOCKS5 on port 1099 by default.
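For repeated tests, the proxy setting can be wrapped in a small function. This is only a sketch: `ts_curl` and `TS_PROXY` are hypothetical names, 127.0.0.1 is assumed as the listen address (the command above uses 0.0.0.0), and 100.64.0.2 is a placeholder peer address.

```shell
# Assumption: the module's SOCKS5 proxy listens on 127.0.0.1:1099.
TS_PROXY="${TS_PROXY:-socks5h://127.0.0.1:1099}"

# Run curl through the proxy; all arguments are passed on unchanged.
ts_curl() {
    curl --proxy "$TS_PROXY" --max-time 10 "$@"
}

# Example with a placeholder peer address:
# ts_curl http://100.64.0.2/
```

The `socks5h` scheme asks curl to let the proxy resolve hostnames as well, which may help when the phone's own resolver cannot resolve tailnet names.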

bioluks commented 6 months ago

Thanks for the detailed explanation. I was able to understand it and make it work. The pre-release made outbound connections possible as well. If using third party proxy apps the CPU usage is very high even on a high end phone, but the included hev-socks5 is not heavy on the CPU at all, so I saw no battery drain.

The resolv.conf modification inside the module directory makes total sense, thanks for that as well.

I will keep the issue open even though my problems are solved, since we are relying on hacky workarounds (manual hosts file entries) and it does not work out of the box (for Headscale and probably other open-source Tailscale servers) for now.

anasfanani commented 6 months ago

Thank you for your report about the CPU usage and the lack of battery drain.

Yes, we use hacky workarounds because upstream Tailscale doesn't support running the binary normally on Android devices. The only way is userspace networking mode, with very limited Tailscale capabilities. Maybe that's because they have 2k issues and don't have time to resolve the many issues about running the binary on Android devices.

Since I can't write Go, I can't help with the binary problem. I just hope another Go developer will want to fix the binary problem on Android.

931122 commented 6 months ago

I tried compiling Tailscale with the NDK, and the DNS issue was resolved, but I don't know whether there will be other issues.

anasfanani commented 6 months ago

Great news. I've never tried compiling it myself; I've wanted to, but I haven't had the time.

You may also try these:

https://github.com/termux/termux-packages/issues/10166#issuecomment-1974835672
https://tailscale.com/kb/1207/small-tailscale

931122 commented 6 months ago

I'm using cross-compilation. There are some problems, but I don't know how to solve them.

anasfanani commented 6 months ago

What problems, bro? I am trying to build with the Android SDK. Check my other branch, maybe you can help.

931122 commented 6 months ago

Ping is OK:

64 bytes from 100.0.0.1: icmp_seq=127 ttl=64 time=29.4 ms
64 bytes from 100.0.0.1: icmp_seq=128 ttl=64 time=32.2 ms
64 bytes from 100.0.0.1: icmp_seq=129 ttl=64 time=26.0 ms

But pinging a node behind the route (10.0.0.x) does not work well, and SSH is also unavailable.

anasfanani commented 5 months ago

I have compiled it for Android with the SDK. The new release should fix the DNS issue and other issues.

anasfanani commented 5 months ago

Yes, my ping is also bad when connecting through a route with --accept-routes=true. Is the SSH problem outbound or inbound?

yqs112358 commented 2 months ago

Encountered exactly the same problem.

Solved it as instructed in #14:

  1. Open Magisk -> Settings -> touch Systemless hosts
  2. Add an IP-to-domain entry for my Headscale server to /data/adb/modules/hosts/system/etc/hosts
  3. Reboot
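After the reboot, the entry can be checked from a shell. This is a minimal sketch under the assumption that the systemless hosts module exposes the file at /system/etc/hosts. `hosts_lookup` is a hypothetical helper, and headscale.example.com is the placeholder name used earlier in this thread.

```shell
# Print the address(es) that a hosts file maps to the given hostname.
hosts_lookup() {
    host="$1"; file="${2:-/system/etc/hosts}"
    awk -v h="$host" '
        { sub(/#.*/, "") }                                   # drop comments
        { for (i = 2; i <= NF; i++) if ($i == h) print $1 }  # match any name column
    ' "$file"
}

# Example with the placeholder hostname:
# hosts_lookup headscale.example.com
```

Empty output means the name is not in the file, which would suggest the module is not mounted or the entry was written to a different path.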