MichaIng / DietPi

Lightweight justice for your single-board computer!
https://dietpi.com/

DietPi 9.3: issues with all existing services going down while using Tailscale as an exit node #7030

Open zappydood opened 3 months ago

zappydood commented 3 months ago

Creating a bug report/issue

Required Information

Additional Information (if applicable)

Steps to reproduce

Execute tailscale up --advertise-routes XXX.XXXX.XXX/XX --accept-routes --advertise-exit-node (order adjusted as preferred by Tailscale), then wait until the services go down.

Expected behaviour

Tailscale should continuously operate in the background, advertising and accepting subnets, and functioning as an exit node. Samba services and SSH access should remain stable and operational. AdGuard Home should continue to function as a DNS server without interruption.

Actual behaviour

Samba Access Drops: Despite being visible in macOS Finder, Samba access stops unexpectedly.
AdGuard Home and SSH Failure: AdGuard Home ceases to function, alongside SSH service failures.
Temporary Resolution: Functionality resumes temporarily after executing tailscale down.
Reliable as Exit Node: Despite these issues, the device reliably functions as an exit node even while everything else goes down.
Frequency: These issues have occurred twice; the first instance took days to appear, the second only hours.
Logs: I uninstalled Tailscale before running dietpi-bugreport, affecting its relevance, as the logs had already been cleared by DietPi’s RAM log system. However, I saved copies of the journalctl and dmesg logs beforehand.

Extra details

Timing of Reports: The dietpi-bugreport was run after the logs were cleared, as I was busy with work.
Log Submission: I have the saved logs available but am unsure how to securely submit them, since they are no longer on the DietPi system. Guidance on securely submitting these logs would be appreciated.
I am open to attempting to replicate the issue if necessary, but my schedule makes that difficult to do immediately. I plan on inspecting the logs I saved tonight and would like to provide those as well if it helps.

MichaIng commented 3 months ago

Please check the service logs to see why they were stopped (or crashed) when you face this issue:

journalctl -u smbd -u nmbd -u adguardhome -u dropbear -u ssh
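
If it helps, you can also narrow the output to the time window in which the services dropped, and check the unit states directly (the time range below is just an example, adjust it to when the issue occurred):

journalctl -u smbd -u nmbd -u adguardhome -u dropbear -u ssh --since "1 hour ago" # limit output to the relevant time window
systemctl status smbd nmbd adguardhome ssh # current state plus the most recent log lines per unit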

In case you have no local access, and SSH drops, you might need to enable persistent system logs:

dietpi-software uninstall 103 # disables RAMlog
mkdir /var/log/journal # enables persistent system/journal logs
reboot # needed to really enable it, since /var/log is still a tmpfs until reboot
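
After the reboot, you can verify that persistent logging is active and also look at logs from previous boots:

journalctl --list-boots # lists earlier boots once persistent journal storage works
journalctl -b -1 -u smbd -u adguardhome # logs of the previous boot for selected units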

Also, did you try to skip the routing options? Probably accepted or advertised routes conflict with each other and/or with this device being an exit node. Loops are theoretically possible if multiple nodes advertise the same routes; I am not sure whether Tailscale has some internal prevention for this. This would actually be a good explanation, since I cannot imagine that Tailscale really affects the status of other services. More likely, incoming traffic aimed to be handled by the device itself is instead routed elsewhere, or the answers are routed elsewhere, similar to when you enable a strict killswitch on a VPN server.
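
E.g. to rule the routes out completely, you could bring Tailscale up without any routing options first (just a sketch; --reset clears previously set options, since tailscale up otherwise expects all earlier flags to be repeated):

tailscale down
tailscale up --reset # plain node: no exit node, no advertised or accepted routes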

zappydood commented 2 months ago

Okay, thank you for your help! I haven't gotten around to doing it again just yet, but I plan on bringing Tailscale up again this evening, so I will use your tips and advice and report back. But I have a question about the setup.

So I set the DietPi as the DNS server in my OpenWrt router, so all devices depend on it, with Unbound running alongside AdGuard Home. So I'll know really quickly if there's an issue. When I was doing this before, I hadn't fully transitioned over to using it as the network's only DNS (it was also using the built-in dnsmasq in OpenWrt), but now I've got it set to only the DietPi for AdGuard Home. Do you have any recommendations on Tailscale settings for AdGuard Home/Unbound to work properly for my local network as well as over the tailnet? I can for sure live without setting it as my exit node, but ideally I would like to be able to utilize that as well, given its 2.5 Gbit ports are far faster than my Apple TV as an exit node. But I'm going to take it one step at a time; I just wanted to know if you have any recommended settings for Tailscale. For context I'll include some information on DietPi and Tailscale, so I hope it's not too many details.

I've read a lot of the Tailscale guides, but so far the DNS options seem to clash with Tailscale's built-in MagicDNS, as things seem to stop working for some devices somewhere at some point when enabling split DNS. I'm probably doing something wrong somewhere. Any tips would be greatly appreciated. I have only two routes being advertised, one by my Apple TV and one from my travel router, and they don't collide with each other: 10.xxx.xxx.xxx/24 & 192.168.xxx.xxx/24. Would it be best to only have one subnet router per subnet? I may have stupidly been using both the DietPi and the Apple TV as subnet routers, and I feel like that was a mistake. Thanks again!

MichaIng commented 2 months ago

I do not have much experience with Tailscale either. So I think you use its internal MagicDNS feature? https://tailscale.com/kb/1054/dns#using-dns-settings-in-the-admin-console

If I understand it right, this is a client/node-side setting to enable hostnames for other Tailscale nodes. I guess, for this to work, the Tailscale daemon itself then functions as, and configures itself as, the local DNS resolver. If you enable this on the DietPi system, it would collide with AdGuard Home, as both try to listen on the same port 53. But on all other nodes, you could enable MagicDNS and configure the DietPi system (its Tailscale IP address) as the global nameserver. I guess this was the plan anyway, to have ad blocking for remote devices?
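
One thing that might be worth checking on the DietPi node itself (not verified on this particular setup, just going by the Tailscale documentation): you can tell Tailscale not to touch the local DNS configuration, so AdGuard Home/Unbound keep handling resolution, and check what is actually listening on port 53:

tailscale up --accept-dns=false # keep Tailscale from overriding the local resolver on this node; repeat your other flags as well
ss -ulpn 'sport = :53' # shows which process currently listens on UDP port 53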

10.xxx.xxx.xxx/24 & 192.168.xxx.xxx/24. Would it be best to only have one subnet router per subnet? I may have stupidly been using both the DietPi and the Apple TV as subnet routers, and I feel like that was a mistake. Thanks again!

As long as there is only one route for each subnet, i.e. one for 10.xxx.xxx.0/24 and one for 192.168.xxx.0/24, this should all be fine. But two subnet routers for the same subnet would likely be an issue, as other nodes would not know which of the two routes to use, probably sending requests randomly to one or the other. Or requests are routed in an endless cycle between the two routers, as each follows the route distributed by the other one? 😄 I am not sure whether Tailscale prevents this at the client/node side, by accepting and configuring only one of the two routes based on some priority system.
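
On a Linux node you can check which routes were actually installed (as far as I know, tailscaled uses its own routing table 52 on Linux), to see where traffic for those subnets really goes:

tailscale status # quick overview of this node and its peers
ip route show table 52 # routes installed by tailscaled on this node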

In any case, for debugging it makes sense to test everything step by step, i.e. disable all subnet routes at first, then enable them one by one, and see whether the DietPi system can still be reached from within the LAN and the Tailscale network.
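
As a rough sequence (just a sketch, adjust the subnet to your actual one):

tailscale up --reset --advertise-exit-node # step 1: exit node only, no subnet routes
# test SSH, Samba and AdGuard Home from within the LAN and from another Tailscale node
tailscale up --advertise-exit-node --accept-routes # step 2: additionally accept routes from other nodes, re-test
tailscale up --advertise-exit-node --accept-routes --advertise-routes=192.168.xxx.0/24 # step 3: advertise this node's own subnet last, re-test again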