tailscale / tailscale

The easiest, most secure way to use WireGuard and 2FA.
https://tailscale.com
BSD 3-Clause "New" or "Revised" License

Tailscale fails to maintain local lan access after systemd network services restarted #9413

Open dbullerwell opened 9 months ago

dbullerwell commented 9 months ago

What is the issue?

We use Tailscale on Ubuntu servers at customer sites. To limit the number of firewall rules our clients have to allow, we set up our servers to use a Tailscale exit node but allow local LAN traffic, so they can communicate with each other and with other local devices to run our applications.

Example command:

tailscale up --exit-node=our-exit-node --exit-node-allow-lan-access

This command correctly modifies the IP routes, adding throw routes to table 52 for local LAN traffic based on the network interfaces.

However, we also have automatic updates running in the background, which can occasionally cause the systemd-networkd service to restart. When this happens, the IP routes are destroyed. Tailscale correctly recreates the entries for all devices in the tailnet that it can route to and sets the default route to go through Tailscale, but it does not add back the routes that allow local LAN access. Today this requires manual intervention: restarting the Tailscale systemd service so that it recreates the throw routes in table 52.

My expectation is that, since Tailscale is able to recreate IP route entries dynamically, it would recreate them in a state that matches the one from when the service was initially started.

This has caused quite a few random headaches for our team and customers, as we have had to manually SSH into servers to resolve the issue.

Steps to reproduce

On Ubuntu 20.04 or 22.04

Start Tailscale, routing all traffic to an exit node except local LAN traffic

tailscale up --exit-node=our-exit-node --exit-node-allow-lan-access

View the IP routes created and check for the appropriate throw routes for the local LAN interfaces. I've replaced the real entries with random IP addresses; 192.168.10.0/24 is the local LAN subnet.

ip route list table 52

default dev tailscale0
100.68.1.1 dev tailscale0
100.68.1.2 dev tailscale0
throw 127.0.0.0/8
throw 192.168.10.0/24

Restart the systemd-networkd service, as would happen after an apt-get upgrade

sudo systemctl restart systemd-networkd

View the current ip routes

ip route list table 52

default dev tailscale0
100.68.1.1 dev tailscale0
100.68.1.2 dev tailscale0

See that the entries for local LAN access and loopback are missing, while the default route still sends all traffic through Tailscale to the exit node.

Restart the tailscaled service and see that the entries are recreated

sudo systemctl restart tailscaled

ip route list table 52

default dev tailscale0
100.68.1.1 dev tailscale0
100.68.1.2 dev tailscale0
throw 127.0.0.0/8
throw 192.168.10.0/24
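
Until this is fixed, the manual check and restart described in the steps above could be scripted. The following is only a minimal sketch of a workaround, not part of Tailscale: it assumes the exit-node default route appears in table 52 as `default dev tailscale0` (as in the output above) and that restarting tailscaled is acceptable on the host. The function names `needs_repair` and `repair_if_needed` are hypothetical.

```shell
#!/bin/sh
# Workaround sketch (not part of Tailscale): restart tailscaled when table 52
# still has the exit-node default route but has lost all of its throw routes.

# needs_repair reads `ip route list table 52` output on stdin.
# Returns 0 (repair needed) when a default route via tailscale0 is present
# and no throw route is present; returns 1 otherwise.
needs_repair() {
    routes="$(cat)"
    # No exit-node default route in the table: nothing to repair.
    printf '%s\n' "$routes" | grep -q '^default dev tailscale0' || return 1
    # At least one throw route survives: assume LAN access is intact.
    if printf '%s\n' "$routes" | grep -q '^throw '; then
        return 1
    fi
    return 0
}

# repair_if_needed inspects the live routing table and restarts tailscaled
# only when the throw routes are missing. Must run as root.
repair_if_needed() {
    if ip route list table 52 | needs_repair; then
        systemctl restart tailscaled
    fi
}
```

Calling `repair_if_needed` periodically as root (for example from a cron job or a systemd timer) would automate the manual restart. It only masks the symptom and briefly interrupts Tailscale connectivity each time it triggers.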

Are there any recent changes that introduced the issue?

No response

OS

Linux

OS version

Ubuntu 22.04

Tailscale version

1.48.2

Other software

No response

Bug report

No response

amosbird commented 8 months ago

I'm facing the same issue. When a Docker subnet is created after Tailscale has already been set up with --exit-node-allow-lan-access, the new local subnet is still routed through the Tailscale exit node, which disrupts the Docker network.

Is there a way to always put the following routes in table 52?

throw 192.168.0.0/16
throw 172.16.0.0/12
throw 10.0.0.0/8
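
As a manual stopgap, throw routes like these can be added with iproute2 directly. This is a sketch under assumptions, not a supported Tailscale feature: Tailscale manages table 52 itself and may remove manually added entries when it rewrites the table, so the commands may need to be reapplied. The helper name `throw_route_cmds` is hypothetical; it only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# throw_route_cmds prints one `ip route replace throw ...` command per CIDR
# argument, targeting table 52 (the table shown in this issue).
# `replace` is used instead of `add` so rerunning the commands is harmless.
throw_route_cmds() {
    for cidr in "$@"; do
        printf 'ip route replace throw %s table 52\n' "$cidr"
    done
}

# Example (prints the commands only):
#   throw_route_cmds 192.168.0.0/16 172.16.0.0/12 10.0.0.0/8
# To apply them, pipe the output to a root shell:
#   throw_route_cmds 192.168.0.0/16 172.16.0.0/12 10.0.0.0/8 | sudo sh
```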

amosbird commented 8 months ago

Related: https://github.com/tailscale/tailscale/issues/3646

wsmlby commented 1 week ago

This is a real bug that can be easily reproduced.

https://github.com/tailscale/tailscale/blob/0323dd01b2c2e75b399f83fcac2d8f6fe6cc28da/wgengine/router/router_linux.go#L1236 It looks like the local "throw" rules are not re-established at this point when --exit-node-allow-lan-access is enabled.

Please help fix