zerotier / ZeroTierOne

A Smart Ethernet Switch for Earth
https://zerotier.com

[Feature Request] my.zerotier.com ability to set route metric to static routes #750

Open leleobhz opened 6 years ago

leleobhz commented 6 years ago

Hello folks!

Today in my.zerotier.com, we are able to set up static routes on the managed routes screen, but that screen does not allow setting a route metric. As a result, if I connect, for example, to a ZT network that has a route to the network I'm already joined to, the route will not work, because the ZT route usually has a lower metric than the local interface.

Can you please add this feature?

Thanks!

glimberg commented 6 years ago

I'm not quite sure what you're asking here.

Do you have a physical network and a ZeroTier network where the IP address ranges overlap? Because if that's the case, then that's a good way to have problems. If this is the case, your best bet is to make sure that your IP address ranges don't overlap.

Additionally, not all platforms allow setting route metrics (iOS and Android for instance).

novirium commented 6 years ago

I think this is the same as the issue I'm having. In my situation, I've got a ZT network being used to connect several nodes, including roaming laptops, and have a gateway node set up in the office with a managed route to direct traffic from the ZT subnet to the office subnet. This works great until a laptop (running Linux here) connects to the office network directly, and the managed route added by ZeroTier has higher priority than the direct LAN one. In this situation, internet access is via the LAN subnet, which the ZeroTier route is overriding, resulting in a complete loss of internet connectivity (except to other hosts on the ZT subnet, which still work).

Running ip route shows that the managed route added by ZeroTier has no metric, and so is defaulting to the highest priority. Manually changing this to a higher metric solves the problem.

I'm not sure what the best approach for this situation is, but interestingly it doesn't seem to be a problem on Android.

glimberg commented 6 years ago

@novirium Well that's a weird one. ZeroTier definitely should be setting a metric such that the ZeroTier route is a lower priority than the physical network. What Linux distribution & version of the distro are you using?

And no, it wouldn't be a problem on Android. Android & iOS are entirely different beasts.

laduke commented 6 years ago

Try making the ZeroTier managed route less specific than the LAN's.

novirium commented 6 years ago

I'm not seeing any metric on the added route on either Arch Linux or Debian Stretch (it's probably worth a separate issue, but on the Stretch host the local subnet route is via a bridge, and ZeroTier is removing that route and replacing it entirely). Both are running ZeroTier One 1.2.8.

For clarity, the local subnet is 10.0.3.0/24, and the zerotier network is 10.5.5.0/24. The gateway node (10.5.5.51) is connected to both, and I have the managed route 10.0.3.0/24 : 10.5.5.51 added in Zerotier Central.

On the Arch host, ip route before connecting to the ZT network:

default via 10.0.3.1 dev enp0s31f6 proto dhcp src 10.0.3.171 metric 202 
10.0.3.0/24 dev enp0s31f6 proto dhcp scope link src 10.0.3.171 metric 202 

And after connecting to the ZT network:

default via 10.0.3.1 dev enp0s31f6 proto dhcp src 10.0.3.171 metric 202 
10.0.3.0/24 via 10.5.5.51 dev zt1 
10.0.3.0/24 dev enp0s31f6 proto dhcp scope link src 10.0.3.171 metric 202 
10.5.5.0/24 dev zt1 scope link

On the Debian Stretch host (a stock ProxMox install), before connecting to the ZT network:

default via 10.0.3.1 dev vmbr0 onlink
10.0.3.0/24 dev vmbr0 proto kernel scope link src 10.0.3.50

And after connecting to the ZT network:

default via 10.0.3.1 dev vmbr0 onlink
10.0.3.0/24 via 10.5.5.51 dev ztuku45mat
10.5.5.0/24 dev ztuku45mat scope link
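
To make the Arch output above easier to reason about, here is a rough Python model of how a route gets picked. It assumes the usual Linux behaviour (longest matching prefix first, then lowest metric, with a route shown without a metric counting as metric 0) and ignores policy routing. The `pick_route` helper and the 5000 value are made up for illustration; this is not ZeroTier code.

```python
import ipaddress

# Routes from the Arch host after joining the ZT network:
# (destination prefix, how it is reached, metric). No metric shown means 0.
ROUTES = [
    ("0.0.0.0/0", "via 10.0.3.1 dev enp0s31f6", 202),
    ("10.0.3.0/24", "via 10.5.5.51 dev zt1", 0),
    ("10.0.3.0/24", "dev enp0s31f6", 202),
    ("10.5.5.0/24", "dev zt1", 0),
]

def pick_route(dest, routes):
    """Longest prefix wins; among equal prefixes, the lowest metric wins."""
    addr = ipaddress.ip_address(dest)
    candidates = [
        (ipaddress.ip_network(prefix), how, metric)
        for prefix, how, metric in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    return max(candidates, key=lambda r: (r[0].prefixlen, -r[2]))

# As observed: LAN traffic is pulled onto the ZT gateway.
print(pick_route("10.0.3.1", ROUTES)[1])   # via 10.5.5.51 dev zt1

# Same table, with a (hypothetical) high metric on the ZT managed route.
raised = [(p, h, 5000 if h.startswith("via 10.5.5.51") else m)
          for p, h, m in ROUTES]
print(pick_route("10.0.3.1", raised)[1])   # dev enp0s31f6
```

Under these assumptions the direct LAN route wins again as soon as the managed route's metric is above 202, which matches the manual fix described earlier in the thread.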

laduke commented 6 years ago

Hi @novirium Is a zt adapter part of the vmbr0 bridge? In that case, sometimes you want to set allowManaged to false for that zerotier network.
zerotier-cli set $NETWORKID allowManaged=0

Does changing the managed route to 10.0.3.0/23 : 10.5.5.51 work around the metric issue?

novirium commented 6 years ago

Sorry @laduke , I'd missed your meaning before. Yes, specifying a less specific route does work and is a fairly elegant workaround, thanks.

I suspect the bridge thing is a separate issue, as the ZT adapter isn't part of the bridge, and with the different managed route now (10.0.3.0/23), the route isn't being added at all (no change at all in the routes when joining the ZT network).
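
A side note on the /23 above, as a small check with Python's standard `ipaddress` module: 10.0.3.0/23 is not an aligned prefix, because the third octet is odd, so the network it actually describes is 10.0.2.0/23. Whether that has anything to do with the route not appearing on the Stretch host is not established anywhere in this thread; it is only something to rule out.

```python
import ipaddress

lan = ipaddress.ip_network("10.0.3.0/24")

# With the default strict=True this string raises ValueError because host
# bits are set; strict=False masks them off and returns the real network.
managed = ipaddress.ip_network("10.0.3.0/23", strict=False)

print(managed)                            # 10.0.2.0/23
print(lan.subnet_of(managed))             # True
print(managed.prefixlen < lan.prefixlen)  # True
```

So the wider route still covers the LAN and is less specific than it, which is why the workaround lets the direct /24 route win.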

That aside, should zerotier one be adding a metric to its managed routes?

leleobhz commented 6 years ago

Hello folks!

About prefix engineering: it isn't an option for me. It does not make sense with the following routes being installed:

192.168.0.0/24 via 10.0.0.1
192.168.1.0/24 via 10.1.0.1

This is an example of two subnets of the same network. If each one has a different route from a ZT network, using a /23 route may cause one of them to be routed wrongly.
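
The overlap can be checked with a short Python sketch using only the standard `ipaddress` module. The `swallowed_by_widening` helper is made up for illustration; the prefixes and gateways are the ones listed above.

```python
import ipaddress

routes = {
    ipaddress.ip_network("192.168.0.0/24"): "10.0.0.1",
    ipaddress.ip_network("192.168.1.0/24"): "10.1.0.1",
}

def swallowed_by_widening(net, table):
    """Return the other prefixes in the table covered once `net` is
    widened by one bit (for example /24 -> /23)."""
    wider = net.supernet()
    return wider, [o for o in table if o != net and o.subnet_of(wider)]

for net, gateway in routes.items():
    wider, others = swallowed_by_widening(net, routes)
    print(f"{net} via {gateway}: widened to {wider}, "
          f"also covers {[str(o) for o in others]}")
```

Both /24s widen to the same 192.168.0.0/23, so whichever route is made less specific ends up covering the neighbouring network as well.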

About metric: Let's take DHCP example from Linux:

# ip r l | grep dhcp
default via 192.168.43.1 dev wlp1s0 proto dhcp metric 600 

DHCP clients can set a routing metric to prefer one link over another (preferring wired Ethernet over 802.11, and both over LTE, which can be always on). So what would stop metrics from being set? I can see a missing OS API being a reason, but I don't believe every modern OS is unable to set it.

So route engineering does not help in my case: I need fallback routes with equal prefixes, and they have to be equal because the adjacent prefix belongs to another network.

SimmyD commented 5 years ago

Hi All,

Did a solution come out of this? I have exactly the same problem with my Raspberry Pi.

Home network - 192.168.87.0/24
ZT network - 192.168.28.0/24

I have a route in ZT for 192.168.87.0/24 pointing to 192.168.28.250 (a Linux server acting as an IP router on 192.168.87.0/24).

Once the Raspberry Pi is connected to both the home network and ZT, devices on the home network can only reach it via its ZT address, not its home network address.

mhandugan commented 5 years ago

Same issue here. The zt metric is higher than wired Ethernet, but lower than Wi-Fi. So, with a route to my home LAN on zt, I cannot access Wi-Fi interfaces on-LAN.

Metric for Linux eth0: 202
Metric for zt: 204
Metric for wlan0: 303

SYN/ACKs to a local IP go bouncing through zt and fail to make it through the stateful firewall (not sure why they don't make it eventually).

I think I just need to go unmanaged for all hosts on the LAN, and only use zt ip/routes for iOS/Android.

bjeanes commented 5 years ago

Also same issue. It does not result in a loss of connectivity for me, but it does make all traffic from ZT network members which are also on the physical network route their traffic to the ZT gateway, which also routes to itself but eventually forwards on the packet. So for me, this just results in a higher perceived local network latency.

$ ip addr show wlp2s0
2: wlp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 9c:b6:d0:98:63:73 brd ff:ff:ff:ff:ff:ff
    inet 10.10.10.143/24 brd 10.10.10.255 scope global dynamic noprefixroute wlp2s0
       valid_lft 53214sec preferred_lft 53214sec
    inet6 fe80::94a5:e05f:907:bb76/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
$ ip route show
default via 10.10.10.254 dev wlp2s0 proto dhcp metric 600 
10.10.10.0/24 via 10.144.119.0 dev ztks5srcp7 
10.10.10.0/24 dev wlp2s0 proto kernel scope link src 10.10.10.143 metric 600 
10.144.0.0/16 dev ztks5srcp7 proto kernel scope link src 10.144.17.130 linkdown 
$ tracepath -b 10.10.10.10
 1?: [LOCALHOST]                      pmtu 1500
 1:  tumtum (10.144.119.0)                                 4.509ms 
 1:  tumtum (10.144.119.0)                                 3.242ms 
 2:  _gateway (10.10.10.254)                               4.975ms 
 3:  10-10-10-10.local (10.10.10.10)                       4.082ms reached
     Resume: pmtu 1500 hops 3 back 2 

As you can see, the 10.10.10.0/24 via 10.144.119.0 dev ztks5srcp7 route does not have a metric set and seems to sit at a higher precedence than the 10.10.10.0/24 dev wlp2s0 route.

novirium commented 5 years ago

Got curious about this again this morning, and correct me if I'm wrong (I'm by no means familiar with Linux kernel programming), but it looks like no metric is being passed at all on Linux when a route is being added. I've attempted to follow the logic through below:

A generic, default metric of 5000 is set here, and passed in to EthernetTap, which hands it out to whatever OS implementation is used. As far as I can tell, Mac and others like BSD run a command to bring the interface up, which uses the "metric" value passed down from EthernetTap when creating the interface.

On Linux though, the metric needs to be set when routes are added (there's not a simple way to set a default metric on an interface - you can't do it via ifconfig or the RTNetlink system that is used by ZT on Linux for interface creation). The command that's run to add routes on Linux is defined here, and currently doesn't have any support for setting the route metric.

Tentatively, a "metric" string could simply be added on the end of the commands run here, but at the moment the metric value isn't passed down to the ManagedRoute instance, and so some other structure would also need to be added to enable it to be set as a value on a ManagedRoute. I imagine this would have complications with similar per-route support on other OSes, so maybe instead for now the metric value could be directly pulled from the n.tap.metric when the ManagedRoute is created here? That way the metric set in EthernetTap is still the source of truth, and treated as a per-interface value.
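
As a rough illustration of that idea (in Python rather than ZeroTier's C++, and not taken from the ZeroTier source), appending a metric to the route-add command could look something like the sketch below. The `ip_route_add_args` helper and its parameters are hypothetical; only the `ip route add ... via ... dev ... metric N` syntax is standard iproute2.

```python
def ip_route_add_args(target, via, device, metric=None):
    """Build the argument list for an `ip route add` call.

    `metric` stands in for the per-interface value discussed above; when it
    is None, no metric is appended and the kernel default applies.
    """
    args = ["ip", "route", "add", target]
    if via is not None:
        args += ["via", via]
    args += ["dev", device]
    if metric is not None:
        args += ["metric", str(metric)]
    return args

# Managed route from earlier in the thread, using the default metric of 5000.
print(" ".join(ip_route_add_args("10.0.3.0/24", "10.5.5.51", "zt1", 5000)))
# ip route add 10.0.3.0/24 via 10.5.5.51 dev zt1 metric 5000
```

Keeping the metric as an optional argument would leave the current behaviour untouched wherever no value is passed down.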

It looks like this would require changes touching a few places down through the source, and I'm by no means familiar enough to be able to implement this personally. Being able to set a high metric would be an incredibly useful feature on Linux though, and solve a lot of weird little issues people have had using ZT as a way to provide access to local networks with portable devices.

TheBestPessimist commented 4 years ago

Bumping this as I suffer from the same problem: two laptops connected to the same wireless network, both running ZeroTier. When I transfer a 1 GB file, the speed is very slow (that of my ISP). When I disable ZeroTier on either laptop, the transfer speed goes back to normal LAN (wireless) speed.

ihipop commented 4 years ago

Same issue. I think a route metric for static routes is a better solution.

I installed ZeroTier One on my UBNT router, which has a 192.168.1.1/24 subnet on eth1 at home; the device also has the ZeroTier private address 10.10.10.10. I added a static route 192.168.1.1/24 via 10.10.10.10 on my.zerotier.com so that my Android device can reach the 192.168.1.1/24 subnet on eth1 at home. But when I run ZeroTier One on the UBNT router, it adds the route 192.168.1.1/24 via 10.10.10.10 there too, which invalidates the router's own 192.168.1.1/24 route on eth1, so LAN devices can no longer reach the internet through the UBNT router.

I think a route metric for static routes on my.zerotier.com is a better solution.

pulento commented 3 years ago

Got curious about this again this morning, and correct me if I'm wrong (I'm by no means familiar with Linux kernel programming), but it looks like no metric is being passed at all on Linux when a route is being added. I've attempted to follow the logic through below:

A generic, default metric of 5000 is set here, and passed in to EthernetTap, which hands it out to whatever OS implementation is used. As far as I can tell, Mac and others like BSD run a command to bring the interface up, which uses the "metric" value passed down from EthernetTap when creating the interface.

On Linux though, the metric needs to be set when routes are added (there's not a simple way to set a default metric on an interface - you can't do it via ifconfig or the RTNetlink system that is used by ZT on Linux for interface creation). The command that's run to add routes on Linux is defined here, and currently doesn't have any support for setting the route metric.

Tentatively, a "metric" string could simply be added on the end of the commands run here, but at the moment the metric value isn't passed down to the ManagedRoute instance, and so some other structure would also need to be added to enable it to be set as a value on a ManagedRoute. I imagine this would have complications with similar per-route support on other OSes, so maybe instead for now the metric value could be directly pulled from the n.tap.metric when the ManagedRoute is created here? That way the metric set in EthernetTap is still the source of truth, and treated as a per-interface value.

It looks like this would require changes touching a few places down through the source, and I'm by no means familiar enough to be able to implement this personally. Being able to set a high metric would be an incredibly useful feature on Linux though, and solve a lot of weird little issues people have had using ZT as a way to provide access to local networks with portable devices.

So basically it is not only a management interface feature but also a "client" feature.

I think this feature is a must IMHO 😃

mikesellt commented 3 years ago

I have the same issue. 10.0.0.0/24 is my home LAN. I have a Raspberry Pi acting as a relay node (connects to ZT and is the middleman between the ZT network and my home network clients so I don't have to install ZT on each client at my house). I have the route in my.zerotier.com configured as 10.0.0.0/23 to try and make it less specific than the actual LAN, and it points to the ZT IP address of the Raspberry Pi. My home router has a route for the ZT network pointed to the LAN IP of the Raspberry Pi.

This all works great. Everything can talk to each other as expected, except that when the ZT network is connected on my Android client (I haven't tested other clients), the traffic all comes from the ZT address on the Android client. I have confirmed this via traceroutes as well as logs from my local DNS server. It isn't a huge deal as the communication still technically works, but I have automations in Home Assistant and on my Mikrotik Router based on the clients' LAN IP address. I'd rather not have to duplicate all of my scripts to function with EITHER IP address.

I would just manually turn ZT off when I get home, and that's fine for me, but this setup is mainly for my kids' cell phones. I'd like to use ZT to get my kids' phones back to my home DNS server (AdGuard) to utilize ad blocking and content filtering. Part of the reason this would work is that they don't know how to disconnect from or connect to ZeroTier, and I can hide or password-protect the app.

Anyway, I hope some progress can be made on this to add routing metrics manually or at least set the ZT network by default to a higher metric.

p.s. I have also tried this same setup with Tailscale, and I have the same problem with that service.

royolsen commented 3 years ago

Bumping this. I don't always want ZeroTier's managed routes to take precedence over direct connections. The current route metric of 0 seen on my Linux clients will sometimes get in the way.

I would suggest having a higher default metric for managed routes as well as creating the ability to specify a custom metric.

palonsoro commented 3 years ago

+1 to having this feature

dnr commented 3 years ago

This seems like a pretty big missing feature. I want to use a managed route to let one node on my zt network act as a gateway for other zt nodes to get to the gateway's local network, which seems like a very common use case. I'm currently using the "slightly larger subnet" trick, but it's kind of gross. This is exactly what metrics are for.

bwolther commented 3 years ago

I have this same problem (of a ZT route overriding a local LAN route). Being able to specify a metric for each ZT route would allow me to prevent the undesirable override of the LAN route. The node on which I saw the problem was a Linux Centos 7 (x86_64) host, whose existing LAN route had a metric of 100. ZT put in a route to the same subnet, with a metric of 0 (zero), forcing traffic through the ZT network interface. +1 for implementing route metrics. (I'm using the "route ZT to a less specific subnet" solution.)

rjsocha commented 3 years ago

I have (had) the same issue with the roaming laptop. I solved this via this hack: https://github.com/zerotier/ZeroTierOne/pull/1455

This allows me to set a metric for ZT routes (Linux only, tested only on Ubuntu 20.04).

laduke commented 2 years ago

Please try this branch ☝️ , if you're a linux user that likes to compile stuff.

I think just making the via's have a high metric will solve this problem for most people. A better default at least. Then we can circle back on configurable metrics someday.

If you think this will break something, let me know what it is.

We can do a similar thing on Windows. Mac and BSD don't have route metrics, only interface metrics. So that'll be something to figure out.

Here's some example output from a linux machine

[ZeroTierOne]$ ip route
192.168.82.0/24 via 10.147.17.1 dev zt5u4uptmb proto static 
[ZeroTierOne]$ ping 192.168.82.102
PING 192.168.82.102 (192.168.82.102) 56(84) bytes of data.
64 bytes from 192.168.82.102: icmp_seq=1 ttl=63 time=2.26 ms
64 bytes from 192.168.82.102: icmp_seq=2 ttl=63 time=2.16 ms
64 bytes from 192.168.82.102: icmp_seq=3 ttl=63 time=2.24 ms
^C
--- 192.168.82.102 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 2.161/2.221/2.261/0.043 ms
[ZeroTierOne]$ traceroute 192.168.82.102
traceroute to 192.168.82.102 (192.168.82.102), 30 hops max, 60 byte packets
 1  10.147.17.1 (10.147.17.1)  1.910 ms  2.993 ms  3.040 ms
 2  birch.home.arpa (192.168.82.102)  3.085 ms  3.338 ms  3.346 ms

[ZeroTierOne]$ # patched
[ZeroTierOne]$ ip route
192.168.82.0/24 via 10.147.17.1 dev zt5u4uptmb proto static metric 9999 
[ZeroTierOne]$ ping 192.168.82.102
PING 192.168.82.102 (192.168.82.102) 56(84) bytes of data.
64 bytes from 192.168.82.102: icmp_seq=1 ttl=64 time=0.402 ms
64 bytes from 192.168.82.102: icmp_seq=2 ttl=64 time=0.418 ms
^C
--- 192.168.82.102 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1075ms
rtt min/avg/max/mdev = 0.402/0.410/0.418/0.008 ms
[ZeroTierOne]$ traceroute 192.168.82.102
traceroute to 192.168.82.102 (192.168.82.102), 30 hops max, 60 byte packets
 1  birch.home.arpa (192.168.82.102)  0.373 ms  0.275 ms  0.236 ms

actually, on macos, zerotier interfaces are getting a high metric/low priority (5000).

on linux, my physical routes are either 0 or 1024, depending on distro.

In what cases would you want your zerotier interfaces or routes to have a higher priority than a conflicting physical network?

ihipop commented 2 years ago

@laduke

actually, on macos, zerotier interfaces are getting a high metric/low priority (5000).

on linux, my physical routes are either 0 or 1024, depending on distro.

In what cases would you want your zerotier interfaces or routes to have a higher priority than a conflicting physical network?

This is the case: https://github.com/zerotier/ZeroTierOne/issues/750#issuecomment-570445251. The static route added by ZeroTier has a high priority, which makes the 192.168.1.1/24 route via eth0@LAN invalid. We want to customize the metric so that the physical route has the higher priority, or so that we can determine which route is used through a proper metric configuration.

alteriks commented 2 years ago

@laduke I've just compiled ZeroTier using your commit 5f8bd68. After connecting to my network, all static routes got metric 5000. Finally I can roam with my laptop between the LAN and other networks without any issue. Thanks!

laduke commented 2 years ago

Thanks for testing!

mikesellt commented 2 years ago

Would (or could?) this fix also be applied to the Android mobile app? The main clients I have are my family members and they only use phones. It would be nice if the physical LAN network was priority over the ZT network when they connect to LAN. Glad to see progress on this! Thanks.

Crest commented 2 years ago

Try making the ZeroTier managed route less specific than the LAN's.

If their address plan allows it this heinous workaround should do the trick.

rjsocha commented 2 years ago

Thanks for this improvement!

Qubitium commented 2 years ago

Unfortunately, the metric 5000 fix helps some users and breaks things for others. Here is my counterpoint in favor of using a lower metric than local Wi-Fi/Ethernet: packet leaks onto a hostile network, such as one behind China's GFW. In a hostile environment, all packets should be routed through ZeroTier. In this case security comes first, not local LAN interoperability. In my case ZT is acting as a full tunnel (VPN).

In merged commit, linux zt route metrics have been fixed to 5000 which allows zt routes to co-exist with wifi lans.

https://github.com/zerotier/ZeroTierOne/commit/0da00bf546e06b07e0c0b127d4fdd0e525618fe8

However, this introduces an issue for users who need ALL traffic, regardless of LAN/Wi-Fi/WAN, to be routed to ZT.

Case: I am in China behind the GFW, and having metric=5000 means my DNS requests leak through the Wi-Fi interface, even though 8.8.8.8 should be routed via the 0.0.0.0/1 route that ZT installs with allowDefault=1, because the Wi-Fi routes have a lower metric than 5000, at least on Fedora Rawhide (37 alpha).

0.0.0.0/1 via 172.25.204.155 dev zteb4pu4gr proto static metric 5000 
default via 192.168.31.1 dev wlp1s0 proto dhcp src 192.168.31.252 metric 4000 
128.0.0.0/1 via 172.25.204.155 dev zteb4pu4gr proto static metric 5000 
172.25.0.0/16 dev zteb4pu4gr proto kernel scope link src 172.25.29.132 metric 450 
192.168.31.0/24 dev wlp1s0 proto kernel scope link src 192.168.31.252 metric 4000 

Please note that the DNS leak is partially a systemd-resolved bug (it uses the wrong interface), as noted in https://github.com/systemd/systemd/issues/23655
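
For reference, a rough Python model of plain longest-prefix-match selection over the table above (an assumption about how the main routing table is consulted; it ignores policy routing rules and any per-interface DNS settings) picks the ZT interface for 8.8.8.8 despite the metric of 5000, because 0.0.0.0/1 is more specific than the default route. That would be consistent with the leak happening on the resolver side. The `egress_device` helper is made up for illustration.

```python
import ipaddress

# (prefix, device, metric) taken from the route table above;
# "default" is written as 0.0.0.0/0.
ROUTES = [
    ("0.0.0.0/1", "zteb4pu4gr", 5000),
    ("0.0.0.0/0", "wlp1s0", 4000),
    ("128.0.0.0/1", "zteb4pu4gr", 5000),
    ("172.25.0.0/16", "zteb4pu4gr", 450),
    ("192.168.31.0/24", "wlp1s0", 4000),
]

def egress_device(dest):
    """Longest prefix first; the metric only breaks ties between equal prefixes."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(prefix), dev, metric)
        for prefix, dev, metric in ROUTES
        if addr in ipaddress.ip_network(prefix)
    ]
    _, dev, _ = max(matches, key=lambda r: (r[0].prefixlen, -r[2]))
    return dev

print(egress_device("8.8.8.8"))       # zteb4pu4gr
print(egress_device("192.168.31.1"))  # wlp1s0
```

In this model the metric only matters when two routes have the same prefix length, so traffic to the local 192.168.31.0/24 subnet stays on Wi-Fi under any metric.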

However, this does not negate the fact that in some hostile environments the VPN (in this case ZeroTier with allowDefault=1) should receive and route all traffic, disregarding local LAN/DHCP/DNS route priorities.

The fix should be a configurable client-side metric in the network settings, defaulting to 5000.