zerotier / ZeroTierOne

A Smart Ethernet Switch for Earth
https://zerotier.com

Make MTU configurable #74

Closed adamierymenko closed 7 years ago

adamierymenko commented 10 years ago

Easy to do, and some bridging users will want to set it to 1500 to minimize pain with old TCP stacks.

adamierymenko commented 10 years ago

Going to do a bit more testing to see if this is really needed, but it'll probably happen.

maci0 commented 9 years ago

I was able to set the MTU using netsh on Windows and ip on Linux, and it seemed to have an effect. I don't quite understand why it's set to 2800 anyway, since the underlying network adapters all use 1500 and this would cause the packet to be split up.

adamierymenko commented 9 years ago

That's interesting. It's not going to stick though -- you've overridden the MTU with a lower number but the next time the interface is re-opened it'll be reset. I also guarantee it won't work for higher numbers than 2800. 1500 will fit in a 2800 max frame size, but 4000 won't.

The impetus for making MTU size an issue was bridging and compatibility. Almost all LANs have 1500 as an MTU. If you bridge that to a virtual LAN with 2800, now you potentially have an issue where something on the virtual side tries to send 2800-byte frames to the physical side and that won't work. They'll be dropped at the bridge. The other direction however will work just fine.

In practice though I've found this to not be a big deal. All the TCP stacks I've seen in the wild so far implement path MTU discovery and will deal with this, though it will cause slower connection establishment from the 2800-MTU side. I've also heard (though haven't confirmed for myself) that some bridges are smart enough to forge an ICMP error reply to the sender to help with PMTU discovery.

But I'll probably still do this. Something tells me that some devices, like really old OSes or printers/scanners etc., might have minimalistic or crummy TCP stacks that would just fail and die in this scenario.

kraney commented 8 years ago

Kernels prior to 2.6.24 do not allow the MTU of a tap device to be configured, so you're stuck with 1500, period. I found this on a ReadyNAS Duo while trying to port ZeroTier there. I was able to compile under gcc-3.4 with some tweaks, and had to add iproute2, but hit a wall with the MTU. So as you say, this would be helpful on really old OSes.

Relevant change here (from 2007) https://github.com/torvalds/linux/commit/4885a50476b95fa0f4caad179a80783508c2fe86

adamierymenko commented 8 years ago

I don't think we'd do much to support a kernel that old, but there are other reasons that MTU configurability would be good. Might also be nice to make it larger under certain circumstances.

So far it has not been a high priority since it turns out Linux's bridging driver handles MTU mismatch very well. It's easy to clamp TCP MSS to the smaller of the two networks' MTUs, making TCP cross the boundary just fine. Even without that most TCP stacks will discover the MTU.
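
As a rough illustration of the MSS clamping mentioned above, here is a minimal sketch. It assumes plain IPv4 and TCP headers of 20 bytes each with no options; the function name is made up for this example and is not part of ZeroTier or the Linux bridge driver.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def clamped_mss(mtu_a: int, mtu_b: int) -> int:
    """Largest TCP payload that fits the smaller of two link MTUs."""
    return min(mtu_a, mtu_b) - IPV4_HEADER - TCP_HEADER


# A 2800-MTU virtual LAN bridged to a 1500-MTU physical LAN.
print(clamped_mss(2800, 1500))  # 1460
```

With the MSS clamped to this value, TCP segments from the 2800-MTU side already fit in a 1500-byte frame on the other side of the bridge.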

kraney commented 8 years ago

Yeah, I wouldn't really advise putting much priority on the old kernel or the ReadyNAS, even myself. The device really doesn't have enough CPU horsepower to support it, anyway, it was just a project of curiosity 'for fun'. I posted mainly because in the process, I found the exact cutoff for 'old OSes' in the Linux case, which could be useful info for prioritization.

Also I agree that allowing larger sizes is really the more interesting case - particularly in AWS EC2, where 9000 byte jumbo frames are the default.

adamierymenko commented 7 years ago

Done in dev. Still needs to be done in Central UI.

adamierymenko commented 7 years ago

MTU can now vary between 1280 and 10000. I don't think there will "ever" be much need for more than 10000, and in any case this would not work well due to the massive number of fragments this would entail.
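
To get a feel for the fragment counts, here is a back-of-the-envelope sketch. It assumes each physical UDP packet carries at most 1444 bytes (the size that shows up in a tcpdump later in this thread) and ignores ZeroTier's own per-packet headers, so the result is only a lower bound. The function name is hypothetical.

```python
import math


def min_fragments(virtual_mtu: int, udp_payload: int = 1444) -> int:
    """Lower bound on physical packets needed for one full-size virtual frame.

    Assumption: udp_payload is the usable size of one physical UDP packet;
    protocol headers are ignored, so real counts may be higher.
    """
    return math.ceil(virtual_mtu / udp_payload)


for mtu in (1280, 2800, 10000):
    print(mtu, min_fragments(mtu))
# 1280 1
# 2800 2
# 10000 7
```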

typcn commented 6 years ago

Hi, how do I configure this in the new Central UI? There seems to be no MTU option.

asbjornenge commented 6 years ago

@adamierymenko can I just set the MTU directly on the interface?

ip link set zt0 mtu 1460

Or is there some other way? I'm having MTU issues inside GCP. Seems they expect 1460...

andrewgdotcom commented 5 years ago

> MTU can now vary between 1280 and 10000.

How? I see no option for it in either the web interface or the API of the network controller. There is an "mtu" option in zerotier-cli but it does nothing.

andrewgdotcom commented 5 years ago

Are there any instructions on how to do this? This is killing us.

laduke commented 5 years ago

curl -X POST "https://my.zerotier.com/api/network/${NETWORK_ID}" -H "Authorization: bearer ${TOKEN}" -d '{"config": {"mtu": 1432}}'
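
For anyone scripting this, the same request can be built with Python's standard library. This is only a sketch mirroring the curl command above: the URL, header, and JSON body come from that command, the 1280 to 10000 range check comes from earlier in this thread, and the function name and placeholder values are made up for the example.

```python
import json
import urllib.request


def build_mtu_request(network_id: str, token: str, mtu: int) -> urllib.request.Request:
    """Build (but do not send) the POST request that sets a network's MTU."""
    if not 1280 <= mtu <= 10000:
        raise ValueError("MTU must be between 1280 and 10000")
    body = json.dumps({"config": {"mtu": mtu}}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://my.zerotier.com/api/network/{network_id}",
        data=body,
        method="POST",
        headers={"Authorization": f"bearer {token}"},
    )


# Placeholder values; substitute your own network ID and API token.
req = build_mtu_request("NETWORK_ID", "TOKEN", 1432)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```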

andrewgdotcom commented 5 years ago

Thanks! Unfortunately it doesn't seem to be the root of our problems... back to the lab. :-(

andrewgdotcom commented 5 years ago

I finally got a response by reducing the MTU to the bare minimum of 1280.

laduke commented 5 years ago

@andrewgdotcom out of curiosity, what kind of network is this on?

andrewgdotcom commented 5 years ago

@laduke We have an assorted collection of hosted servers, mostly in Hetzner but some in online.net, Linode and also gcloud. Tidying that up is a work in progress... :-) We use zerotier to bind them together in a distributed VLAN. This mostly works well, but we were having reliability issues connecting to the gcloud servers from some other hosts. Turns out that the Hetzner virtual network firewall is somehow blocking PMTUD, which means that connections from Hetzner physical machines to gcloud will crash on the first large packet received. tcpdump on the gcloud box shows errors of the form:

15:24:51.340825 IP *censored* > *censored*.33804: UDP, bad length 1444 > 1432

This error occurs on all connections to gcloud over ZT, but the non-Hetzner machines can use PMTUD to recover; the Hetzner machines cannot, unless I disable the network virtual firewall, which I am unwilling to do long-term.

I first tried reducing the MTU to 1432, but the exact same errors still appeared. I then assumed that there was some overhead, so reduced to 1420 but no change. 1400, no change. The same errors appeared - with the same offending numbers (!?).

So I said "screw it" and reduced all the way to 1280 to see what would happen. And it worked...
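
One possible reading of the two numbers in that tcpdump line, sketched as arithmetic. It assumes IPv4 without options (20-byte header) plus an 8-byte UDP header, and a 1460-byte path MTU on the gcloud side, as mentioned earlier in this thread for GCP. This is an interpretation, not a confirmed diagnosis.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes


def max_udp_payload(path_mtu: int) -> int:
    """Largest UDP payload that fits in one unfragmented IPv4 packet."""
    return path_mtu - IPV4_HEADER - UDP_HEADER


print(max_udp_payload(1460))          # 1432, the limit in the tcpdump line
print(max_udp_payload(1500))          # 1472, so 1444 fits on a 1500-byte path
print(1444 <= max_udp_payload(1460))  # False: 1444 does not fit on a 1460-byte path
```

If this reading is right, the 1444-byte datagrams are sized for a 1500-byte physical path, and they are too large for the 1460-byte one.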

andrewgdotcom commented 5 years ago

I have also now just discovered that reducing the MTU breaks Docker. http://atodorov.org/blog/2017/12/08/how-to-configure-mtu-for-the-docker-network/ :-1:

laduke commented 5 years ago

Thanks!

myfingerhurt commented 5 years ago

I experienced an issue with the default MTU of 2800. I was trying to access the SSH and web UI of my AC86U-merlinwrt 384.13 router through its ZeroTier address (such as 10.9.8.4). It worked perfectly until I moved house and changed my ISP.

With the default MTU of 2800, whenever I ran ps or ifconfig in a console over the ZeroTier address from a remote node, PuTTY would hang after around 10 lines of output and disconnect a few seconds later.

router@RT-AC86U:/tmp/home/root# ps
  PID USER       VSZ STAT COMMAND
    1 router    9352 S    /sbin/init
    2 router       0 SW   [kthreadd]
    3 router       0 SW   [ksoftirqd/0]
    5 router       0 SW<  [kworker/0:0H]
    7 router       0 SW   [rcu_preempt]
    8 router       0 SW   [rcu_sched]
    9 router       0 SW   [rcu_bh]
   10 router       0 SW   [migration/0]
   11 router       0 SW   [watchdog/0]

PuTTY hung here.

I also tested the web UI as follows. Shorter responses worked, but loading the main page hung.

> curl http://10.9.8.4/
<HTML><HEAD><script>top.location.href='/Main_Login.asp';</script>
</HEAD></HTML>
> curl http://10.9.8.4/Main_Login.asp
(nothing until timeout)

I saw https://github.com/zerotier/ZeroTierOne/issues/935 and changed my ZeroTier MTU to 1388. In my case ZeroTier works with an MTU of 1388 or below.

ifconfig ztzlgf7vul mtu 1388

Boom! PuTTY works instantly, and the web UI opens in a browser too.

I have three AC68U routers. Two of them work well with the ZeroTier MTU at 2800, but one behaved the same as the AC86U behind the same ISP. I had to change the MTU on both of those to 1388.

dtoubelis commented 4 years ago

ZeroTier should not rely on packet fragmentation, considering the proliferation of IPv6. We already have a situation where we have an IPv4-via-IPv6 GRE tunnel, and any packets larger than the minimum MTU of this link are dropped, even if they are IPv4 packets. As a result ZeroTier doesn't work through this link, since its effective MTU is around 1330. MTU discovery works properly on this link, but ZeroTier does not seem to be aware of the link's MTU.

JocPelletier commented 4 years ago

I'm also having MTU issues. When I change from 2800 to 1280 with ip link set, everything works perfectly.

I can't SSH to some of my peers even though they are online on the network. Is there a way to change the MTU of the entire network? I tried the curl command above, but the MTU remains 2800 even though I confirmed the "mtu": 1280 config with a curl GET.

laduke commented 4 years ago

You might have to restart the zerotier service, or leave and rejoin the network.

JocPelletier commented 4 years ago

> You might have to restart the zerotier service, or leave and rejoin the network.

I already tried to unauthorize and reauthorize the network member. I also tried restarting it; it's a Raspberry Pi running DietPi OS. ip a still gives me mtu 2800 for my ZeroTier interface.

pageuppagedown commented 3 years ago

curl -X POST "https://my.zerotier.com/api/network/${NETWORK_ID}" -H "Authorization: bearer ${TOKEN}" -d '{"config": {"mtu": 1432}}'

This worked for me; however, I had to discover the proper MTU to use. I've got a couple of moons configured at Linode, and was having lots of problems SSH'ing to them while on hotel wireless.

I used the following command to determine the proper MTU to use. (The suggestion was found here.)

ping -s $((1390 - 28)) -D <IP ADDRESS OF REMOTE ZEROTIER NODE> -c 1

I lowered the number over and over until I eventually got it down to 1354 before the ping started working. After figuring out the right number, I manually adjusted the MTUs on the ZeroTier interfaces on my two boxes at Linode, and everything worked great with SSH.

NOTE: If your testing works with the above number (1390), you should increase it until it stops working. Ideally, you want to find the maximum number and use that. Your mileage will vary, and what works for one network may not work for another.
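
The manual raise-and-lower procedure above is essentially a binary search for the largest size that still gets through. A minimal sketch follows, assuming that if a given size works then every smaller size works too. The `probe` callable is hypothetical: in practice it would wrap a don't-fragment ping like the one above (`-D` on macOS/BSD ping; Linux iputils uses `-M do` for the same purpose) and return True on success.

```python
from typing import Callable, Optional


def largest_working_mtu(
    probe: Callable[[int], bool], low: int = 1280, high: int = 2800
) -> Optional[int]:
    """Return the largest MTU in [low, high] for which probe() succeeds.

    Returns None if even the lowest candidate fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid       # mid works, look higher
        else:
            high = mid - 1  # mid fails, look lower
    return low


# Stand-in probe for demonstration: pretend the path passes up to 1354 bytes.
print(largest_working_mtu(lambda size: size <= 1354))  # 1354
```

Searching this way takes about 11 probes to cover 1280 to 2800 instead of stepping down one value at a time.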

Unfortunately, manual MTU changes do not persist between ZeroTier restarts. To make it permanent, I used the above curl command to set the MTU across the whole network. In order to execute the command, though, I needed to create an API token by going here. I saved it as an environment variable (ZEROTIER_API_TOKEN) so I could run the following command.

curl -X POST "https://my.zerotier.com/api/network/<NETWORK ID>" -H "Authorization: bearer ${ZEROTIER_API_TOKEN}" -d '{"config": {"mtu": 1354}}'

It appears that the MTU is set whenever ZeroTier starts, so in order to ensure that you've got the right MTU, you must restart services. Since I'm using Debian, I just used the following.

systemctl restart zerotier-one

I would imagine that this may end up making its way to the web GUI (my.zerotier.com) at some point in the future, if it isn't already (didn't find it when I looked).

Hope this helps!

romanos-p commented 2 years ago

Changing it for the entire network might not be the best solution. I'm the only one on the network that needs a lower MTU and it seems to be because I use a full tunnel VPN. The others have no problem working with 2800. Is there a way to configure my client to use a lower MTU?

christian-schlichtherle commented 1 year ago

Seems like everyone is having success with different MTU sizes. For our network, I used binary search to figure that MTU=1292 works for every node. I've tested with iperf -s and iperf3 --bidir -c ....

For the record (I know this is a very late answer to @romanos-p): I would not set different MTUs for different nodes or otherwise you will inevitably have connectivity issues.

pjkundert commented 2 months ago

This is still a problem (clients that fail to communicate are running ZeroTier version 1.14.0). For me, I found setting an MTU of 1194 to be necessary.

I'm investigating whether this perhaps is actually an MTU problem in the underlying transport, not in ZeroTier?

laduke commented 2 months ago

I think the issue is the physical MTU. Lowering the MTU on a virtual network happens to help by side-effect.

Has anyone tried setting the physical MTU in local.conf?

"physical": { /* Settings that apply to physical L2/L3 network paths. */
    "NETWORK/bits": { /* Network e.g. 10.0.0.0/24 or fd00::/32 */
        "blacklist": true|false, /* If true, blacklist this path for all ZeroTier traffic */
        "trustedPathId": 0|!0, /* If present and nonzero, define this as a trusted path (see below) */
        "mtu": 0|!0 /* if present and non-zero, set UDP maximum payload MTU for this path */
    } /* ,... additional networks */
},

The minimum physical MTU is capped at 1400 in the code, though.
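
A concrete local.conf fragment can be generated from the template above. This is an untested sketch: the network range and MTU value are placeholders, the function name is made up, and the 1400 floor simply mirrors the cap described in this comment rather than anything checked against the ZeroTier source.

```python
import json

MIN_PHYSICAL_MTU = 1400  # floor taken from the comment above, not verified in code


def physical_mtu_config(network: str, mtu: int) -> str:
    """Return local.conf JSON that sets the physical path MTU for one network."""
    effective = max(mtu, MIN_PHYSICAL_MTU)  # lower values are raised to the floor
    return json.dumps({"physical": {network: {"mtu": effective}}}, indent=2)


# Placeholder range; use the physical network your ZeroTier traffic crosses.
print(physical_mtu_config("10.0.0.0/24", 1400))
```

The output would go in local.conf in the ZeroTier home directory, followed by a service restart.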