microsoft / WSL

Issues found on WSL
https://docs.microsoft.com/windows/wsl

UDP packets of size less than 12 bytes are not sent from WSL to host #8610

Closed qqshka closed 2 months ago

qqshka commented 2 years ago

Version

Microsoft Windows [Version 10.0.22000.795]

WSL Version

Kernel Version

5.10.102.1

Distro Version

Ubuntu 20.04

Other Software

No response

Repro Steps

  1. Run some sort of UDP server on the host system. In my case I use a primitive server written in Go; Wireshark is also fine.
  2. Send UDP packets from the WSL2 system to the host. In my case I use the netcat command-line utility.
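
The two repro steps can be sketched in Python as a rough stand-in for the Go server and netcat, which are not shown in the report. The function names and the 2 to 13 byte range (taken from the Expected Behavior section) are illustrative assumptions, not the reporter's actual tooling.

```python
import socket


def send_probe_payloads(host, port, sizes=range(2, 14)):
    """Send one UDP datagram per payload size and return the sizes sent."""
    sent = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for size in sizes:
            sock.sendto(b"x" * size, (host, port))
            sent.append(size)
    return sent


def receive_payload_sizes(sock, expected_count, timeout=1.0):
    """Collect payload sizes until expected_count datagrams arrive or the timeout expires."""
    sock.settimeout(timeout)
    sizes = []
    try:
        while len(sizes) < expected_count:
            data, _addr = sock.recvfrom(2048)
            sizes.append(len(data))
    except socket.timeout:
        pass  # whatever never arrived is simply missing from the result
    return sizes
```

Run the receiver on the Windows host with a socket bound to the chosen port, and the sender inside WSL2 pointed at the host address. On an affected system, the sizes missing from the receiver's list should be the ones below 12 bytes.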

Expected Behavior

I sent few UDP packets from WSL2 to the host, starting with UDP payload size of 2 bytes up to 13 bytes, expecting to receive all packets.

Actual Behavior

Only packets with payload size 12 or more received on the host.

Diagnostic Logs

No response

elsaco commented 2 years ago

@qqshka try scapy also. While using nc couldn't send small UDP payloads, it works with scapy:

>>> send(IP(dst="192.168.64.1")/UDP(sport=5201,dport=5201)/Raw(load="1"))
.
Sent 1 packets.
>>> send(IP(dst="192.168.64.1")/UDP(sport=5201,dport=5201)/Raw(load="12"))
.
Sent 1 packets.
>>> send(IP(dst="192.168.64.1")/UDP(sport=5201,dport=5201)/Raw(load="123"))
.
Sent 1 packets.

and the listener on Windows side:

[screenshot of the listener output]

qqshka commented 2 years ago

And your point is? Yes, scapy works while nc does not. nc was only an example of how to reproduce the problem; the fact is that almost any application fails to send short UDP packets. scapy seems to be an exception, I guess because it uses raw sockets or something similar (and probably calculates checksums on its own). I compared the packets from nc and scapy, and the nc packet has a bad checksum. The first packet below was sent with scapy, the second with nc.

root@DESKTOP-IVAN:~# tcpdump -n -i eth0 udp port 25000 -vvv -X
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
01:53:48.723346 IP (tos 0x0, ttl 64, id 1, offset 0, flags [none], proto UDP (17), length 29)
    172.22.123.250.25000 > 172.22.112.1.25000: [udp sum ok] UDP, length 1
        0x0000:  4500 001d 0001 0000 4011 36a7 ac16 7bfa  E.......@.6...{.
        0x0010:  ac16 7001 61a8 61a8 0009 c762 31         ..p.a.a....b1

01:54:26.275311 IP (tos 0x0, ttl 64, id 10470, offset 0, flags [DF], proto UDP (17), length 30)
    172.22.123.250.38966 > 172.22.112.1.25000: [bad udp cksum 0x4444 -> 0x90c8!] UDP, length 2
        0x0000:  4500 001e 28e6 4000 4011 cdc0 ac16 7bfa  E...(.@.@.....{.
        0x0010:  ac16 7001 9836 61a8 000a 4444 310a       ..p..6a...DD1.

Also, I should add that the problem seems to have appeared after I installed a recent Windows update.
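
The checksums in the tcpdump capture above can be recomputed by hand. The sketch below implements the standard UDP checksum over the IPv4 pseudo-header; the helper names are illustrative. With the addresses and ports from the capture it yields 0xc762 for the scapy packet and 0x90c8 for the nc packet, matching what tcpdump expected. The 0x4444 that nc's packet actually carried equals the one's-complement sum of the pseudo-header alone. That is consistent with the kernel leaving a partial checksum for the virtual NIC to finish (checksum offload), though this reading is an inference from the numbers rather than something stated in the thread.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """IPv4 pseudo-header: source, destination, zero byte, protocol 17, UDP length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_length))


def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> int:
    """Full UDP checksum over pseudo-header, UDP header (checksum field zero) and payload."""
    udp_length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, udp_length, 0)
    total = ones_complement_sum(
        udp_pseudo_header(src_ip, dst_ip, udp_length) + header + payload)
    checksum = ~total & 0xFFFF
    return checksum or 0xFFFF  # a computed zero is transmitted as 0xFFFF


SRC, DST = "172.22.123.250", "172.22.112.1"
scapy_sum = udp_checksum(SRC, DST, 25000, 25000, b"1")        # first packet
nc_expected = udp_checksum(SRC, DST, 38966, 25000, b"1\n")    # second packet
nc_partial = ones_complement_sum(udp_pseudo_header(SRC, DST, 10))
print(hex(scapy_sum), hex(nc_expected), hex(nc_partial))
```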

lucasdrufva commented 2 years ago

Got the same issue after a recent Windows update; currently on OS version 19044.1826.

wavemobileDev commented 2 years ago

After a lot of messing around trying to debug a test system I can confirm that I have found exactly the same issue. Two of our test systems failed pretty much at the same time, both use UDP based comms (small packets) and Win10/11 with WSL Ubuntu 20.04.4 LTS. Both systems also received Windows updates over the time of failure.

Thank you to qqshka for reporting this - I was beginning to think I was going mad as I went down the rabbit hole of routing, bridging, NAT'ing and firewalls before stumbling on the length issue by typing a cursing rant into netcat!

Win10 Ver: 19044.1826
Win11 Ver: 22000.795

markymarrow commented 2 years ago

I'm seeing similar behaviour from an Ubuntu 22.04 guest on a Hyper-V host. I've spent the day wondering why a TFTP transfer was failing: the last packet was 4 bytes. Adding 8 zero bytes to the end of the file 'fixed' the transfer.

So, might not be WSL's fault?

This is running Windows 11 pro 21H2 (22000.795)

gdumke commented 2 years ago

I used iperf3 between a WSL2 distro on a Win10 machine (client side) and a Fedora 36 machine (server side) to test the UDP transfers:

iperf3 command on client side was sudo iperf3 -4 -u -c xx.xx.xx.xx -p 5555

However, the same iperf3 command entered in a classic cmd console (WSL not involved) works whatever the Win10 version is. Wireshark shows that the first UDP packet exchanged by iperf3 has a 4-byte payload.

Not found any workaround so far.

jstarks commented 2 years ago

We are looking into this. In the meantime, you should be able to work around this by disabling checksum offload from within WSL. As root:

ethtool -K eth0 tx off

This is global across all your distros, but it is not persisted across WSL restarts.
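
Because the setting is lost on restart, one option is to check and re-apply it from a small script at distro startup. The sketch below is only an illustration: the function names are made up, it assumes `ethtool` is installed and the script runs as root, and it relies on the usual `name: on|off` line layout of `ethtool -k` output.

```python
import subprocess


def parse_offload_features(ethtool_output: str) -> dict:
    """Map feature names from `ethtool -k` output to True (on) or False (off)."""
    features = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.strip().partition(":")
        if not sep or not value.strip():
            continue  # skips the "Features for eth0:" header and blank lines
        state = value.split()[0]
        if state in ("on", "off"):
            features[name] = state == "on"
    return features


def tx_checksumming_enabled(interface: str = "eth0") -> bool:
    """Query the current TX checksum offload state of an interface."""
    result = subprocess.run(["ethtool", "-k", interface],
                            check=True, capture_output=True, text=True)
    return parse_offload_features(result.stdout).get("tx-checksumming", False)


def disable_tx_checksum_offload(interface: str = "eth0") -> None:
    """Apply the workaround from this comment if offload is still on (needs root)."""
    if tx_checksumming_enabled(interface):
        subprocess.run(["ethtool", "-K", interface, "tx", "off"], check=True)
```

Calling `disable_tx_checksum_offload()` after each WSL restart has the same effect as typing the command by hand; it does nothing when offload is already off.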

thegitworker commented 2 years ago

I have a similar issue, except the UDP packets being blocked are larger than 12 bytes. According to Wireshark, the packets carry 118 bytes of data. The workaround posted by @jstarks fixed the issue. Before entering the command, Wireshark showed no UDP packets at all; after entering it, I could see UDP traffic come through.

What I wanted to note is this started with KB5015807. When my laptop installed KB5015807 I immediately noticed the issue of UDP packets missing. I blocked this update, but today I experienced the same issue with KB5016616. It was previously mentioned that the issue occurred after updating, but not which updates so I wanted to add that information in case it helps figuring out this issue.

Summary: I have UDP packets larger than 12 bytes not leaving WSL2 (118 bytes of data).
Before KB5015807 and KB5016616: UDP in WSL2 works perfectly fine.
After installing KB5015807 or KB5016616: UDP in WSL2 breaks.
After installing KB5015807 or KB5016616 and using ethtool: UDP in WSL2 works perfectly fine.

dktapps commented 2 years ago

I and my team have encountered this in several programs including PHP.

I can confirm that uninstalling update KB5015807 solves the problem.

xulonc commented 2 years ago

I have the same problem.
Wireshark can't capture UDP data when the payload is smaller than 12 bytes.

julianxhokaxhiu commented 2 years ago

Can confirm I had the same problem testing a simple game server echo bot from Agones.

kvnnap commented 2 years ago

Same issue here when using netcat inside WSL2 (or any app using UDP). Small packets do not go through to the receiving end while larger packets do go through. Receiving packets from the other end into WSL2 netcat client works for any size.

BubbaJoeX commented 2 years ago

Can confirm this issue.

PJB3005 commented 2 years ago

Can confirm, this isn't specific to WSL. It also happens on other Hyper-V VMs, which is how I found this issue (god what a debugging nightmare).

gdumke commented 2 years ago

> We are looking into this. In the meantime, you should be able to work around this by disabling checksum offload from within WSL. As root:
>
> ethtool -K eth0 tx off
>
> This is global across all your distros, but it is not persisted across WSL restarts.

@jstarks what is the status on this issue please? We desperately need a fix for this as it impacts a few of our products on the field. The workaround you mentioned is helping a lot but it makes our processes heavier (we need to restart the WSL engine quite a few times a day as we often need to bridge/unbridge the WSL network in order to send broadcast messages from our distros), and not all of us are familiar with CLI.

Is there an estimate on when a fix will be released?

gdumke commented 1 year ago

It seems fixed by a recent Windows update. However I don't know which one. I'm running Windows 10 Pro 21H2 build 19044.2130 (last update being KB5018410) and it looks okay.

julianxhokaxhiu commented 1 year ago

Can confirm that I'm also on 10.0.19045.2251 and it seems to work fine now.

xulonc commented 1 year ago

I'm on Windows 11 (22H2, build 22621.819) and it works fine.

grpatter commented 1 year ago

Win 11 (22H2 22621.525) still has this problem.

As per https://support.microsoft.com/en-gb/topic/windows-11-version-22h2-update-history-ec4229c3-9c5f-4e75-9d6d-9025ab70fcce 22621.525 is apparently KB5019311, released Sep 27, 2022. 22621.819 is apparently KB5019980, released Nov 8, 2022.

If necessary, this may help someone track down the specific change.
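
To narrow down where the fix landed, the build numbers reported in this thread can be grouped by major build. The sketch below is only a bookkeeping aid: the helper names are made up, and the table contains just the builds that commenters explicitly described as broken or working.

```python
def parse_build(build: str):
    """Split a build string such as '22621.819' into (major, revision)."""
    major, revision = build.split(".")
    return int(major), int(revision)


# (build, works) pairs as reported by commenters in this thread.
REPORTS = [
    ("22000.795", False),
    ("19044.1826", False),
    ("19044.2130", True),
    ("19045.2251", True),
    ("22621.525", False),
    ("22621.819", True),
]


def fix_windows(reports):
    """Per major build: (highest revision reported broken, lowest reported working)."""
    windows = {}
    for build, works in reports:
        major, revision = parse_build(build)
        broken, working = windows.get(major, (None, None))
        if works:
            working = revision if working is None else min(working, revision)
        else:
            broken = revision if broken is None else max(broken, revision)
        windows[major] = (broken, working)
    return windows


for major, (broken, working) in sorted(fix_windows(REPORTS).items()):
    print(f"{major}: last broken = {broken}, first working = {working}")
```

For 22621, this puts the change somewhere after revision 525 and no later than 819, which matches the two updates named above.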

Cyvster commented 1 year ago

I had this issue the entire time I was running Windows Server 2016. I was running a game called SWGEMU in a Hyper-V VM with Ubuntu 20.04 and encountered constant disconnects in that game until I ran the above command (ethtool -K eth0 tx off). I had to run that command every time I started the VM.

I just upgraded to Windows Server 2019 and the issue is resolved. I no longer have to run that command on the VM to stay connected to the game server.

esolk commented 1 year ago

> > We are looking into this. In the meantime, you should be able to work around this by disabling checksum offload from within WSL. As root:
> >
> > ethtool -K eth0 tx off
> >
> > This is global across all your distros, but it is not persisted across WSL restarts.
>
> @jstarks what is the status on this issue please? We desperately need a fix for this as it impacts a few of our products on the field. The workaround you mentioned is helping a lot but it makes our processes heavier (we need to restart the WSL engine quite a few times a day as we often need to bridge/unbridge the WSL network in order to send broadcast messages from our distros), and not all of us are familiar with CLI.
>
> Is there an estimate on when a fix will be released?

This solved my problem. I'm working on build 22621.2215, and the client is Ubuntu 18.04 on WSL2. Thanks a lot.

microsoft-github-policy-service[bot] commented 2 months ago

This issue has been automatically closed since it has not had any activity for the past year. If you're still experiencing this issue please re-file this as a new issue or feature request.

Thank you!