microsoft / WSL

Network packet discrepancy/drop between Windows Host and WSL2 #10989

Open nnathan opened 8 months ago

nnathan commented 8 months ago

Windows Version

Microsoft Windows [Version 10.0.22621.2861]

WSL Version

2.0.0.0 & 2.0.14.0

Are you using WSL 1 or WSL 2?

WSL 2

Kernel Version

5.15.123.1-microsoft-standard-WSL2 (2.0.0.0)
5.15.133.1-microsoft-standard-WSL2 (2.0.14.0)

Distro Version

Ubuntu 22.04.3 LTS

Other Software

Wireshark
tcpdump (on wsl2)
Cygwin + ping

Repro Steps

  1. Run on wsl2: sudo tcpdump -w /tmp/wsl2.pcap -i eth1 host 1.1.1.1 and icmp
  2. Run on the Windows host: Wireshark on the Internet-facing network adapter (in my case Intel Wi-Fi) with the capture filter host 1.1.1.1 and icmp
  3. Run on wsl2: sudo ping -c 1000 -f 1.1.1.1
  4. Stop tcpdump and Wireshark (and in Wireshark save the capture to c:\temp\windowshost.pcap)

Expected Behavior

0% packet loss

Actual Behavior

PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
........................
--- 1.1.1.1 ping statistics ---
1000 packets transmitted, 976 received, 2.4% packet loss, time 9309ms
rtt min/avg/max/mdev = 7.780/9.185/55.814/2.277 ms, pipe 4, ipg/ewma 9.318/9.039 ms

Diagnostic Logs

The steps above do not by themselves show that packets are being dropped between the host and the WSL2 VM. However, the attached pcaps.zip contains both the wsl2 pcap and the Windows host pcap.

Here is the number of packets captured in each:

$ tcpdump -n -r windowshost.pcap 2>/dev/null  | wc -l
2000
$ tcpdump -n -r wsl2.pcap 2>/dev/null | wc -l
1976

In windowshost.pcap, 1000 ICMP echo requests were sent and 1000 ICMP echo replies were received, indicating 0% packet loss.

In wsl2.pcap, 1000 ICMP echo requests were sent but only 976 ICMP echo replies were received, which accounts for the 2.4% packet loss.

Somehow 24 packets were dropped between the host and the WSL2 VM.
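
The loss figures follow directly from the packet counts above. As a minimal sketch of that arithmetic, assuming each capture contains only ICMP echo requests and echo replies (which the capture filters should ensure), the reply count is the total minus the requests sent. The function name is hypothetical and used only for illustration.

```python
def reply_loss_percent(requests_sent: int, packets_captured: int) -> float:
    """Return the percentage of echo requests that have no reply in a capture.

    Assumption: the capture holds only echo requests and echo replies,
    so replies = packets_captured - requests_sent.
    """
    if requests_sent <= 0:
        raise ValueError("requests_sent must be positive")
    replies = packets_captured - requests_sent
    if not 0 <= replies <= requests_sent:
        raise ValueError("packet count is inconsistent with the requests sent")
    return 100.0 * (requests_sent - replies) / requests_sent


# Counts reported above: 2000 packets on the Windows host, 1976 in WSL2.
host_loss = reply_loss_percent(1000, 2000)
wsl2_loss = reply_loss_percent(1000, 1976)
print(f"host: {host_loss:.1f}%  wsl2: {wsl2_loss:.1f}%")
```

With the counts from the two captures this gives 0% on the host side and 2.4% on the WSL2 side, matching the ping summary.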

This is consistently reproducible on a Windows host connected over Wi-Fi. The packet loss usually ranges from 2% to 20%, even when there is no connectivity or congestion issue between the host, the router, and upstream.

However, the issue only seems to manifest when pinging an Internet host such as 1.1.1.1 or 8.8.8.8. For example, ping -c 1000 -f 192.168.0.1 to the Wi-Fi router always yields 0% loss. As the Windows host packet capture shows, this is not an upstream or router issue, since all 1000 requests and all 1000 replies reach the Internet-facing network adapter. I've also verified that there is 0% packet loss between the Windows host and both 1.1.1.1 and 8.8.8.8 using Cygwin and its bundled version of ping.

chanpreetdhanjal commented 8 months ago

Hi. Can you please collect networking logs by following the instructions below? https://github.com/microsoft/WSL/blob/master/CONTRIBUTING.md#collect-wsl-logs-for-networking-issues

nnathan commented 8 months ago

Hi, I ran the experiment while collecting networking logs per the instructions.

See attached: WslNetworkingLogs-2024-01-10_11-02-34.zip windowshost-2024-01-10_11-02-34.pcap.zip

nnathan commented 7 months ago

I should note that in the experiment above, with the attached networking logs, ping reported 1.6% packet loss while the host pcap showed 0% loss. You can verify in tcpdump.log that only 984 (of 1000) ICMP echo reply packets were received by WSL2.
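
To see which replies went missing rather than just how many, the ICMP sequence numbers in the two captures can be compared. The following is a minimal sketch, assuming both captures have been dumped to text with tcpdump -n -r, that reply lines have the usual "ICMP echo reply, id N, seq N" form, and that sequence numbers pass through the host's NAT unchanged. The function names and the sample lines (with 192.0.2.10 as a placeholder address) are hypothetical and not taken from the attached logs.

```python
from __future__ import annotations

import re

# Matches the reply lines printed by `tcpdump -n -r <file>` for ICMP echo replies.
_REPLY = re.compile(r"ICMP echo reply, id \d+, seq (\d+)")


def reply_seqs(tcpdump_text: str) -> set[int]:
    """Collect the sequence numbers of all echo replies in tcpdump text output."""
    return {int(m.group(1)) for m in _REPLY.finditer(tcpdump_text)}


def missing_replies(host_text: str, wsl2_text: str) -> list[int]:
    """Sequence numbers whose reply reached the host capture but not the WSL2 one."""
    return sorted(reply_seqs(host_text) - reply_seqs(wsl2_text))


# Made-up sample input: the reply for seq 2 appears only in the host capture.
host_sample = """\
11:02:35.000001 IP 192.0.2.10 > 1.1.1.1: ICMP echo request, id 7, seq 1, length 64
11:02:35.008001 IP 1.1.1.1 > 192.0.2.10: ICMP echo reply, id 7, seq 1, length 64
11:02:35.010001 IP 192.0.2.10 > 1.1.1.1: ICMP echo request, id 7, seq 2, length 64
11:02:35.018001 IP 1.1.1.1 > 192.0.2.10: ICMP echo reply, id 7, seq 2, length 64
"""
wsl2_sample = """\
11:02:35.000000 IP 192.0.2.10 > 1.1.1.1: ICMP echo request, id 7, seq 1, length 64
11:02:35.008500 IP 1.1.1.1 > 192.0.2.10: ICMP echo reply, id 7, seq 1, length 64
11:02:35.010000 IP 192.0.2.10 > 1.1.1.1: ICMP echo request, id 7, seq 2, length 64
"""

print(missing_replies(host_sample, wsl2_sample))
```

Run against the real text dumps, the resulting list would show whether the dropped replies are clustered in bursts or spread evenly across the run.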

nnathan commented 1 month ago

Is there any update on this?