microsoft / WSL

Issues found on WSL
https://docs.microsoft.com/windows/wsl

WSL2 with networkingMode=mirrored and multiple Windows users fails after first user successfully provisions networking. #11015

Closed idatum closed 10 months ago

idatum commented 10 months ago

Windows Version

Microsoft Windows version 22631.3007

WSL Version

2.0.14.0

Are you using WSL 1 or WSL 2?

WSL 2

Kernel Version

5.15.133.1-microsoft-standard-WSL2

Distro Version

Debian 12.4

Other Software

WSLg version 1.0.59

Repro Steps

With a Windows 11 device with at least 2 separate users -- each user with its own WSL2 distro installation -- launch WSL2 on the first user with networkingMode=mirrored enabled in .wslconfig. This should correctly start networking, including IPv6.

With the first user still logged in, switch users and start WSL2 as the second user in that user's separate WSL2 distro, with the same networkingMode setting in .wslconfig. The following error occurs for the second user:

    The operation timed out because a response was not received from the virtual machine or container.
    Error code: Wsl/Service/CreateInstance/CreateVm/HCS_E_CONNECTION_TIMEOUT

    [process exited with code 4294967295 (0xffffffff)]

It doesn't matter which user's WSL2 I start first.

Expected Behavior

Expectation is networkingMode=mirrored works for all users simultaneously logged in.

Actual Behavior

Only the first user's WSL2 networking works with mirrored.

Diagnostic Logs

Log Name: System
Source: Microsoft-Windows-Hyper-V-VmSwitch
Date: 1/12/2024 11:52:06
Event ID: 32
Task Category: (1022)
Level: Error
Keywords: (128)
User: S-1-5-83-1-1555666032-1110446901-1052638878-2743957777
Computer: computername
Description: Failed to connect NIC 5CB99470-1335-4230-9EFE-BD3E11798DA3--03E62423-E72E-4EB5-A235-1B2047585C98 (Friendly Name: ) to port D6C7BE28-647D-46BC-94FC-49CD493D51E9 (Friendly Name: ) on switch 30BE601B-A2AB-4EDC-9AD5-9D2600CF7CF0 (Friendly Name: ), status = Unknown NTSTATUS Error code: 0xc0010022. UniqueEvent = 100.
Event Xml: 32 0 2 1022 0 0x8000000000000080 598499 System computername 3221291042 74 5CB99470-1335-4230-9EFE-BD3E11798DA3--03E62423-E72E-4EB5-A235-1B2047585C98 0 36 D6C7BE28-647D-46BC-94FC-49CD493D51E9 1 36 30BE601B-A2AB-4EDC-9AD5-9D2600CF7CF0 1 100

Log Name: System
Source: Microsoft-Windows-Hyper-V-VmSwitch
Date: 1/12/2024 11:52:06
Event ID: 35
Task Category: (1022)
Level: Error
Keywords: (128)
User: S-1-5-83-1-1555666032-1110446901-1052638878-2743957777
Computer: computername
Description: Failed to connect NIC 5CB99470-1335-4230-9EFE-BD3E11798DA3--03E62423-E72E-4EB5-A235-1B2047585C98 (Friendly Name: ) to port D6C7BE28-647D-46BC-94FC-49CD493D51E9 (Friendly Name: D6C7BE28-647D-46BC-94FC-49CD493D51E9) on switch 30BE601B-A2AB-4EDC-9AD5-9D2600CF7CF0 (Friendly Name: FSE Switch). The task was vetoed by a switch extension, or the switch extension stack is corrupted. Status = Unknown NTSTATUS Error code: 0xc0010022.
Event Xml: 35 0 2 1022 0 0x8000000000000080 598498 System computername 3221291042 74 5CB99470-1335-4230-9EFE-BD3E11798DA3--03E62423-E72E-4EB5-A235-1B2047585C98 0 36 D6C7BE28-647D-46BC-94FC-49CD493D51E9 36 D6C7BE28-647D-46BC-94FC-49CD493D51E9 36 30BE601B-A2AB-4EDC-9AD5-9D2600CF7CF0 10 FSE Switch 0


ghost commented 10 months ago

/logs please

idatum commented 10 months ago

I included logs for the second user's instance of WSL2 running (after successfully starting WSL2 in a separate user session). Unfortunately, I could not get a full repro with the same error. What I do see is a failure to create NICs for the second instance:

    ~$ ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever

The first user's instance has the correct networking configured (including IPv6 on eth0).

WslLogs-2024-01-16_13-21-59.zip
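
The symptom described above (only the loopback device present) can be checked from inside a distro without reading `ip a` output by eye. This is a minimal sketch, not a WSL tool: the helper names `non_loopback` and `current_interfaces` are hypothetical, and it assumes the loopback device is named `lo`, as in the output shown.

```python
import socket


def non_loopback(names: list[str]) -> list[str]:
    """Return the interface names other than the loopback device "lo"."""
    return [name for name in names if name != "lo"]


def current_interfaces() -> list[str]:
    """Return the interface names reported by the OS (run inside the distro)."""
    return [name for _index, name in socket.if_nameindex()]
```

Calling `non_loopback(current_interfaces())` in the failing second instance would be expected to return an empty list, while the working first instance should list `eth0`.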

idatum commented 10 months ago

Not directly related to this issue, but I noticed another networking issue with WSL2. Let me know if you want me to open a separate issue with logs: if I have a virtual network switch created on the same physical NIC as a WSL2 instance using networkingMode=mirrored, the instance never gets an IPv4 or IPv6 address. It works fine, though, for a WSL2 instance using networkingMode=nat -- I get an IPv4 address.

ghost commented 10 months ago

I don't fully understand the second issue you're describing, but if you file it separately that'd be awesome.

keith-horton commented 10 months ago

"With a Windows 11 device with at least 2 separate users -- each user with its own WSL2 distro installation -- launch WSL2 on the first user with networkingMode=mirrored enabled in .wslconfig. This should correctly start networking, including IPv6.

With the first user still logged in, switch users and start WSL2 on that separate user and separate WSl2 distro, with the same networkingMode setting in .wslconfig. The following error occurs for the 2nd user: The operation timed out because a response was not received from the virtual machine or container. Error code: Wsl/Service/CreateInstance/CreateVm/HCS_E_CONNECTION_TIMEOUT"

Sorry, there's a limitation with Windows that only 1 WSL instance can be run with mirroring enabled at any one point in time - any other WSL instance will need to have NAT configured.

Sorry about this :(

idatum commented 10 months ago

Thanks for confirming I need nat mode for other instances.
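
The workaround confirmed above (one mirrored instance, NAT for the rest) can be sanity-checked before launching a second user's distro. The following is a minimal sketch, not part of WSL: it parses the text of each user's .wslconfig, assuming the setting is `networkingMode` under the `[wsl2]` section and that an unset value means NAT. The function names, user names, and sample file contents are hypothetical.

```python
import configparser


def networking_mode(wslconfig_text: str) -> str:
    """Return networkingMode from the [wsl2] section, or "nat" if it is unset."""
    # Interpolation is disabled so '%' in Windows paths is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(wslconfig_text)
    # configparser matches option names case-insensitively.
    return parser.get("wsl2", "networkingMode", fallback="nat").strip().lower()


def mirrored_users(configs: dict[str, str]) -> list[str]:
    """List the users whose .wslconfig text requests mirrored networking."""
    return [user for user, text in configs.items()
            if networking_mode(text) == "mirrored"]


# Hypothetical .wslconfig contents for two Windows users.
configs = {
    "user1": "[wsl2]\nnetworkingMode=mirrored\n",
    "user2": "[wsl2]\nnetworkingMode=nat\n",
}
conflicting = mirrored_users(configs)
if len(conflicting) > 1:
    print("More than one user requests mirrored mode:", conflicting)
```

With the sample contents, only `user1` requests mirrored mode, so nothing is printed; switching `user2` to `mirrored` would flag both users.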

Ezi1h commented 7 months ago

"With a Windows 11 device with at least 2 separate users -- each user with its own WSL2 distro installation -- launch WSL2 on the first user with networkingMode=mirrored enabled in .wslconfig. This should correctly start networking, including IPv6.

With the first user still logged in, switch users and start WSL2 on that separate user and separate WSl2 distro, with the same networkingMode setting in .wslconfig. The following error occurs for the 2nd user: The operation timed out because a response was not received from the virtual machine or container. Error code: Wsl/Service/CreateInstance/CreateVm/HCS_E_CONNECTION_TIMEOUT"

Sorry, there's a limitation with Windows that only 1 WSL instance can be run with mirroring enabled at any one point in time - any other WSL instance will need to have NAT configured.

Sorry about this :(

@keith-horton In my actual test, two instances started. The first instance was used by Docker Desktop, and the results were as expected. The second instance had native Docker installed directly in the distribution, and it seemed to have some network issues. In the second instance, when a Docker container port was exposed, the host machine could not reach the container via host IP:port or localhost:port, and other machines on the local network could not reach it via host IP:port.

Does the limitation you mentioned explain the networking problem in the situation I tested? Looking forward to your reply, thx!

keith-horton commented 7 months ago

Hi there. Can you expand on how you had 2 instances running? They were running at the same time? Were they in the context of 2 different users?

If they are both configured to have mirroring enabled, networking will fail for the 2nd container. However, while networking is broken, any ports claimed by the 2nd container may still be reserved - which means they are reserved for the entire machine - the host + all containers.

Does that help?

Ezi1h commented 7 months ago

Hi there. Can you expand on how you had 2 instances running? They were running at the same time? Were they in the context of 2 different users?

If they are both configured to have mirroring enabled, networking will fail for the 2nd container. However, while networking is broken, any ports claimed by the 2nd container may still be reserved - which means they are reserved for the entire machine - the host + all containers.

Does that help?

I tested with two distributions, Arch and Ubuntu, under the same user. Both launched successfully and booted to the terminal, with Ubuntu selected for WSL integration in the Docker Desktop configuration.

192.168.1.2 is the LAN address of my Windows host. The test steps are as follows:

  1. I started the Arch distribution, installed native Docker, and started an nginx container exposing port 80, which was accessible from another host on the LAN. curl http://192.168.1.2
  2. I then started Docker Desktop and also started an nginx container on Ubuntu exposing port 8080. This container was also accessible from another host on the LAN. curl http://192.168.1.2:8080
  3. I tried the first request again from another host on the LAN and found that it no longer worked. curl http://192.168.1.2
  4. I compared the iptables rules in the two distributions and found them to be identical. I am not sure whether Docker Desktop's WSL mechanism is the same as the native Docker mechanism within WSL. Could this be the cause of the networking issues? I am also curious whether a future version will support multiple WSL instances working normally with mirrored networking enabled.
    iptables -t nat -S
    -P PREROUTING ACCEPT
    -P INPUT ACCEPT
    -P OUTPUT ACCEPT
    -P POSTROUTING ACCEPT
    -N DOCKER
    -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
    -A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
    -A POSTROUTING -s 172.16.0.0/24 ! -o docker0 -j MASQUERADE
    -A POSTROUTING -s 172.16.0.2/32 -d 172.16.0.2/32 -p tcp -m tcp --dport 80 -j MASQUERADE
    -A DOCKER -i docker0 -j RETURN
    -A DOCKER ! -i docker0 -p tcp -m tcp --dport 80 -j DNAT --to-destination 172.16.0.2:80
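
The curl checks in the steps above can be repeated from another host on the LAN with a small script, which makes it easier to see when port 80 stops answering after the second instance starts. This is a minimal sketch using only a plain TCP connect; the helper name `port_open` is hypothetical, and a successful connect only shows that something accepted the connection, not that nginx served a page.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For the setup described, `port_open("192.168.1.2", 80)` and `port_open("192.168.1.2", 8080)` would be run from the other LAN host after each step.
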
Morrigan-Ship commented 1 month ago

It's not about WSL, it's about Hyper-V adapters.

I fixed it: