microsoft / WSL

Issues found on WSL
https://docs.microsoft.com/windows/wsl

Port forwarding repeated failure on WSL 1.1.3 #9751

Open Dune4 opened 1 year ago

Dune4 commented 1 year ago

Windows Version

10.0.22000.1455

WSL Version

1.1.3.0

Are you using WSL 1 or WSL 2?

Kernel Version

5.15.90.1

Distro Version

No response

Other Software

DDEV

Repro Steps

With WSL 1.1.0 (recently pushed to the Store although marked as pre-release), port forwarding fails repeatedly. Start Apache on the Ubuntu distro and, from Command Prompt on Windows, run:

telnet localhost 80

It will work. A few seconds later, repeat the telnet command and it will fail. Port forwarding no longer works for connecting to services running in WSL Ubuntu.

Expected Behavior

The telnet command should keep working on the port.

Actual Behavior

The telnet command times out.

On the first try, the following command shows port 80 as listening:

netstat -an | findstr /c:"80" | findstr /c:"LISTENING"

After a few seconds, the same netstat command no longer lists the port. This applies to any service running on WSL, not just Apache: port forwarding fails a few seconds after the service comes up.

Restart Apache with "service apache2 restart" and the port appears in netstat again. Wait 10 seconds and check again, and it has disappeared.
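The manual telnet/netstat check described in this report can be automated with a small probe. The following is a minimal sketch, not part of the original report: the function names `port_is_open` and `watch_port` are hypothetical, and it assumes Python is available on the Windows side.

```python
import socket
import time


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def watch_port(host: str, port: int, attempts: int = 10, interval: float = 2.0):
    """Probe host:port repeatedly; return a list of (elapsed_seconds, is_open)."""
    results = []
    start = time.monotonic()
    for attempt in range(attempts):
        elapsed = round(time.monotonic() - start, 1)
        results.append((elapsed, port_is_open(host, port)))
        if attempt < attempts - 1:
            time.sleep(interval)
    return results
```

Calling `watch_port("localhost", 80)` from Windows right after restarting Apache should, if the behavior described here is present, return entries that start as open and later turn closed.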

Diagnostic Logs

No response

ghost commented 1 year ago

/logs please. I cannot repro this. You mention two different versions in the bug, 1.1.3 and 1.1.0; 1.1.0 is already known to be problematic.

microsoft-github-policy-service[bot] commented 1 year ago

Hello! Could you please provide more logs to help us better diagnose your issue?

To collect WSL logs, download and execute collect-wsl-logs.ps1 in an administrative PowerShell prompt:

Invoke-WebRequest -UseBasicParsing "https://raw.githubusercontent.com/microsoft/WSL/master/diagnostics/collect-wsl-logs.ps1" -OutFile collect-wsl-logs.ps1
Set-ExecutionPolicy Bypass -Scope Process -Force
.\collect-wsl-logs.ps1

The script will output the path of the log file once done.

Once completed, please upload the output files to this GitHub issue.

Click here for more info on logging

Thank you!

PaulBurridge commented 1 year ago

I have an open bug with VS Code; could this be related? https://github.com/microsoft/vscode-remote-release/issues/8154#issuecomment-1459889181

rimifon commented 1 year ago

Recently, after upgrading the kernel, I encountered this problem when using WSL2: When running docker images in WSL2 and starting Apache inside Docker, it prompts that port 80 is already in use, but using netstat -pant shows that port 80 is not in use. After stopping IIS with iisreset /stop and entering Docker again, Apache can be successfully started. Moreover, there is no port occupation problem when starting IIS again.

PaulBurridge commented 1 year ago

Seems to me that WSL is trying to forward a port inside a Docker container (Docker service in the WSL host) to Windows even if you have not exposed the port to the WSL host itself. This obviously fails, as the forward cannot actually work.

In theory it should only forward ports exposed to the WSL host.

PaulBurridge commented 1 year ago

Evidence below of WSL attempting to expose a port that is only open within the container

Docker service is installed inside WSL container (tested with Ubuntu and Debian)

Note no port exposed in docker command

Steps to reproduce:

>wsl --shutdown

>netstat -ano | find "LISTENING" | find ":80"

>wsl -e echo "WSL started"
WSL started

>netstat -ano | find "LISTENING" | find ":80"

>wsl -e docker run -d nginx
ea266215b1c431023655824e8a980fe2161e3c55700a03466f924c4ba11e82f2

>netstat -ano | find "LISTENING" | find ":80"
  TCP    127.0.0.1:80           0.0.0.0:0              LISTENING       19820
  TCP    [::1]:80               [::]:0                 LISTENING       19820

>tasklist /fi "PID eq 19820"

Image Name                     PID Session Name        Session#    Mem Usage
========================= ======== ================ =========== ============
wslhost.exe                  19820 Console                    3      7,500 K

>curl http://localhost:80
curl: (7) Failed to connect to localhost port 80 after 2239 ms: Connection refused

>wsl -e curl http://localhost:80
curl: (7) Failed to connect to localhost port 80: Connection refused

# A short while later the port disappears

>netstat -ano | find "LISTENING" | find ":80 "

>wsl --version
WSL version: 1.1.3.0
Kernel version: 5.15.90.1
WSLg version: 1.0.49
MSRDC version: 1.2.3770
Direct3D version: 1.608.2-61064218
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22000.1574

zcobol commented 1 year ago

@PaulBurridge-Nasstar WSL defaults to localhostForwarding=true. If you want to prevent 127.0.0.1:80 from being shared with Windows set localhostForwarding=false in .wslconfig

PaulBurridge commented 1 year ago

> @PaulBurridge-Nasstar WSL defaults to localhostForwarding=true. If you want to prevent 127.0.0.1:80 from being shared with Windows set localhostForwarding=false in .wslconfig

Thanks, that is helpful as a workaround for my use case. (Confirmed working; see https://learn.microsoft.com/en-us/windows/wsl/wsl-config for how to configure it. The setting needs to be in a [wsl2] block.)

However, this should still be fixed: WSL should not be publishing the internal ports of Docker containers, which will never work and will cause other issues such as breaking VS Code dev containers (see https://github.com/microsoft/vscode-remote-release/issues/8154#issuecomment-1459889181).

Also, it is likely that some users need the forwarding on for WSL services but also intend to use Docker containers etc.

This issue is new to the latest release (since I recently updated) and did not occur before.

ben-childs-docusign commented 1 year ago

We also repro this issue on Windows Server 2022.

A simple repro is to run az login, which fails with 1.1.3 because port forwarding does not work; it succeeds when we roll back to 1.0.3.

minikube is also broken, due to an inability to connect to SSH ports that are forwarded by Docker Desktop: they are accessible from the host (Windows) but not from the WSL VM.

There was no repro on Windows 11

ghost commented 1 year ago

@PaulBurridge-Nasstar, please open a new bug and post logs. It sounds like you're describing a different issue, just not sure what.

And when you do, can you give me the output of ss -lntp from WSL, from both before and after the port disappears? Logs would also help me here. I'm surprised to hear the behavior you're describing, because that's consistent with the behavior of 1.1.2, but not something I was able to reproduce with 1.1.3.

The way the relay is intended to work (IPv4 example): the guest listens on 0.0.0.0:PORT or 127.0.0.1:PORT. An agent on the host listens on 127.0.0.1:PORT, accepts connections, and forwards data to the service within the guest.
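To make the intended relay behavior concrete, here is a minimal sketch of a localhost TCP relay. It is an illustration under stated assumptions, not WSL's actual implementation: `start_relay` and `_pipe` are hypothetical names, and the guest is modelled as any reachable host and port.

```python
import asyncio


async def _pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def start_relay(listen_port: int, guest_host: str, guest_port: int) -> asyncio.AbstractServer:
    """Listen on 127.0.0.1:listen_port and forward each connection to the guest."""

    async def handle(client_reader, client_writer):
        try:
            guest_reader, guest_writer = await asyncio.open_connection(guest_host, guest_port)
        except OSError:
            # Nothing reachable on the guest side: drop the client connection.
            client_writer.close()
            return
        # Forward both directions until either side closes.
        await asyncio.gather(
            _pipe(client_reader, guest_writer),
            _pipe(guest_reader, client_writer),
            return_exceptions=True,
        )

    return await asyncio.start_server(handle, "127.0.0.1", listen_port)
```

In this sketch the host-side listener exists independently of the guest service, so a listener can be present on the host while connections through it still fail if nothing is reachable behind it.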

ghost commented 1 year ago

@ben-childs-docusign, I created a server sku vm. Installed 1.1.3 with ubuntu. I've never used az login before, but it seemed to work. I used az account list and it listed the groups I'm a part of.

Without /logs or repro I can't do much.

Dune4 commented 1 year ago

Hi,

Logs are attached here. Two of my colleagues are also having the same issue (personal devices, so no corporate image etc.).

WslLogs-2023-03-07_12-41-51.zip

Downgrading to 1.0.3 resolves the issue.

When replicating the issue here I was using Docker with the WSL2 backend enabled. I can confirm with netstat that the port is not visible until the Docker container starts. The issue is very reproducible.

Docker Desktop 4.17.0 (99724)

ghost commented 1 year ago

@Dune4, my point is, I don't have a repro. I saw no mention of Docker in your original post, so I created an Ubuntu distro, enabled systemd, and installed apache2 on it. It seemed to work fine.

I'm happy to look into this! But you need to tell me what you're trying to do :( Do I doubt you're experiencing an issue? Nope. But, you've got to work with me and help me out here.

Can you show me what's exposed on the guest with ss -lntp that's not exposed on the host netstat -an | findstr /c:"PORT-NUMBER-HERE" | findstr /c:"LISTENING" ?

Thank you

ghost commented 1 year ago

These are the ports I see being bound by the relay in the logs:

image

Dune4 commented 1 year ago

Hi @pmartincic

Apologies, I had already spent ages messing around trying to fix this, deleting containers and whatnot, before I opened this issue. So this is more detail that, in all honesty, should have been in my initial post.

One thing I do want to point out is that restarting Windows seems to solve this intermittently: if it's broken and I restart Windows, it may or may not come back okay.

I am using a few different containers to expose some ports for web development. The error I often see when this is not working is:

Error response from daemon: Ports are not available: exposing port TCP 127.0.0.1:32792 -> 0.0.0.0:0: listen tcp 127.0.0.1:32792: bind: Only one usage of each socket address (protocol/network address/port) is normally permitted.

Now, the key thing to note is that the port number in the error message changes. It's the same error message, but sometimes it complains about 80 or 443 instead of 32792 as above. It almost feels like some sort of race condition.
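The "Only one usage of each socket address" text is the Windows address-in-use error, which is raised when something else already holds the address and port. A quick way to check whether a given port is currently taken is to try binding it. This is a minimal sketch; `can_bind` is a hypothetical helper, not something from DDEV or Docker.

```python
import socket


def can_bind(host: str, port: int) -> bool:
    """Return True if a TCP socket can currently bind host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Address already in use, or binding is not permitted.
            return False
        return True
```

Running `can_bind("127.0.0.1", 443)` just before starting the containers would show whether another process is holding the port at that moment. A False result can also mean the port requires elevated privileges, so treat it as a hint rather than proof.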

For my development environment I am using DDEV (https://ddev.com/), which just saves time provisioning the correct containers for me depending on the type of web project I plan to use.

When testing today I hit a working scenario, and I have attached logs for that as well. Are you able to compare the previous logs to these logs to see if anything sticks out?

WslLogs-2023-03-08_21-00-46.zip

ghost commented 1 year ago

Can you show me what's exposed on the guest(wsl) with ss -lntp that's not exposed on the host(windows) netstat -an | findstr /c:"PORT-NUMBER-HERE" | findstr /c:"LISTENING" ? As a sanity check.
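The comparison requested here can be scripted once both outputs are captured as text. The sketch below is an illustration only: the function names are hypothetical, and the parsing assumes the usual column layout of `ss -lntp` and of Windows `netstat -an` (or `-ano`).

```python
def listening_ports_from_ss(ss_output: str) -> set:
    """Extract TCP listening ports from `ss -lntp` output."""
    ports = set()
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == "LISTEN":
            # "Local Address:Port" is the 4th column; the port follows the last colon.
            port = fields[3].rsplit(":", 1)[-1]
            if port.isdigit():
                ports.add(int(port))
    return ports


def listening_ports_from_netstat(netstat_output: str) -> set:
    """Extract TCP listening ports from Windows `netstat -an` output."""
    ports = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == "TCP" and fields[3] == "LISTENING":
            port = fields[1].rsplit(":", 1)[-1]
            if port.isdigit():
                ports.add(int(port))
    return ports


def missing_on_host(ss_output: str, netstat_output: str) -> list:
    """Ports listening in the guest that are not listening on the host."""
    guest = listening_ports_from_ss(ss_output)
    host = listening_ports_from_netstat(netstat_output)
    return sorted(guest - host)
```

Not every guest listener is expected to appear on the host (for example, one bound to an address other than 0.0.0.0 or 127.0.0.1), so the result is a starting point for investigation, not a list of confirmed failures.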

Dune4 commented 1 year ago

Error response from daemon: Ports are not available: exposing port TCP 127.0.0.1:443 -> 0.0.0.0:0: listen tcp 127.0.0.1:443: bind: Only one usage of each socket address (protocol/network address/port) is normally permitted.

ss -lntp
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port  Process
LISTEN  0       4096    127.0.0.53%lo:53    0.0.0.0:*
LISTEN  0       128     0.0.0.0:22          0.0.0.0:*
LISTEN  0       128     [::]:22             [::]:*
LISTEN  0       4096    *:61753             *:*

netstat -an | findstr /c:"443" | findstr /c:"LISTENING"

(No output on netstat)

The port does show up temporarily on netstat for a few seconds, then disappears.

ghost commented 1 year ago

I'm sorry, I've gotten routed elsewhere and won't be able to look into this for the time being.

cmendible commented 1 year ago

az login redirect fails with 1.1.3, 1.1.5, 1.1.6 & 1.1.7

Had to roll back to 1.0.3

Edition: Windows 11 Pro
Version: 22H2
Installed on: 3/10/2023
OS build: 22624.1391
Serial number: 003038703353
Experience: Windows Feature Experience Pack 1000.22639.1000.0

ahmed2m commented 1 year ago

If anyone is looking for how to downgrade: download the msixbundle from the 1.0.3 release and run the following in an admin PowerShell prompt, in the folder where the file was downloaded:

$Package = Get-AppxPackage MicrosoftCorporationII.WindowsSubsystemforLinux
Remove-AppxPackage $Package
Add-AppxPackage .\Microsoft.WSL_1.0.3.0_x64_ARM64.msixbundle

To prevent Windows from updating it on its own, disable "Receive updates on other microsoft products" in the Windows Update settings.

stetime commented 1 year ago

still failing under 1.1.6.0

PaulBurridge commented 1 year ago

> @PaulBurridge-Nasstar WSL defaults to localhostForwarding=true. If you want to prevent 127.0.0.1:80 from being shared with Windows set localhostForwarding=false in .wslconfig

Has everyone who is still having issues tried this? It prevents WSL from auto-forwarding WSL ports, which can clash with VS Code's own port forwarding due to a bug (https://github.com/microsoft/WSL/issues/9763).

Note the setting needs to be inside the [wsl2] block in the config file.
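For reference, the workaround amounts to a two-line fragment in the `.wslconfig` file in the Windows user profile folder. The sketch below embeds that fragment and checks it with Python's `configparser`, which only approximates how WSL reads the file. The helper name `localhost_forwarding_enabled` is hypothetical, and the default of true follows what is stated earlier in this thread.

```python
import configparser

# The fragment suggested in this thread; the key must sit under [wsl2].
EXAMPLE_WSLCONFIG = """\
[wsl2]
localhostForwarding=false
"""


def localhost_forwarding_enabled(config_text: str) -> bool:
    """Return the effective localhostForwarding value for a .wslconfig text."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Assumption from the thread: forwarding defaults to true when the key is absent.
    return parser.getboolean("wsl2", "localhostForwarding", fallback=True)
```

`localhost_forwarding_enabled(EXAMPLE_WSLCONFIG)` reports the setting as disabled, while the same key under any other section is ignored in this sketch and the default applies. WSL needs to be restarted (for example with `wsl --shutdown`) before a changed `.wslconfig` takes effect.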

olivierchatry commented 1 year ago

This helps, at least minikube is starting now. The only issue is that all "localhost" requests are then not resolved. This really needs a fix, as it basically makes minikube unusable :(

PaulBurridge commented 1 year ago

> This helps, at least minikube is starting now. The only issue is that all "localhost" requests are then not resolved. This really needs a fix, as it basically makes minikube unusable :(

If you attach VS Code to the minikube instance, VS Code's forwarding may work instead of WSL's? Not elegant, but it may work.

olivierchatry commented 1 year ago

Maybe, but I reverted to 1.0.3 :/ Hopefully it will get fixed soon!

stetime commented 1 year ago

I rolled back to 1.0.3 but this persists for me

ahmed2m commented 1 year ago

I was on 1.0.3 for weeks and not having this issue. And now it's back. This is so frustrating! Does Microsoft even know that WSL is mainly used for development?! This has been a nightmare and productivity sink, seriously considering going back to the real thing.

ben-childs-docusign commented 1 year ago

We have observed that this is fixed in the latest version of WSL, 1.2.5.