microsoft / WSL


WSL2 localhost access is intermittent with stuck connections #4340

Closed benc-uk closed 4 years ago

benc-uk commented 5 years ago

Opening a URL served by a Node.js app via localhost from a Windows browser results in the page never loading. It spins indefinitely rather than returning an error that the site or page cannot be found.

Hitting stop and then refresh in the browser results in the page loading OK. The page will continue to load OK if you send requests to it quickly; if you wait a few seconds, the problem returns and the site never loads.

I have verified the following:

This seems to be a TCP socket issue with the way WSL 2 handles the new localhost bridging out to Windows

It is trivial to reproduce. Install Node.js and run the following simple server

const http = require('http')

// Log each request and reply with a small HTML body
const requestHandler = (request, response) => {
  console.log(`### Request for ${request.url}`)
  response.end('<h1>Hello Node.js Server!</h1>')
}

const server = http.createServer(requestHandler)

// Listen failures (e.g. port already in use) arrive as an 'error' event,
// not as an argument to the listen callback
server.on('error', (err) => console.log('### Something bad happened', err))

server.listen(3000, '0.0.0.0', () => {
  console.log('### Server is listening on 3000')
})
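For anyone who wants to check the timing pattern described above without a browser, here is a minimal Python sketch (not part of the original report). It requests a URL after a series of idle gaps and records whether each request completed. The function name `probe`, the gap values, and the 5 second timeout are all assumptions chosen for illustration.

```python
import time
import urllib.request


def probe(url, gaps=(0.5, 1, 3, 6, 10), timeout=5.0):
    """Request `url` after each idle gap and report whether it completed.

    Returns a list of (gap_seconds, outcome) tuples, where outcome is "ok"
    or the name of the exception that was raised (for example a timeout).
    """
    results = []
    for gap in gaps:
        time.sleep(gap)  # idle period before the next request
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                # Read the whole body so a stalled response surfaces as a timeout
                resp.read()
            outcome = "ok"
        except Exception as exc:  # timeouts, refused connections, resets
            outcome = type(exc).__name__
        results.append((gap, outcome))
    return results
```

Calling `probe("http://localhost:3000/")` from Windows while the server above runs in WSL 2 should, if the behaviour matches the description, show "ok" after the short gaps and a timeout after the longer ones. Running the same call against the WSL IP address gives a baseline to compare with.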

thomasaull commented 5 years ago

I learned that you can never actually go back in versions, only forward… so yes :) Doing a clean install is an exception, of course.

benc-uk commented 5 years ago

I've checked localhost access on build 18980, and I'm not seeing any issues, still seems fixed to me 😃

benhillis commented 5 years ago

If you are still experiencing this on 18980 or later, please let us know some repro steps.

tuananh commented 5 years ago

@benhillis wrote: "If you are still experiencing this on 18980 or later, please let us know some repro steps."

Weird. I checked again today and it's fine now, with no code changes whatsoever in my project.

thomasaull commented 5 years ago

I checked again with a freshly installed Ubuntu 18.04 distribution on WSL2, Windows version 18980. To reproduce:

Create a folder with an index.php with the following contents:

<?php echo "hello world"; ?>

Install PHP with the following commands:

sudo apt update
sudo apt install php7.2-cli

Use the PHP dev server to serve index.php (from inside the folder you just created):

php -S 127.0.0.1:8000

Open a browser on localhost:8000. The site loads, but the request never actually completes (the loading indicator keeps spinning forever). Interestingly, if you open http://localhost:8000/not-there.php, the PHP error page shows up and the page loads normally.

Ping @benhillis

thomasaull commented 5 years ago

Updated to 18985, and the issue is still there for me…

stijnherreman commented 5 years ago

I'm also seeing this issue on build 18985.1

Testing with php -S 127.0.0.1:8000 and <?php phpinfo(); ?>, the entire page loads in my browser but the request never completes. Running curl 127.0.0.1:8000/info.php in wsl seems to work fine.

No issue when running with eth0, e.g. php -S 172.30.112.116:8000.

stijnherreman commented 5 years ago

@benhillis if you want, I can send Wireshark captures by email.

jaykilleen commented 5 years ago

On build 18985.1, I've been wrestling with Rails becoming unresponsive via localhost. Found this thread. Glad it's not just me! I have a Windows update icon flashing at me, so I'll try the next build and report back.

jaykilleen commented 5 years ago

18990.1 is still broken for Rails development :( I will double check to make sure it's not something I'm doing wrong. I'm also finding that webpacker is unable to create a manifest.json on a fresh install. I have tried running Rails in WSL2 as normal, using rails server -b 0.0.0.0 -p 3000. This loads the first time when opening Chrome and navigating to localhost:3000, but any changes I make to the homepage do not show up on page reload. I still see the server request/response cycle, but it serves the original page and not the updated one.

Note: I am also using the new Visual Studio Code Remote-WSL to edit the files.

I have since done a fresh install of Ubuntu 18.04 (was 16.04) in WSL2 and installed Rails from scratch (but used my existing repo). Same problem. I might try with a fresh project and share it on GitHub as a repro of this issue.

UPDATE: I have figured out that my issue was that my files were on an external hard drive mounted in Windows instead of directly in the WSL2 file system. I pushed my repo to GitHub and cloned it back down into the WSL2 ~/projects/appname directory (using the new Terminal program 😄). I spun my server back up and holy gees it is fast! localhost:3000 from the browser works, and I ended up getting Guard livereload up and running; it is soooo nice compared to WSL1. Good luck to the rest of the people in this thread. Apologies if my issue was not entirely related to yours...

FlipperPA commented 5 years ago

I was having this issue at work with Django and localhost with runserver, but I am not having it at home. At home I'm on 18999.1. I'm going to check what version I'm on at work on Monday, but hopefully it is an early version and an upgrade fixes it.

Just in case, here's a minimal solid repro for a Django starter site:

sudo apt -y install python3-pip python3-venv git
python3 -m venv django_venv
. django_venv/bin/activate
pip install Django
django-admin startproject django_project
cd django_project
python manage.py runserver 0:8000

Then bring up http://localhost:8000/ in your browser and refresh a bunch of times over a few minutes.

vasekboch commented 5 years ago

This is still an issue on build 18999, and it causes various problems. For example, JetBrains PHPStorm has trouble receiving data from Docker.

The issue seems to be related to how WSL2 (wslhost.exe) handles closed connections. It seems that when the connection is closed in Linux, the corresponding wslhost.exe connection gets stuck in the Established state.

The connections in Ubuntu (there is no active one): [image]

The established connection with Docker in Windows: [image]

Now the application is still waiting for data, but the data never arrives, because the connection should have been closed.

When I manually close the connection via TCPView, the application finishes as it should have in the first place.
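As a rough alternative to eyeballing TCPView, lingering connections could also be listed from the text output of `netstat -ano` on Windows. The sketch below is only an illustration: the function name `established_on_port` is hypothetical, the sample rows are made up rather than taken from this thread, and it assumes the usual five-column TCP row layout (Proto, Local Address, Foreign Address, State, PID).

```python
# Made-up sample in the usual `netstat -ano` layout (not captured from this thread)
SAMPLE = """
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:8080           0.0.0.0:0              LISTENING       4321
  TCP    127.0.0.1:8080         127.0.0.1:51234        ESTABLISHED     4321
  TCP    127.0.0.1:51234        127.0.0.1:8080         ESTABLISHED     9876
  TCP    [::1]:445              [::1]:50000            ESTABLISHED     4
"""


def established_on_port(netstat_text, port):
    """Return (local, remote, pid) for ESTABLISHED TCP rows touching `port`."""
    matches = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers, UDP rows, and TCP rows in any other state
        if len(fields) != 5 or fields[0] != "TCP" or fields[3] != "ESTABLISHED":
            continue
        local, remote, pid = fields[1], fields[2], fields[4]
        # The port is whatever follows the last colon (also works for [::1]:445)
        ports = {local.rsplit(":", 1)[-1], remote.rsplit(":", 1)[-1]}
        if str(port) in ports:
            matches.append((local, remote, pid))
    return matches
```

On a real machine the text would come from something like `subprocess.run(["netstat", "-ano"], capture_output=True, text=True).stdout`. If rows for the forwarded port remain after the Linux side shows no active connection, that would be consistent with the stuck state described above.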

craigloewen-msft commented 5 years ago

@FlipperPA I tried to repro your scenario by following your steps. I created the site and was able to connect to it. I tried refreshing about 50 times and each connection was successful. Could you try the update and see if that resolves your issue?

vasekboch commented 5 years ago

@craigloewen-msft I was somehow able to reproduce it with python -m SimpleHTTPServer 8000. I tried two browsers: old Edge and the new Chromium Edge.

craigloewen-msft commented 5 years ago

@vasekboch I was able to repro it with that workflow, thanks for sharing it! We'll dive into why this is happening.

vasekboch commented 5 years ago

@craigloewen-msft Cool, happy to help. I think it's related to how wslhost.exe handles the connection. It seems that the connection stays in the Established state even after the connection in WSL is already closed. When you make a second request, the first established connection is closed, and that causes the spinner to stop spinning. Or something similar to that. Apps that depend on the connection being closed are broken by this, e.g. the spinning spinner or my issue with PHPStorm that I mentioned earlier.

benc-uk commented 5 years ago

The issue has definitely returned on 18999. It's the same flip-flop between the request working and failing, and if you make your requests quickly enough (less than 5 seconds apart), the requests will work.

The pattern is identical to what it was before.

benhillis commented 5 years ago

@benc-uk - I'm having a very hard time debugging this internally; I just cannot seem to get a repro. "Refresh your web browser" is a difficult repro with a lot of variables. Is there a targeted test that could be written so I can try to debug this further?

vasekboch commented 5 years ago

@benc-uk I researched it a little more today, and I still don't have a better way to reproduce it. I now think that what I described earlier is not correct: the Python server is single-threaded and Chromium keeps the connection open, which is why the other browser freezes. Sorry for that. The steps to reproduce are also missing one key component, namely that it works on the direct, non-localhost IPv4 address and freezes on localhost, and I couldn't reproduce that. I tried switching Python for http-server (https://www.npmjs.com/package/http-server) and the issue is not present. So I'm at the beginning of the debugging process now.

I know that the bug still persists, because PHPStorm's remote Docker executor is broken, and when I change the host from localhost to the direct IP it works as expected. So something is happening, but I don't know exactly what. I think it worked on 18970 or 18975, but I'm not sure which one. I will let you know if I find anything more.

vasekboch commented 5 years ago

@benhillis @craigloewen-msft Hey. I finally managed to replicate the bug in a more controlled way (i.e. without a browser). Actually, it happened by a "happy" accident, but it seems consistent. I have installed Docker on Ubuntu from the official repository.

1) In WSL2, run docker run --rm -it -p8080:8080 adminer:standalone
2) In Windows, run curl 172.27.188.175:8080 (use the IP from the WSL2 eth0 adapter)
3) In Windows, run curl localhost:8080

The behaviour on my PC is as follows: 2) finishes without a problem; 3) hangs at the end. It seems that the connection is not closed, but the data is returned correctly.
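The difference between "data returned" and "connection closed" in the steps above can be made explicit with a small raw-socket client. This is a minimal Python sketch under stated assumptions, not something posted in the thread: the function name `fetch_until_close` and the 5 second timeout are hypothetical, and it sends an HTTP/1.0 request with `Connection: close` so that a well-behaved server is expected to close the socket after the response.

```python
import socket
import time


def fetch_until_close(host, port, path="/", timeout=5.0):
    """Send one HTTP/1.0 GET and read until the server closes the connection.

    Returns (bytes_received, closed_cleanly, seconds). closed_cleanly is False
    when data stopped arriving but no EOF was seen before the timeout.
    """
    start = time.monotonic()
    received = 0
    closed = False
    with socket.create_connection((host, port), timeout=timeout) as sock:
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        try:
            while True:
                chunk = sock.recv(4096)
                if not chunk:  # empty read means the peer closed the connection
                    closed = True
                    break
                received += len(chunk)
        except socket.timeout:
            pass  # data may have arrived, but the close never did
    return received, closed, time.monotonic() - start
```

Comparing `fetch_until_close("172.27.188.175", 8080)` with `fetch_until_close("localhost", 8080)` from Windows would mirror steps 2) and 3). A result with bytes received but `closed_cleanly` equal to False would match the hang described here.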

Hoping that this will finally help to nail this bug.

benhillis commented 5 years ago

@vasekboch - Thanks, I will give this a try today.

benhillis commented 5 years ago

@vasekboch - I have this running in a loop and I have yet to see a hang.

vasekboch commented 5 years ago

@benhillis For me it hangs on the first request. See https://1drv.ms/v/s!An2NRObJbs8nvYVtoVWv-wwyuh0FFA?e=kZco7j

benhillis commented 5 years ago

@vasekboch - thanks for the video, that's exactly what I'm doing locally and it is not hanging for me. I did make a couple changes in this path recently so maybe I fixed the issue.

I will try on 19002 and see if I can get a repro there.

vasekboch commented 5 years ago

That seems promising. I've tested a clean install of Fedora and the curl still hangs. My colleague has the same issue, so it's not related only to my machine.

benhillis commented 5 years ago

@vasekboch - I have some great news. I installed 19002 and was able to repro the issue, I then applied my fix to the build, and the issue goes away. The fix should be in an insider build in a week or two.

benhillis commented 5 years ago

@vasekboch - Thank you very much for the targeted repro, that made things much easier.

vasekboch commented 5 years ago

@benhillis No problem, happy to help. I'm glad that this one is fixed. It was a pain to fill in the new IP address every time.

Thank you for the fix. I'll check it as soon as it goes live. :)

benhillis commented 4 years ago

19018 should have the final fixes for this.

FlipperPA commented 4 years ago

@benhillis I've just upgraded to 19023 and this seems to be happening again. I run Django's runserver on port 8000 on Ubuntu 18. I had the intermittent problem, then it worked fine for a while, now it isn't working at all. I've tried several Windows reboots / restarts of Ubuntu.

If I go directly to the IP of the guest, 172.20.15.250, it comes up.

A picture is worth a thousand words:

image

benhillis commented 4 years ago

@FlipperPA - Could you give me a set of command line repro steps I could try? Also would you mind taking a trace?

https://github.com/microsoft/WSL/blob/master/CONTRIBUTING.md#8-detailed-logs

stefankummer commented 4 years ago

Well, something weird for me on 19018.1. On every first start of Windows (after the computer was turned off), I am unable to access the HTTP port (Apache server started) via localhost, only via the IP of WSL2. After restarting the computer, I can access the localhost web server without restriction.

FlipperPA commented 4 years ago

@benhillis @stefankummer I'll see if I can come up with a solid repro, but it is now working... which is especially weird, because I had tried restarting Ubuntu 18 and restarting all of Windows before posting above. I'll report back if I can get it to happen again!

laem commented 4 years ago

Serving files (express or http.server, doesn't matter) on 19030 is extremely slow for me on localhost (you can see images loading like with old 56k modems). When I use the eth0 IP, it's super fast.

benc-uk commented 4 years ago

Just updated to 19536 and this seems broken again: localhost is simply not working, but using the WSL IP I get a connection.

I thought we had this one nailed! :(

benc-uk commented 4 years ago

I think my latest issue is actually this https://github.com/microsoft/WSL/issues/4769

If I "reboot" WSL, then the first connection/service binding works; all other services after the first don't work.

benhillis commented 4 years ago

@benc-uk - fix is inbound.

GameBurrow commented 4 years ago

Seems like I have a similar issue with Apache in Docker (for WSL 2) on build 19041.21 (slow insider ring). When I had 127.0.0.1 in the Windows hosts file, only the first response from the server got through (the index file, nothing else; JS and CSS did not).

When I replaced that with the WSL IP, everything started working again.
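Since the WSL IP workaround comes up repeatedly in this thread and the address can change between restarts, looking it up could be scripted. The sketch below is an assumption-laden illustration: `first_ipv4` is a hypothetical helper, and it assumes the address list comes from running `hostname -I` inside the distro (for example via `wsl hostname -I` from Windows), which prints space-separated addresses.

```python
import ipaddress


def first_ipv4(hostname_output):
    """Return the first IPv4 address in whitespace-separated text, or None."""
    for token in hostname_output.split():
        try:
            addr = ipaddress.ip_address(token)
        except ValueError:
            continue  # not an IP address at all
        if addr.version == 4:
            return str(addr)
    return None
```

On Windows the input could be obtained with `subprocess.run(["wsl", "hostname", "-I"], capture_output=True, text=True).stdout`. Which address is listed first depends on the distro's interfaces, so the result is worth checking against eth0 before relying on it.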

khuongduybui commented 4 years ago

Port forwarding on build 19541.1000 (slow insider channel) is also broken on first boot. When I issue wsl --shutdown and start the distro again, it works.

benc-uk commented 4 years ago

Closing for now as #4769 has been closed

Himakar-PV commented 4 years ago

Hi, I downloaded Microsoft Windows [Version 10.0.19592.1001] through insider (Fast)

It has a vhost issue, as explained below:

I have two projects

Laravel (var/www/html/laravel/..)
Pimcore (var/www/html/pimcore/..)

I have two configs under /etc/apache2/sites-available:

laravel.dev.config
pimcore.dev.config

NOTE: Both are not enabled yet.

In the hosts file (/etc/hosts) I added the lines below:

127.0.0.1 laravel.dev
127.0.0.1 pimcore.dev
::1 pimcore.dev
::1 laravel.dev

Now in browser when I type "localhost", it loads default Apache page.

Scenario 1: Run the commands below:

sudo a2ensite laravel.dev.config
sudo service apache2 restart

Type localhost in the browser: it loads the Laravel project.
Type laravel.dev in the browser: it loads nothing.

Scenario 2: Run the commands below:

sudo a2dissite laravel.dev.config
sudo a2ensite pimcore.dev.config
sudo service apache2 restart

Type localhost in the browser: it loads the Pimcore project.
Type pimcore.dev in the browser: it loads nothing.

Scenario 3: Run the commands below:

sudo a2ensite pimcore.dev.config
sudo a2ensite laravel.dev.config
sudo service apache2 restart

Type localhost in the browser: it loads the Laravel project.
Type laravel.dev in the browser: it loads nothing.
Type pimcore.dev in the browser: it loads nothing.

I wonder how localhost is loading one of the projects by default!? In scenario 3, if I enable pimcore.dev.config after laravel.dev.config, then localhost still loads the Laravel project instead of Pimcore by default!

I am ready to screen share if anyone is ready to help. Thank you.

firrae commented 3 years ago

I am also having this issue still. I tried stopping and restarting WSL like @khuongduybui mentioned and now it works with Rails. This always seems to happen after I shut down the computer and come back to it.

GiancaDIFI commented 3 years ago

I've been having this issue since I installed docker-desktop

vasekboch commented 3 years ago

@Himakar-PV It's not related to this issue. You need to set up your hosts in Windows, in c:\Windows\System32\drivers\etc\hosts. Then it will work. AFAIK Apache selects one active project as the default, which is why it loads something.

fsoikin commented 2 years ago

I see this issue too. Started happening since the last update to 22538. Rolling back to 22533 seems to have fixed it for now.

fmiqbal commented 2 years ago

@fsoikin wrote: "I see this issue too. Started happening since the last update to 22538. Rolling back to 22533 seems to have fixed it for now."

Same here on the same version, happening with the vanilla PHP (73, 74, 80) built-in web server, bound to 127.0.0.1 or 0.0.0.0 on port 8000 (or any other port).

fsoikin commented 2 years ago

The issue seems to have gone away again after upgrading to 22543

Edit: nope, still happening

clovis1122 commented 2 years ago

Happening to me as well.

C:\Users\clovi> wsl --version
WSL version: 0.51.2.0
Kernel version: 5.10.81.1
WSLg version: 1.0.30
Windows version: 10.0.22543.1000

fsoikin commented 2 years ago

@clovis1122 I have determined this is a bug in the latest version of WSL, the one available via the Microsoft Store.

But the good news is that an older version is available (the one bundled with Windows), and it does not have this issue.

To install the bundled (aka "in-box") version, you need to (1) explicitly uninstall the Microsoft Store version, and then (2) run wsl --install --inbox.

You'll know you got the right version if it doesn't have a --version command:

>  wsl --version
Invalid command line option: --version
Copyright (c) Microsoft Corporation. All rights reserved.

Usage: wsl.exe [Argument] [Options...] [CommandLine]
...

illepic commented 1 year ago

Just noting that this is still an issue.

PS C:\WINDOWS\system32> wsl --version
WSL version: 1.1.0.0
Kernel version: 5.15.83.1
WSLg version: 1.0.48
MSRDC version: 1.2.3770
Direct3D version: 1.608.2-61064218
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22621.1194