microsoft / WSL

Issues found on WSL
https://docs.microsoft.com/windows/wsl
MIT License

Launching executables from WSL that touch filesystem makes new process on windows host very slow for many minutes, then gets back to normal or a reboot is necessary #11383

Open giovannicandido opened 7 months ago

giovannicandido commented 7 months ago

Windows Version

Microsoft Windows [Version 10.0.22631.3155]

WSL Version

2.1.5.0

Are you using WSL 1 or WSL 2?

Kernel Version

5.15.146.1-2

Distro Version

No response

Other Software

No response

Repro Steps

I am not sure what the underlying issue is, but after extensive testing, executables that do not interact with the Windows filesystem work as expected. There are a couple of ways to reproduce it; here is one:

Install psql.exe on Windows (for example with scoop install postgresql). Then launch a database on the Windows host; the easiest way is to run it in Docker:

docker run --rm -it -e POSTGRES_PASSWORD=1234 -p 5432:5432 postgres:16

Now open a terminal in WSL. Any distro is affected; I tested both the built-in WSL and the newer one from the Microsoft Store.

Connect to the database from the Linux WSL instance:

psql.exe -U postgres -h 127.0.0.1

The password is 1234.

Everything works fine up to this point. The next step will probably crash your system:

In the psql shell that you opened, type the following:

\l

This lists the databases. The output looks something like this:

psql (16.2)
Type "help" for help.

postgres=# \l
'\\wsl.localhost\Fedora\home\giova'
CMD.EXE was started with the above path as the current directory.
UNC paths are not supported. Defaulting to Windows directory.
                                                      List of databases
   Name    |  Owner   | Encoding | Locale Provider |  Collate   |   Ctype    | ICU Locale | ICU Rules |   Access privileges
-----------+----------+----------+-----------------+------------+------------+------------+-----------+-----------------------
 postgres  | postgres | UTF8     | libc            | en_US.utf8 | en_US.utf8 |            |           |
 template0 | postgres | UTF8     | libc            | en_US.utf8 | en_US.utf8 |            |           | =c/postgres          +
           |          |          |                 |            |            |            |           | postgres=CTc/postgres
 template1 | postgres | UTF8     | libc            | en_US.utf8 | en_US.utf8 |            |           | =c/postgres          +
           |          |          |                 |            |            |            |           | postgres=CTc/postgres
(3 rows)
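The CMD.EXE warning in that output appears because the shell's working directory is inside the Linux filesystem, which Windows sees as a UNC path. A minimal sketch of a wrapper that starts a command from a different directory is shown here. The name run_from is hypothetical, and /mnt/c is assumed to be where the C: drive is mounted. This only addresses the UNC warning; whether it has any effect on the slowdown reported in this issue is untested.

```shell
# Run a command with another directory as the working directory.
# The body is a subshell, so the caller's own directory is left unchanged.
# run_from is a hypothetical helper name, not part of WSL.
run_from() (
  dir="$1"
  shift
  cd "$dir" || exit 1
  "$@"
)
```

For example, run_from /mnt/c psql.exe -U postgres -h 127.0.0.1 would give cmd.exe a drive-letter path instead of a \\wsl.localhost path.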

Now try to launch any new executable (browser, terminal tab, Windows Explorer, anything). Even Task Manager hangs for many minutes before it opens. Note: processes that are already running are not affected; only newly started ones are.

After roughly 20 minutes, it should return to normal.

Second note: CPU usage is low, and memory, disk, and network usage are all normal. As far as I can tell, nothing abnormal shows up in Process Explorer.

I also tried the 1Password CLI, and the same behaviour occurs after it reads the secrets. Let me know if anyone needs another way to reproduce this; it should affect more executables.

Third note: it is not sporadic. I tried many WSL configurations (systemd enabled and disabled, different memory allocations, mirrored networking mode, and many more), and it happened every one of the dozens of times I reproduced it.

This video shows the problem: https://1drv.ms/v/s!AoHvV-Rb6N9QiIg9r1_hzSwy35Z1fw?e=12UWSY

The video above uses the 1Password CLI (op), which is another affected executable.

Expected Behavior

After a Windows process is executed, the system keeps behaving normally.

Actual Behavior

After a Windows executable is run from WSL, the system behaves worse than if I had installed Windows 11 on a 10 MB, single-core Pentium II system.

Diagnostic Logs

WslLogs-2024-03-26_23-30-42.zip

github-actions[bot] commented 7 months ago

Diagnostic information
```
.wslconfig found
Custom kernel command line found: 'sysctl.net.ipv4.ping_group_range="0 2147483647"'
appxpackage.txt not found
```

danarnold commented 7 months ago

I'm seeing similar behavior, using win32yank as the only process that would have anything to do with Windows inside WSL. It takes way longer than normal to copy and paste using win32yank, and it slows down seemingly worse each time the command runs, eventually lagging out to the point where each character typed takes over a second to appear on the screen. Process managers within WSL like htop don't show any process taking up much CPU, and win32yank.exe itself will have finished executing, but the WSL process itself will show up in Windows's Task Manager as using 90-100% CPU until WSL itself is shut down and restarted.

For me, this behavior started happening immediately after I updated my PC to KB5035942.

WSL version: 2.1.5.0
Kernel version: 5.15.146.1-2
WSLg version: 1.0.60
MSRDC version: 1.2.5105
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22631.3374
WSL OS: Ubuntu

benhillis commented 7 months ago

@pmartincic - this sounds a lot like the issue you fixed and we are looking into backporting?

ghost commented 7 months ago

Seems that way from the logs.

724399396 commented 7 months ago

Same issue.

start=$(( $(date +%s%N) / 1000000 )) && echo "abc" | ./win32yank.exe -i && end=$(( $(date +%s%N) / 1000000 )) && echo $(( end - start ))

This takes 10131 ms (about 10 s), and strace shows that it hangs on poll:

16:32:28.472562 poll([{fd=0, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=3, events=POLLIN}], 5, -1) = 1 ([{fd=0, revents=POLLHUP}])
16:32:28.472667 read(0, "", 4096)       = 0
16:32:28.472752 shutdown(5, SHUT_WR)    = 0
16:32:28.472869 poll([{fd=-1}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=3, events=POLLIN}], 5, -1) = 1 ([{fd=8, revents=POLLIN}])
16:32:33.545823 recvfrom(8, "\n\0\0\0 \0\0\0", 8, MSG_WAITALL, NULL, NULL) = 8
16:32:33.546064 recvfrom(8, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 24, 0, NULL, NULL) = 24
16:32:33.546253 poll([{fd=-1}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=3, events=POLLIN}], 5, -1) = 1 ([{fd=8, revents=POLLIN}])
16:32:33.588666 recvfrom(8, "\10\0\0\0\f\0\0\0", 8, MSG_WAITALL, NULL, NULL) = 8
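
To find stalls like this in a longer trace, the timestamps can be compared automatically. The following is a minimal sketch, assuming the trace was captured with strace -tt so that every line starts with HH:MM:SS.microseconds. The function name strace_gaps and its default threshold of 1 second are hypothetical, and traces that cross midnight are not handled.

```shell
# Print each syscall line that was followed by a pause longer than a threshold.
# Usage: strace_gaps [seconds] < trace.txt
# Assumes "strace -tt" output; lines without a leading timestamp are skipped.
strace_gaps() {
  awk -v limit="${1:-1}" '
    $1 ~ /^[0-9]+:[0-9]+:[0-9]+\.[0-9]+$/ {
      split($1, t, ":")
      now = t[1] * 3600 + t[2] * 60 + t[3]   # seconds since midnight
      if (seen && now - prev > limit)
        printf "%.3f s in: %s\n", now - prev, prevline
      prev = now
      prevline = $0
      seen = 1
    }'
}
```

Run against the excerpt above with a 1 second threshold, it flags only the second poll call, which blocked for about 5.07 s before the first recvfrom on fd 8.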

Does Windows Firewall or some security tool scan this exe and cause a 5 or 10 second stall? I need help. I also found that clip.exe works as expected; it takes 48 ms.

I tried building a do-nothing Rust exe in WSL 2 and running it; it also needs 5 s to finish. I think some scanning policy is causing this problem.
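
The comparisons above (win32yank.exe, clip.exe, a do-nothing executable) can be repeated with a small timing helper. This is a sketch, not a benchmark tool: the name time_ms is hypothetical, and it relies on the %N (nanoseconds) format of GNU date, as the one-liner above does.

```shell
# Run a command and print how long it took, in whole milliseconds.
# The command's own output is discarded; its stdin is passed through.
# The body is a subshell, so start/end do not leak into the caller.
time_ms() (
  start=$(( $(date +%s%N) / 1000000 ))
  "$@" >/dev/null 2>&1
  end=$(( $(date +%s%N) / 1000000 ))
  echo $(( end - start ))
)
```

Usage, assuming the executables are present: echo "abc" | time_ms ./win32yank.exe -i, then echo "abc" | time_ms clip.exe for comparison.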

New update: I found a solution. Excluding the WSL folder in Windows Defender solves this issue: https://github.com/microsoft/WSL/issues/8995
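
For anyone who wants to try the exclusion, the sketch below prints a PowerShell command that could be run from an elevated prompt on the Windows side. Add-MpPreference -ExclusionPath is the Defender cmdlet for path exclusions. Which folder needs excluding is an assumption here (the distro's \\wsl.localhost share), so check the linked issue for the path that was actually used. The function name defender_exclusion_cmd is hypothetical; WSL_DISTRO_NAME is the variable WSL sets inside a distro.

```shell
# Print (not run) a Defender exclusion command for a WSL distro share.
# Assumption: the folder to exclude is \\wsl.localhost\<distro>.
defender_exclusion_cmd() (
  distro="${1:-$WSL_DISTRO_NAME}"
  path='\\wsl.localhost\'"$distro"
  printf "Add-MpPreference -ExclusionPath '%s'\n" "$path"
)
```

Calling defender_exclusion_cmd Fedora prints the command for a distro named Fedora; with no argument it uses the current distro.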

giovannicandido commented 6 months ago

I will try the equivalent of the Windows Defender exclusion in my antivirus. I think it disables Windows Defender entirely.

giovannicandido commented 6 months ago

The exclusion didn't work as expected.

gaving commented 5 months ago

I'm seeing this exact issue with win32yank as well. WSL loses the plot and the CPU spikes continuously, requiring a reboot (wsl --shutdown doesn't even work).

PS C:\gavin\scripts> wsl --version
WSL version: 2.2.4.0
Kernel version: 5.15.153.1-2
WSLg version: 1.0.61
MSRDC version: 1.2.5326
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.26091.1-240325-1447.ge-release
Windows version: 10.0.19045.4291