microsoft / WSL

Issues found on WSL
https://docs.microsoft.com/windows/wsl
MIT License

Command hangs when autoMemoryReclaim=gradual with distro systemd enabled #10675

Open AzureZeng opened 1 year ago

AzureZeng commented 1 year ago

Windows Version

Microsoft Windows [Version 10.0.22621.2428]

WSL Version

2.0.5.0

Are you using WSL 1 or WSL 2?

Kernel Version

5.15.133.1-1

Distro Version

Debian 12 + Arch Linux

Other Software

Visual Studio Code (latest version) with C# extension
.NET Core SDK 6.0/7.0
Docker Desktop 4.24.2 (124339)

Repro Steps

  1. Update to the latest WSL pre-release version by running wsl --update --pre-release.
  2. Enable systemd support in wsl.conf.
  3. Set autoMemoryReclaim=gradual in .wslconfig.
  4. Restart WSL to apply the settings.
  5. Do something that makes WSL allocate a lot of buff/cache memory, such as updating packages, developing apps in WSL, or opening Docker Desktop.
  6. Leave the WSL distro's command-line window open but idle for several minutes. Keeping it open prevents WSL from terminating the distro automatically, and the idle time lets WSL reclaim memory.
  7. Run any command, such as sudo pacman -Syu, sudo apt update, or systemctl status.
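
For reference, steps 2 to 4 correspond roughly to the settings below. This is a sketch rather than the reporter's exact files: it assumes the standard file locations, and it assumes autoMemoryReclaim sits under the [experimental] section, as it did in the WSL 2.0.x pre-releases. Later versions may place it elsewhere.

```ini
# /etc/wsl.conf (inside each distro)
[boot]
systemd=true

# %UserProfile%\.wslconfig (on the Windows side)
[experimental]
autoMemoryReclaim=gradual
```

Running wsl --shutdown from Windows and then reopening the distro applies both files.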

Expected Behavior

The command runs immediately.

Actual Behavior

The command may hang for a very long time. Once the kernel has re-allocated buff/cache memory, everything works again.

Disabling systemd support also avoids the hang.

I suspect this is caused by a mechanism in the Linux kernel.
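
One way to put numbers on the hang is to record buff/cache and elapsed time around a command. The helper names below (buff_cache_kb, timed_run) are hypothetical and not part of WSL. The sketch assumes bash and awk, and sums Buffers, Cached, and SReclaimable from /proc/meminfo, which is how free(1) derives its buff/cache column.

```shell
#!/usr/bin/env bash

# buff_cache_kb [FILE]: print buff/cache in kB from a meminfo-style file
# (defaults to /proc/meminfo). Sums Buffers + Cached + SReclaimable.
buff_cache_kb() {
    local meminfo="${1:-/proc/meminfo}"
    awk '/^(Buffers|Cached|SReclaimable):/ { sum += $2 } END { print sum + 0 }' "$meminfo"
}

# timed_run CMD [ARGS...]: run a command and report, on stderr, how long it
# took and what buff/cache looked like before and after.
timed_run() {
    local before after start status
    before=$(buff_cache_kb)
    start=$SECONDS
    status=0
    "$@" || status=$?
    after=$(buff_cache_kb)
    echo "elapsed=$((SECONDS - start))s buff_cache_before=${before}kB buff_cache_after=${after}kB" >&2
    return "$status"
}

# Example (inside the distro, after leaving it idle for several minutes):
#   timed_run systemctl status
```

A long elapsed time together with a sharp rise in buff/cache would be consistent with the behavior described above. A short elapsed time suggests the cache had not yet been reclaimed.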

Screenshots: two images (not included here)

Diagnostic Logs

WslLogs-2023-10-25_16-30-50.zip

OneBlue commented 1 year ago

Thank you for reporting this, @AzureZeng. Could you capture logs of the entire sequence, and then capture /dumps when the command hangs so we can see what's happening?

AzureZeng commented 1 year ago

I have updated the issue with more details.

ghost commented 1 year ago

What specifically do you mean by "After kernel allocating buff/cache memory, everything works again."?

AzureZeng commented 1 year ago

> What specifically do you mean by "After kernel allocating buff/cache memory, everything works again."?

The buff/cache memory allocated when the distro starts is gradually dropped by WSL's memory reclaim mechanism (autoMemoryReclaim=gradual). Once the buff/cache memory has been fully dropped, running any command in WSL may hang. During the hang, I found that the kernel is trying to re-allocate buff/cache memory, and once that re-allocation is complete, everything works again.

I think the root cause is that this reclaim mechanism drops buff/cache memory related to systemd. The problem does not occur with autoMemoryReclaim=dropcache or when manually running echo 3 > /proc/sys/vm/drop_caches.
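
To repeat the manual comparison mentioned above, a small wrapper around the drop_caches write could look like this. The function name drop_caches is hypothetical, and the optional target argument exists only so the function can be tried against an ordinary file. Writing to the real /proc/sys/vm/drop_caches requires root.

```shell
#!/usr/bin/env bash

# drop_caches [TARGET]: flush dirty pages to disk, then write 3 to TARGET
# (default /proc/sys/vm/drop_caches), which asks the kernel to drop the page
# cache plus dentries and inodes.
drop_caches() {
    local target="${1:-/proc/sys/vm/drop_caches}"
    sync
    echo 3 > "$target"
}

# Example (as root inside the distro), followed by a command that would
# otherwise hang under autoMemoryReclaim=gradual:
#   drop_caches && systemctl status
```

If commands stay responsive after this manual drop but hang after gradual reclaim, that would support the suspicion that the two paths release different memory.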

gallois commented 11 months ago

I was wondering why running apt update was hanging on my laptop and this seems to make sense, as the steps for reproducing aligned with what I experienced.

WSL version: 2.0.14.0
Kernel version: 5.15.133.1-1
Distro: Ubuntu 22.04.3

running systemd and autoMemoryReclaim=gradual
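
To check whether a setup matches this combination, a loose text search over the two config files may be enough. The helper has_setting below is hypothetical. It ignores section headers and matches case-insensitively, so treat a hit as a hint rather than proof. The Windows-side path in the example is an assumption and depends on the user name.

```shell
#!/usr/bin/env bash

# has_setting FILE KEY VALUE: succeed if FILE has a "KEY=VALUE" line.
# Optional whitespace around "=" and a trailing CR (Windows line endings)
# are tolerated. Section headers are not checked.
has_setting() {
    local file="$1" key="$2" value="$3"
    grep -Eiq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*${value}[[:space:]]*\$" "$file"
}

# Example (replace <user> with the Windows user name):
#   has_setting /etc/wsl.conf systemd true && echo "systemd enabled"
#   has_setting "/mnt/c/Users/<user>/.wslconfig" autoMemoryReclaim gradual \
#       && echo "gradual reclaim enabled"
```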