microsoft / WSL

Issues found on WSL
https://docs.microsoft.com/windows/wsl
MIT License

WSL2 does not recognize 72c/144t (4-socket server), only 64c #6923

Open pnthai88 opened 3 years ago

pnthai88 commented 3 years ago

Windows Build Number

Microsoft Windows [Version 10.0.19042.964]

WSL Version

Kernel Version

Linux version 5.5.10-microsoft-standard (root@DESKTOP-QBUBFJO) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.12)) #3 SMP Fri May 7 10:55:59 PDT 2021

Distro Version

Ubuntu 20.04

Other Software

No response

Repro Steps

I have a 4-socket server (4 * E7-8890v3) with a total of 72 cores and 144 logical processors. Windows (latest Enterprise version) recognizes all of them, but WSL2 only accepts 64 cores. I tried rebuilding the WSL2 kernel with 8192 NR_CPU..., but that did not work, and the .wslconfig did not work either.

image

the .wslconfig

image
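Since the .wslconfig above survives only as a screenshot, here is a minimal sketch of how the relevant setting could be read back. It assumes the documented `processors` key in the `[wsl2]` section; the helper name `read_processors` and the example file contents are hypothetical and are not taken from the screenshot.

```python
import configparser


def read_processors(wslconfig_text):
    """Return the [wsl2] processors value as an int, or None if it is unset."""
    # interpolation=None so that '%' characters in other values are not interpreted
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(wslconfig_text)
    if parser.has_option("wsl2", "processors"):
        return parser.getint("wsl2", "processors")
    return None


if __name__ == "__main__":
    # Hypothetical contents; the real file normally lives in the Windows user profile.
    example = "[wsl2]\nprocessors=144\n"
    print("processors requested:", read_processors(example))
```

Comparing this requested value with what the guest actually reports shows whether the setting is being capped.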

htop in wsl:

image
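As a text alternative to the htop screenshot, the following minimal sketch reports how many logical processors and sockets the WSL2 guest actually sees. The helper name `summarize_cpuinfo` is hypothetical, and the sketch assumes the x86 `/proc/cpuinfo` layout with `processor` and `physical id` fields.

```python
import os


def summarize_cpuinfo(text):
    """Count logical processors and distinct sockets in /proc/cpuinfo text."""
    logical = 0
    sockets = set()
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processor entries
        key = key.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            sockets.add(value.strip())
    return logical, len(sockets)


if __name__ == "__main__":
    print("os.cpu_count():", os.cpu_count())
    if hasattr(os, "sched_getaffinity"):
        # CPUs this process is allowed to run on (Linux only)
        print("usable by this process:", len(os.sched_getaffinity(0)))
    try:
        with open("/proc/cpuinfo") as f:
            logical, sockets = summarize_cpuinfo(f.read())
        print(f"/proc/cpuinfo: {logical} logical processors, {sockets} socket(s)")
    except OSError as exc:
        print("could not read /proc/cpuinfo:", exc)
```

On the machine described here, the symptom would be a logical processor count of 64 instead of 144.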

zcat /proc/config.gz | grep NR

image
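The same check as the `zcat` command above can be sketched in Python, which makes it easy to compare the compiled-in limit with the CPU count the guest reports. The helper name `parse_nr_cpus` is hypothetical, and the sketch assumes the running kernel exposes `/proc/config.gz`.

```python
import gzip
import re


def parse_nr_cpus(config_text):
    """Return CONFIG_NR_CPUS as an int, or None if the option is absent."""
    match = re.search(r"^CONFIG_NR_CPUS=(\d+)$", config_text, re.MULTILINE)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    try:
        with gzip.open("/proc/config.gz", "rt") as f:
            print("CONFIG_NR_CPUS =", parse_nr_cpus(f.read()))
    except OSError as exc:
        print("could not read /proc/config.gz:", exc)
```

Note that `CONFIG_NR_CPUS` is only an upper bound inside the kernel: if the value is already well above 64, the limit is more likely imposed by what the hypervisor presents to the guest than by the kernel build.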

Expected Behavior

Yes, I'm working with Docker. I could use it on CentOS or another Linux distro, but this is a remote development rack workstation, and I would love to run it at full power via Windows, not Linux. I'm trying to use all of its resources (72 cores / 144 threads in WSL2) for demanding development applications.

Actual Behavior

Only 64 cores recognized

Diagnostic Logs

No response

therealkenc commented 3 years ago

Assume this is dupe #5423 (#5472 etc)?

TedFox123 commented 2 years ago

It seems the problem is still unsolved after one year. Does anyone have a new solution for this? (crying)

etmoonshade commented 1 year ago

Assume this is dupe #5423 (#5472 etc)?

Based on my read, #5472 is mistakenly marked as a dupe of #5423 though I don't know enough to say this for certain.

#5423 talks about a single-CPU system with dense cores (e.g. 64C/128T in a single socket), whereas this problem and #5472 appear to be about a dual-socket system where WSL2 is "stuck" on a single socket. This distinction may have been glossed over in the response, though #5423 seems to have ended up covering both.

It may be worth making the distinction between the two issues unless someone can say for certain that the number of sockets doesn't matter?

It seems the problem is still unsolved after one year. Does anyone have a new solution for this? (crying)

Incidentally, I'm having this issue as well, which is why I've been digging around. I've got a two-socket Epyc 7502 system that's exhibiting the exact same behavior - running Server 2022 with Docker on WSL2, I get something that looks like the following under an attempt at full load: image

It's interesting to note that I had a similar issue with Hyper-V (though that one used both sockets but only half the threads), but I solved that by changing the scheduler type; see https://learn.microsoft.com/en-us/windows-server/virtualization/hyper-v/manage/manage-hyper-v-scheduler-types. Changing it to the "Classic" scheduler type allowed me to max out my cores under Hyper-V.

twobombs commented 1 year ago

Having recently switched from dual Opteron / single TR to dual EPYC on Windows, I would also like to report this caveat. I feel there is something of a regression here, as the dual-socket Opteron never had this issue; the H11DSi reports 8 NUMA nodes, up from 4 NUMA nodes on the H8DGi-F (dual Opteron).

I suspect that this behaviour is a general Windows kernel scheduling problem that is visible both inside and outside WSL2 environments.
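To compare the NUMA layout the board reports with what the Linux guest sees, a minimal sketch along these lines could be run inside WSL2. The helper name `parse_id_list` is hypothetical, and the sketch assumes the standard sysfs files `/sys/devices/system/cpu/online` and `/sys/devices/system/node/online` are present; the node file may be missing if the guest kernel does not expose NUMA topology.

```python
def parse_id_list(text):
    """Expand a kernel-style list such as '0-3,8-11' into a list of ints."""
    ids = []
    for part in text.strip().split(","):
        if not part:
            continue
        first, sep, last = part.partition("-")
        # A bare number like '5' is a range of length one
        ids.extend(range(int(first), int(last if sep else first) + 1))
    return ids


if __name__ == "__main__":
    for label, path in [
        ("online CPUs", "/sys/devices/system/cpu/online"),
        ("online NUMA nodes", "/sys/devices/system/node/online"),
    ]:
        try:
            with open(path) as f:
                ids = parse_id_list(f.read())
            print(f"{label}: {len(ids)}")
        except OSError as exc:
            print(f"{label}: unavailable ({exc})")
```

If the guest reports fewer CPUs or nodes than Windows does, that would point at the layer between Windows and the guest rather than at the Linux kernel configuration.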

Luxmark4: [image: luxmark4]

WSL2 - Qrack Benchmarks: [image: wsl2_benchmarks]

Will look into bcdedit /set hypervisorschedulertype classic. Edit: no joy.

Note: under Windows 10 this behaviour cuts off at exactly 50%, so the 70% we get in Windows 11 is a step forward [?]

hyjforesight commented 2 days ago

The issue persists. Can anyone help? :(

twobombs commented 2 days ago

I recently bought a quad system that also has 8 NUMA nodes and experienced the same thing under WSL/Windows. In addition, Ollama came into the mix, showcasing the 'solution' to this issue: hardcoding the thread count forces the allocation of all those CPU threads in the model and makes the Windows kernel hang at WSL/Windows. So even when we find the solution to this, enabling all cores will cause your machine to hang - the reason why this is