Consider compiling WSL2 kernel with -O3 for all archs

microsoft / WSL

Consider compiling WSL2 kernel with -O3 for all archs #8684

Open WSLUser opened 2 years ago

WSLUser commented 2 years ago

Is your feature request related to a problem? Please describe.
Provides a performance boost to the kernel.

Describe the solution you'd like
A more performant kernel based on the compiler optimizations provided by -O3.

Describe alternatives you've considered
Build the kernel myself with this compilation option.

Additional context
See https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.19-O3-March-Native and this patch series: https://lore.kernel.org/lkml/20220621133526.29662-1-mikoxyzzz@gmail.com/ . Found in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/init/Kconfig?h=v5.19

I would suggest backporting this patch to the WSL2 kernel to benefit from the performance boost.

elsaco commented 2 years ago

Linus disagrees:

Honestly, let's just remove -O3 entirely.

Enabling it, and then not even build-testing the result, is just about
the *worst* possible case. That's just horrible.

The argument that "but ARC uses it" is not an argument. It was always
a bad argument, and ARC needs to just fix whatever it is that made it
an issue (likely already fixed with a compiler upgrade).

And there is no way I would ever accept this as a "let people try it" when

 - as mentioned, just use KCFLAGS=-O3 if you want to

 - -O3 has a *loong* history of generating worse code than -O2

so I will *not* be taking these kinds of patches without some very
serious explanations of why -O3 has suddenly become acceptable again.

Those explanations had better be more than "let people try". They
should have in-depth actual performance numbers for a real load, not
some made-up "bigger is better" logic.

                 Linus

From https://lore.kernel.org/lkml/CA+55aFz2sNBbZyg-_i8_Ldr2e8o9dfvdSfHHuRzVtP2VMAUWPg@mail.gmail.com/
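For anyone who wants to try the KCFLAGS route mentioned in the quote without any patch, a minimal sketch might look like the following. It only prints the command so it can be reviewed first. The config path `Microsoft/config-wsl` and the job count are assumptions based on a checkout of the microsoft/WSL2-Linux-Kernel tree, not something stated in this thread, and `kernel_make_cmd` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: prints (does not run) a make invocation that appends -O3
# through KCFLAGS, the kbuild variable referred to in the quote above.
# Assumptions, not from this thread: the tree is a checkout of
# microsoft/WSL2-Linux-Kernel and its config is Microsoft/config-wsl.

# kernel_make_cmd JOBS CONFIG FLAGS
# Hypothetical helper that formats the command line for review.
kernel_make_cmd() {
    printf 'make -j%s KCONFIG_CONFIG=%s KCFLAGS=%s\n' "$1" "$2" "$3"
}

# Run the printed command from the root of the kernel source tree.
kernel_make_cmd 4 Microsoft/config-wsl -O3
```

As far as I understand kbuild, KCFLAGS is appended after the kernel's own compiler flags, and GCC honors the last -O option it sees, so this should override the default -O2 for the whole build. Whether the result is actually faster for a given WSL workload would still need to be measured.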

WSLUser commented 2 years ago

So what if Linus agrees or not? He says right there at the end: have performance numbers. Well, what's Phoronix? Chopped liver? Overall it is better. And as the patch submitter stated, if there are issues, let them get raised and addressed. I disagree with Linus on this. I think he shouldn't have much say over the kernel anymore. It was his pet project back in the day, but it's evolved far beyond him. Even he's admitted he can't keep track of, or stay familiar with, most of the stuff in the kernel anymore. His day has passed. Of course I'll be happy when they finally move to a platform like GitHub, but that'll take all the old maintainers dropping out and allowing the new generation of maintainers to run things.

WSLUser commented 2 years ago

I will also mention that the Clear Linux devs took a look at this and decided they will be adopting -O3. For them there is no need to backport, as they actually keep up with the kernel versions, unlike WSL2. They're just waiting until .1 drops to allow some unrelated bugs to get fixed first (before switching to 5.19).

NickDeBeenSAE commented 2 years ago

I like this idea. I'll wait until my Linux Mint VM needs an upgrade.

elsaco commented 2 years ago

Michael ends the Phoronix test with:

"When it came to the -O3 kernel build for other workloads like gaming/graphics, web browsing performance, and various creator workloads there was no measurable benefit from the -O3 kernel."

Considering all the bugs and regressions this optimization might cause it's absolutely not worth it. It also makes the resulting code bloated. Personally, I'll stick with -O2 on WSL and my real Linux workstation.

iavael commented 2 years ago

@elsaco The Phoronix benchmarks mentioned in the OP were actually a response to Linus's proposition, and specifically to this part of it:

They should have in-depth actual performance numbers for a real load, not some made-up "bigger is better" logic.

Considering all the bugs and regressions this optimization might cause it's absolutely not worth it.

Are these workloads typical of WSL?