microsoft / STL

MSVC's implementation of the C++ Standard Library.

Build the x86 STL with `/arch:SSE2` instead of `/arch:IA32` #4741

Closed StephanTLavavej closed 3 months ago

StephanTLavavej commented 3 months ago

Fixes #3118. Fixes #3922. See /arch (x86) on Microsoft Learn.

On x86, the STL (and indeed the entire VCRedist) was historically built with `/arch:IA32` because it had to be capable of running on ancient OSes and potato chips. Now, Win7 / Server 2008 R2 are unsupported and no longer receiving security updates - but before they reached end-of-life, even they were patched to require SSE2. That was KB4103718 in May 2018, over 6 years ago. (Note: I am well aware of the single exception that paid security updates for the highly obscure Windows Embedded POSReady 7 will end in Oct 2024. More on that in another PR, but the point here is that even Windows 7 requires SSE2 now.)

The STL can now begin assuming unconditional support for SSE2. The compiler now defaults to `/arch:SSE2`, so all we need to do is remove `/arch:IA32`.
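As a rough illustration of what the option change means for a given translation unit, the sketch below reads MSVC's predefined `_M_IX86_FP` macro, which is defined only when targeting x86 and reflects the `/arch` setting in effect. The helper names `arch_fp_level` and `arch_fp_name` are hypothetical and are not part of the STL or of this PR.

```cpp
// Sketch only: report the x86 floating-point codegen level of this translation unit.
// _M_IX86_FP is an MSVC predefined macro, defined only for x86 targets:
//   0 => /arch:IA32, 1 => /arch:SSE, 2 => /arch:SSE2 or a higher option.
// arch_fp_level() and arch_fp_name() are hypothetical helpers for illustration.

constexpr int arch_fp_level() {
#if defined(_M_IX86_FP)
    return _M_IX86_FP;
#else
    return -1; // macro not defined: not an x86 target, or the compiler does not provide it
#endif
}

constexpr const char* arch_fp_name() {
    return arch_fp_level() == 0 ? "/arch:IA32"
         : arch_fp_level() == 1 ? "/arch:SSE"
         : arch_fp_level() == 2 ? "/arch:SSE2 or higher"
                                : "not applicable";
}
```

Under the assumptions above, an x86 build with no explicit `/arch` option should see `arch_fp_level()` return 2, while the same code built with `/arch:IA32` should see 0. On x64 and other targets the macro is undefined and the sketch falls back to -1.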

Why make this change? It slightly simplifies our build system and may slightly improve performance (although I don't expect it to be observable, so the PR label is honorary). It also means that our separately compiled code will be exercising the same compiler codepaths used by the vast majority of x86 builds everywhere. If the status quo were reversed and we were currently building with `/arch:SSE2`, we would never want to change to `/arch:IA32`.

This affects the VCRedist, but (1) VS 2022 17.12 will be an "unlocked" long-term support release, and (2) there are no coordinated header changes, so we don't need to worry here.

Note that although we can now assume that SSE2 is unconditionally available (as it has always been for x64), we aren't taking advantage of that in manually vectorized algorithms. See #4536 - attempting to maintain distinct codepaths for SSE2 and SSE4.2 was extremely difficult and we no longer take that risk.

We can also drop test coverage that disables SSE2 (in a partial, simulated way), because we'll never run on such processors.

Finally, I don't think we need to bother testing `GH_000935_complex_numerical_accuracy` with `/arch:IA32`. The STL's headers aren't blocking the option, so while users must be running SSE2-capable processors, they can still limit their own codegen to IA32 (unless and until the compiler deprecates and removes the option, and I am aware of no such plans). However, I think this option is sufficiently obscure that we don't need to bother testing it, and we haven't had any bugs involving it either.