Open mborgerson opened 1 year ago
When it comes to supported instruction sets:
- The very first iterations of the AMD Athlon 64 have SSE2 as their maximum
- Later AMD Athlon 64 models and all Intel 64-capable CPUs support up to SSE3
- The first iteration of Core 2 Duo and Core 2 Quad introduced SSSE3
- The second iteration of Core 2 Duo and Core 2 Quad introduced SSE4.1
- All Intel Core ix and AMD FX and newer CPUs support SSE4.2
- Intel Sandy Bridge (Core ix 2xxx series) and AMD FX (Bulldozer) introduced AVX
- AVX2 was introduced with Intel Haswell (Core ix 4xxx series); on AMD it arrived with Excavator-based parts and then Ryzen, since the final iteration of AMD FX (Piledriver architecture) supports AVX but not AVX2
Bumping the minimum instruction set from SSE2 to SSE3 would lose virtually zero users (but the gains would likely be small). SSE4.2 has been around for almost 15 years now and AVX2 for 10, so it is a good idea to introduce these optimisations. But if it is little trouble, keeping a lower-end .exe would be best.
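To see where a given machine falls in the list above, something like the following could be used. This is a minimal sketch, assuming GCC or Clang on x86; `highest_simd_level` is a hypothetical helper name and not part of xemu.

```c
/* Hypothetical helper: returns the name of the highest SIMD level from the
 * list above that the running CPU reports. Relies on the GCC/Clang x86
 * builtins; on other compilers or architectures it returns "unknown". */
const char *highest_simd_level(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init(); /* make sure the CPU feature data is populated */
    if (__builtin_cpu_supports("avx2"))   return "AVX2";
    if (__builtin_cpu_supports("avx"))    return "AVX";
    if (__builtin_cpu_supports("sse4.2")) return "SSE4.2";
    if (__builtin_cpu_supports("sse4.1")) return "SSE4.1";
    if (__builtin_cpu_supports("ssse3"))  return "SSSE3";
    if (__builtin_cpu_supports("sse3"))   return "SSE3";
    if (__builtin_cpu_supports("sse2"))   return "SSE2";
    return "none";
#else
    return "unknown";
#endif
}
```

Calling `puts(highest_simd_level());` from a small test program prints the tier for the machine it runs on.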
In this case, compiling for x86-64-v3 and adding AVX2 and AVX-512F optimizations would cover everything we want, since CPUs without AVX2 won't run xemu that well anyway; if you want a slightly lower baseline, you can target AVX instead.
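One way to confirm what a chosen `-march` level actually enabled is to look at the macros GCC and Clang predefine for each instruction set. A minimal sketch follows; `compiled_simd_baseline` is a hypothetical name used only for illustration.

```c
/* Hypothetical helper: reports the highest SIMD level the compiler was
 * allowed to assume for this translation unit, based on the feature macros
 * that GCC and Clang predefine from -march / -m<feature> flags. */
const char *compiled_simd_baseline(void)
{
#if defined(__AVX512F__)
    return "AVX-512F";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE4_1__)
    return "SSE4.1";
#elif defined(__SSSE3__)
    return "SSSE3";
#elif defined(__SSE3__)
    return "SSE3";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "below SSE2 or not x86";
#endif
}
```

Built with a plain x86-64 target this should report SSE2, and with `-march=x86-64-v3` it should report AVX2, assuming a compiler version recent enough to know the x86-64-v3 level.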
One important caveat: while AVX was introduced with Sandy Bridge (Core ix 2xxx) and AVX2 with Haswell (Core ix 4xxx), the lower-end Celeron and Pentium models had no AVX or AVX2 support until Alder Lake (Core ix 12xxx). However, xemu would likely remain playable on those CPUs even without AVX thanks to SSE4.2 support and all of the other improvements successive generations brought, and locking them out entirely would not be a user-friendly move in my opinion.
I did my own testing: a build compiled with avx512f and avx2 fails with an illegal instruction error on the Steam Deck, which doesn't have AVX-512. Other emulators like RPCS3 can toggle this on and off in the app, so I wonder if something similar could be done here, or whether we should just compile for x86-64-v3, which goes up to AVX2. But as the commenter above says, lower-end CPUs didn't have AVX2 until Alder Lake. Is there some way to have all these options? In my testing, compiling xemu with these flags gives about a 4 fps increase in most titles, even when just enabling optimizations for AVX/AVX2.
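The illegal instruction crash could at least be turned into a readable error with a startup check that compares what the build assumes against what the CPU reports. The sketch below assumes GCC or Clang on x86; `cpu_meets_build_requirements` is a hypothetical name. Note that in a binary built with, say, `-mavx512f`, the compiler may already emit such instructions before the check runs, so in practice the check would probably need to live in a file compiled with baseline flags.

```c
/* Hypothetical startup guard: returns 1 if the running CPU has every
 * instruction set this translation unit was compiled to assume, else 0.
 * On non-x86 targets or other compilers it performs no checks. */
int cpu_meets_build_requirements(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
#if defined(__AVX512F__)
    if (!__builtin_cpu_supports("avx512f")) return 0;
#endif
#if defined(__AVX2__)
    if (!__builtin_cpu_supports("avx2")) return 0;
#endif
#if defined(__AVX__)
    if (!__builtin_cpu_supports("avx")) return 0;
#endif
#if defined(__SSE4_2__)
    if (!__builtin_cpu_supports("sse4.2")) return 0;
#endif
#endif
    return 1;
}
```

A caller could print a message such as "this build requires AVX2" and exit when the function returns 0, instead of crashing later.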
SSE4.2 should be a pretty good baseline as it's also available on older Celerons/Pentiums, which don't have AVX let alone AVX2 (these were disabled for market segmentation reasons).
If your CPU doesn't support SSE4.2, it's unlikely to be able to run any game in xemu at full speed. (Even on CPUs that support SSE4.2 but not AVX, this is already a tall order, but I can see some 2D games running acceptably in that case.)
> Is there some way to have all these options?
Dynamic dispatch makes it possible to use modern instruction sets on demand, but it can't be used for compiler autovectorization (which is what this issue is about). You can only pick one set of instructions per binary[^1], and it's a hard requirement.
Therefore, if you wanted to optionally support newer instruction sets, you'd have to distribute 2 binaries (e.g. SSE4.2 and AVX2). PCSX2 did this a while ago but recently stopped, likely due to the maintenance cost and longer build times.
[^1]: You can set it on a per-file basis when compiling, but the final binary will have the requirements of the most demanding instruction set used during compilation.
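For reference, hand-written dynamic dispatch for a single hot loop might look roughly like this. It is a sketch assuming GCC or Clang on x86; all function names are hypothetical and nothing here is taken from xemu. It also illustrates why this does not scale to whole-program autovectorization: every function that should benefit needs its own variants and its own selector.

```c
#include <stddef.h>

/* Baseline variant: compiled with whatever the command-line flags allow. */
static void add_arrays_baseline(float *dst, const float *a,
                                const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1
/* Same loop, but the compiler may use AVX2 inside this one function. */
__attribute__((target("avx2")))
static void add_arrays_avx2(float *dst, const float *a,
                            const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
#endif

typedef void (*add_arrays_fn)(float *, const float *, const float *, size_t);

/* Picks a variant once at runtime based on what the CPU reports. */
add_arrays_fn select_add_arrays(void)
{
#ifdef HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return add_arrays_avx2;
#endif
    return add_arrays_baseline;
}
```

The caller stores the result of `select_add_arrays()` once and calls through the pointer afterwards, so the AVX2 variant is only ever executed on CPUs that report AVX2.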
Feature Request
gcc -Q --help=target
Looks like for x86-64, SSE2 is enabled by default. We probably want to enable SSE4.2/AVX2/etc.
Alternatives
No response
Additional Context
No response