Open stenzek opened 5 months ago
I think dropping support for SSE4 is a net positive.
Not being able to support the code you write is the nail in the coffin. I'm glad the PCRE2 JIT issue was worked around but it wasn't truly solved, and you can't reasonably be expected to tackle it in depth.
You also can't be reasonably expected to write your best code when dealing with cruft and crap and conk. GS code is already hard enough to write so anything that can be done to make that easier is good.
I guess to counter this (devil's advocate, if you will), there have been a few people in support with SSE4 only due to older AMD chips, especially in regions like Brazil where even a Haswell will set you back like $300 USD just for the CPU (seriously the region is fucked).
Further to this, at least currently we're not in a position where SSE4 is a huge burden, IMO. It may become one at some point, though, so we will have to drop it eventually. Since we have multi-ISA, we could split the SSE4 code out into "legacy" and let it rot for a bit if you don't want to have to maintain it. The VUs and EE work mostly well right now; AVX2 could make them faster, but that's about it, so I don't see it as much of a support problem in that regard.
That's why I opened this thread to begin with - it's easy to presume that there are lots of folks out there with ancient hardware, but is it actually the case?
Like I said in the OP, an alternative could be mandating AVX, not AVX2, which would enable better codegen for the recs and GS.
Yeah I understand. Globally? Probably less than 5% but in regions with poor economies, probably quite high.
What do the steam charts say?
My previous CPU that i stopped using just few months ago, G4560, ran 90% of games without an issue, even upscaling at 2x, even without MTVU.
That CPU didn't have AVX or AVX2 support, just SSE4, according to Intel specs. I think there are still plenty of people who use those kinds of processors, and dropping SSE4 right now is a bit too soon.
ran 90% of games without an issue, even upscaling at 2x, even without MTVU
I very much doubt that. Popular games like GoW2 would be a struggle at Haswell-level IPC without MTVU, I know this because my previous Kaby Lake laptop struggled with them.
It frustrates me when users say "90% of games work", when the PS2 had over 4,000 titles, I very much doubt you have tested over 3,500 of them. Such percentages are entirely made-up, and meaningless.
It's 90% of the games I own. There are around 20 games, and almost every one ran well with 2x upscaling. Metal Gear Solid 3 struggled to reach full speed at 2x in cutscenes and Need for Speed Underground 2 had some framerate drops, but that's about it.
Every other game ran fine. I should add that MTVU actually helped improve the speed, even if it was disabled by default, enabling it i had speed improvements.
Well i'm on ivy bridge CPU so i would be screwed over if AVX2 was the minimum but i see the reason so i guess i'm on team "make AVX the minimum instead of AVX2"
I think dropping support for SSE4 and requiring AVX2 is a net positive. If you have a CPU that doesn't have AVX2 in the modern day, you would have more issues than just PCSX2 (a good number of games and programs mandate AVX2), and AVX2 has been around for more than a decade. Apart from specific low-end modern Intel CPUs, every modern CPU has AVX2.
Anything till (including Comet Lake) Pentiums/Celerons don't have AVX support so that's still fairly recent release.
Feels like too soon to drop it.
I saw mentioned on discord that Rosetta doesn't support AVX, so SSE4 support will have to be kept until at least we have full Apple Silicon Support.
WoA's equivalent x64 emulation also doesn't support AVX, we would need to determine if anybody is actually using this with PCSX2, or get WoA builds running
I saw mentioned on discord that Rosetta doesn't support AVX, so SSE4 support will have to be kept until at least we have full Apple Silicon Support.
Just to add to this point, Rosetta has avx/avx2 support on the new macOS 15 beta. But yes, for Mac in general this isn’t such a big issue.
It’s a positive. Those still relying on sse4 will just have to deal with 2.0’s 99% compatibility with all the fixes from now til then. And I do say go full avx2. Make it just one painful tear rather than potentially 2 and be done with it.
Anything till (including Comet Lake) Pentiums/Celerons don't have AVX support so that's still fairly recent release.
They're also dual cores, and as I stated in the OP, are going to struggle with no MTVU in many titles.
I saw mentioned on discord that Rosetta doesn't support AVX, so SSE4 support will have to be kept until at least we have full Apple Silicon Support.
In terms of priority, merging the full Apple Silicon dynarecs would happen before the AVX requirement, so it's a non-issue.
WoA's equivalent x64 emulation also doesn't support AVX, we would need to determine if anybody is actually using this with PCSX2, or get WoA builds running
They work fine already (with the dynarecs added, obviously). Who knows what driver shenanigans exist, since I can only test VMware, but that's an issue irrespective of ISA.
Also, how many of these non-AVX2 systems meet the single-threaded performance requirements? I have a feeling that almost all of them are laptops or AIOs with sub-laptop CPUs that were low-end hardware years ago.
Another also. Microsoft is dropping windows 10 support in a year which will make windows 11’s hardware requirements the new minimum. How many non avx2 cpus does windows 11 automatically drop? This one change won’t only dump the majority of non avx2 systems but it will also drop a bunch of early avx2 systems.
Now how many avx1 cpus will survive the windows 11 purge and meet the single threaded performance requirement?
I think this (giant) dev cycle should be wrapped up before making a change like this. We will have people come in for support with SSE4 only chips and it'd make it a lot easier for both parties if we didn't have to refer them to a random devbuild like v1.7.5912 when they come asking for the last SSE4 supported build.
That’s what Stenzek is saying. Make 2.0 sse4’s last hurrah. But we’ve dumped various apis and support for old versions of windows, and the plugin system, and the old interface mid cycle with nothing but a random build number left to them.
Also forgot how we unceremoniously axed sse2, ssse3, and 32bit mid cycle.
To get everyone entirely on the same page, I've done some research. There are a total of 33 CPUs I could find in existence which could be affected by dropping SSE4 support but retaining AVX (this does not account for switching to AVX2 and also assumes the CPU must be over a PassMark STR of 1500 to qualify).
To my understanding, no AMD CPU exists which has SSE4 but which lacks support for AVX.
There are no Zhaoxin CPUs that have only SSE4 but which even approximately meet the 1500 mark (see, e.g., this one with AVX at 550 STR). Thus, this company's CPUs should not be taken into consideration despite them also being x86-64.
My understanding is that there are exactly five desktop Gulftown CPUs with SSE4 and without AVX which at base clock (barely) fall above the 1500 mark that we cite in our specs:
Any lower-numbered Xeon (Xeon W3670 is 1498, so within margin of error technically) or lower-numbered i7 (i7-970 is 1467) does not meet 1500 STR.
Intel introduced AVX2 into Celeron processors during Alder Lake ("Golden Cove", 2021). So anything during or after that does not count for this discussion.
For Celerons, as they lack hyperthreading entirely, I will not be looking at CPUs with 2 cores or fewer. They are likely unsuitable for PCSX2 despite the STR, as even on a lightweight Linux distro, they would almost assuredly struggle. The most recent Celeron processor which would get "snapped" by such an update is the Celeron N5095, released in 2021 (STR of 1503). However, this is in fact the only Celeron desktop CPU which has four cores, an STR above 1500, SSE4 support, and lacks AVX support.
Intel introduced AVX2 into Pentium processors during Alder Lake ("Golden Cove", 2021). So anything during or after that does not count for this discussion.
The first generation of Intel Pentium processors that begin to exceed 1500 STR and which are dual-core hyperthreaded is Kaby Lake-S, released 2016 (G4XXX). The standard power ones all exceed 2000 STR, while the low-power ones all exceed 1500. The same is true for Coffee Lake-S (G5XXX). Finally, the same is true for Comet Lake-S (G6XXX). All of these are dual-core with hyperthreading, so no true quad-core CPUs. There are exactly 26 of these.
As with desktops, there is exactly one laptop Celeron which has four cores and meets all the other criteria, namely the Celeron N5105 with a score of 1510.
I could find no mobile Pentium processor with 4 or more threads, a PassMark score of 1500 or more, having SSE4 and lacking AVX.
There are 26 Pentium CPUs which would be affected by this, 5 Gulftown CPUs, and 1 Celeron, for a total of 32 desktop processors in existence I can find which would be truly affected by this change. Of those, 84% are Pentium or Celeron CPUs. All of the Gulftown CPUs are from 2010 and 2011, and all of the rest are between the years 2016 and 2021 (inclusive).
There is exactly one laptop processor which could be affected by this, being the Celeron N5105.
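The tallies in this summary can be cross-checked with a few lines of arithmetic. The sketch below only restates the counts given in the comment (they are not independently verified), and the constant and function names are made up for illustration.

```cpp
// Counts as stated in the comment above; names are hypothetical.
constexpr int kPentiumDesktop = 26;
constexpr int kGulftownDesktop = 5;
constexpr int kCeleronDesktop = 1;
constexpr int kCeleronLaptop = 1;

// Desktop CPUs with SSE4, no AVX, and a PassMark STR of at least 1500.
constexpr int DesktopTotal() {
    return kPentiumDesktop + kGulftownDesktop + kCeleronDesktop;
}

// Desktop and laptop CPUs combined.
constexpr int OverallTotal() {
    return DesktopTotal() + kCeleronLaptop;
}

// Share of the desktop entries that are Pentium or Celeron,
// rounded to the nearest whole percent using integer arithmetic.
constexpr int PentiumCeleronSharePercent() {
    return ((kPentiumDesktop + kCeleronDesktop) * 100 + DesktopTotal() / 2) / DesktopTotal();
}

static_assert(DesktopTotal() == 32, "26 + 5 + 1 desktop CPUs");
static_assert(OverallTotal() == 33, "32 desktop CPUs plus 1 laptop CPU");
```

The desktop total of 32 plus the single laptop Celeron gives the 33 quoted at the start of the comment, and 27 of 32 rounds to the stated 84%.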
In light of my (now-revised) comment above, my vote is to axe SSE4 immediately following the next major release, as the range of CPUs affected is extremely limited, and it steals what is essentially "free" performance for the overwhelming majority of users whose CPUs are powerful enough to properly run PCSX2 while being an undue burden for maintenance.
In light of realizing how much has been unceremoniously axed just in the 1.7 cycle, I've got to ask why SSE4 is so holy it has to wait until after 2.0?
To get 1.6 out of the way. And not that much has been axed, SSE 2 and windows 7/8 support, which has been dropped by pretty much every major corporation, so it's not "unceremoniously".
By unceremonious I mean that it wasn’t given a special version number. We don’t even have links to the last builds that contained these last bits of support for old cpus, apis, interfaces, and os versions. So I’m asking why is sse4 different from all the other old crap that was dumped?
Because the SSE4 code makes up most of the recompilers, it's a lot of work and changing of stuff, it's not the case of us just changing one thing which happens to break support.
Okay. I was under the impression that it mostly just affected the software renderer.
no, every JIT uses it.
Steam stats, for example
Only about 6% would be affected, and they could always stick to an older build.
well steam stats are skewed a little, we have a huge amount of users in brazil and indonesia, the former being a pretty poor area, but they love this stuff.
It’s not like 2.0 will be unusable for those regions. Pcsx2 is right now extremely good, it’ll be better when 2.0 is released whether that is next week or next year.
compared to 1.6 it's leaps and bounds, hence wanting to get the next "stable" version out before we gut it.
What about an opt-in hardware survey to settle this? Let it run for a few months, then decide what to do after gathering the results.
opt-in survey over our consumer base would probably be too selective, plus it means bothering users with a hardware survey
I'm definitely with Stenzek on this one. I think 1.7 is near to perfection, so people with SSE4 are barely losing anything. Plus, more performance is always good. Can't set yourself on fire to keep others warm :P
As a user whose CPU lacks the AVX2 instruction, my vote goes to the "ditch SSE4 and set AVX as the minimum" camp.
About Brazil, things are not AS bad as refraction said, but electronics here are indeed far more expensive than in most other places.
A Ryzen 7900x here is costing the equivalent to $479, while on Newegg it is $375.
The biggest problem is the minimum wage/buying power, you need 1.82 x minimum wages to buy the 7900x in Brazil, in the US you would only need 0.3 minimum wages, so the buying power in US is 6.1 times higher than here.
About the topic, I'm in favor of keeping support for AVX and beyond. When I had an i7 980X clocked at 4.4 GHz, I could run almost anything on PCSX2, so people in this category are indeed going to suffer, but they are the minority anyway.
I mean, that confirms what I was saying, it's not just about the raw cost difference, it's the vs cost of living as well.
saying it's not that bad because it's only 100 more, but your wages are like 10% of what the US gets, that's a big freaking difference.
I commented because you said a Haswell could cost $300 here, which is not accurate. But yeah, the situation is indeed bad.
My apologies, I misremembered the conversation I had, the person I was discussing it with back in February said A ryzen 3200G in here it's 150 USD
which is still daylight robbery, even when your economy isn't screwed.
On China's second-hand trading apps, a Haswell i3 supporting AVX2 costs only ¥45.5, equivalent to about 6 USD. Including the motherboard, it will not cost more than 20 USD. Just for reference.
Just want to drop here that I currently daily drive an i5-2500, which, along with (as far as I can see) the majority of other processors in its socket (Sandy Bridge was when AVX was introduced), supports SSE4 and AVX, but not AVX2.
It is more than powerful enough to handle PCSX2 regardless of this, and I wouldn't be surprised to see that a lot of CPUs in the same camp would be affected by the requirement of AVX2.
Now, me personally, I don't see any benefit to keeping SSE4 around. But as for the AVX2 mandation, that's going to significantly kneecap a lot of people.
The window from SSE4's release to AVX's is 1 processor generation, and to be blunt, I don't think anyone with a Penryn/Nehalem processor is going to be able to emulate in a capacity that should be considered supported. So, if it would clean up the code, and (possibly) increase speed, I would say dropping SSE4 in favor of an AVX minimum requirement is a fantastic call.
Optional AVX2 for processors that support it would be a good idea if it brings significant benefit over AVX itself, too, but I've never worked with extensions like these so I have no idea if that even makes sense.
tl;dr from my view is kill SSE4, mandate AVX, AVX2 optionally would be nice.
Intel CPUs Supporting SSE4 (max):
Intel Core 2 (Penryn series) - These processors introduced SSE4.1.
Intel Core i7, i5, i3 (Nehalem, Westmere) - These processors support both SSE4.1 and SSE4.2.
AMD CPUs Supporting SSE4:
AMD FX-Series (Bulldozer, Piledriver) - Support both SSE4.1 and SSE4.2 but also include SSE4a and AVX(1). Since you're interested in avoiding SSE4a, these wouldn't fit.
AMD Phenom II - Does not support SSE4.1 and SSE4.2 but supports SSE4a. (So this one already doesn't count, cuz a type is lower importance)
AVX minimum:
Intel Core i7-2600 (Sandy Bridge)
Intel Core i5-2500 (Sandy Bridge)
Intel Core i3-2100 (Sandy Bridge)
Intel Core i7-3770 (Ivy Bridge)
Intel Core i5-3570 (Ivy Bridge)
Intel Core i3-3220 (Ivy Bridge)
AVX2 list (incomplete, because most modern CPUs have it, so listing the older types):
Intel CPUs with AVX2:
Haswell Microarchitecture (2013):
Intel Core i7-4xxx series (e.g., i7-4770K)
Intel Core i5-4xxx series (e.g., i5-4670K)
Intel Core i3-4xxx series (e.g., i3-4150)
Intel Xeon E3 v3 series (e.g., E3-1230 v3)
Broadwell Microarchitecture (2014-2015):
Intel Core i7-5xxx series (e.g., i7-5775C)
Intel Core i5-5xxx series (e.g., i5-5675C)
Intel Core i3-5xxx series (e.g., i3-5157U)
Intel Xeon E3 v4 series (e.g., E3-1285 v4)
Skylake Microarchitecture (2015-2016):
Intel Core i7-6xxx series (e.g., i7-6700K)
Intel Core i5-6xxx series (e.g., i5-6600K)
Intel Core i3-6xxx series (e.g., i3-6100)
Intel Xeon E3 v5 series (e.g., E3-1270 v5)
Kaby Lake Microarchitecture (2016-2017):
Intel Core i7-7xxx series (e.g., i7-7700K)
Intel Core i5-7xxx series (e.g., i5-7600K)
Intel Core i3-7xxx series (e.g., i3-7100)
Intel Xeon E3 v6 series (e.g., E3-1275 v6)
AMD CPUs with AVX2:
Excavator Microarchitecture (2015):
AMD FX-8800P series
AMD A10-7890K series
AMD Athlon X4 845 series
Zen Microarchitecture (2017):
AMD Ryzen 7 1xxx series (e.g., Ryzen 7 1700)
AMD Ryzen 5 1xxx series (e.g., Ryzen 5 1600)
AMD Ryzen 3 1xxx series (e.g., Ryzen 3 1200)
AMD Ryzen Threadripper 19xx series (e.g., Threadripper 1950X)
AMD EPYC 7000 series (e.g., EPYC 7601)
So basically if you want AVX2 bare minimum:
Intel:
Haswell (4th Gen) and newer (Broadwell, Skylake, Kaby Lake, etc.)
Xeon E3 v3 and newer
AMD:
Excavator and newer (FX-8800P, A10-7890K, etc.)
Ryzen series (Zen architecture) and newer
Hopefully above is correct.
More importantly how many people are affected by this change? Hard to say without real telemetry (yeah I know it has negative connotation) and as said before a poll might be too narrow range of people.
Personally, on one side, getting rid of SSE4 reduces code debt, leaves less code to maintain, and brings future benefits for speed and accuracy. On the other side, there's probably still a sizeable portion of people in areas and countries where they can't easily get better hardware, and the spec requirements have even lowered compared to any other version, be it stable or nightly, from the last 10 years (and no, the DX9 renderer doesn't count when it doesn't render massive portions of stuff).
It won't personally affect me, even my old gaming laptop which was 960m + 4210H has AVX2. Compared to SSE2 dropping this is a bit harder decision. Thanks for listening and weighing the options. Keep in mind that next stable will still have it so it's not like SSE4 users will be tossed in lava.
Try a poll on the brazilian subchannel of PCSX2 on discord, that would give a nice picture of the situation here.
My apologies, I misremembered the conversation I had, the person I was discussing it with back in February said
A ryzen 3200G in here it's 150 USD
which is still daylight robbery, even when your economy isn't screwed.
Got it. I've just done some research here and the 3200G costs $85.6. The person you talked to either exaggerated or the price came down drastically in the last few months. I honestly thought this CPU wasn't even sold anymore (new in box, that is).
Try a poll on the brazilian subchannel of PCSX2 on discord, that would give a nice picture of the situation here.
that would be a horrible self selecting poll of a sub sample of people already probably against it.
Well, since this is meant to be a discussion, as a regular PCSX2 enjoyer myself, I'm also in favor of dropping SSE4 support in particular, because nowadays it would only be fairly old hardware where dropping support for it is an issue, and said hardware likely struggles to run PCSX2 adequately in the first place.
Matter of fact, I've even upgraded last year, from my previous 6th gen i5 coupled with a GTX 950, not to the latest and shiniest thing on the planet, but to a Ryzen 5 5600X coupled with a 6700XT, mainly for W11 support (but, in a surprising twist of fate, if all goes well, I now plan on ditching Windows in favor to Linux in the not so distant future following Microsoft's more than questionable choices it's made for Windows, for good. Thank God I don't depend on Windows software, and thank Valve for Proton. Good riddance).
I generally favor support for older hardware, provided that it is still being officially supported by the current operating systems, and as long as supporting it does not enforce compromises for more recent hardware that is more widespread and relevant, in addition to not blocking further optimizations and enhancements for software.
As far as I can see, now PCSX2 has reached the point where supporting something like SSE4 is actively blocking progress, so in that case, it probably makes the most sense to drop it, but not to also enforce AVX2 immediately. Mandating AVX might make more sense initially, and only then, if there's no significant backlash or a noticeable shrinking of PCSX2's user base, mandate AVX2 support.
The main benefit of moving to AVX/AVX2 is slightly better performance. But it seems like PCSX2's performance is already good enough. Ironically, the main group of people who would benefit from increased performance are those whose CPUs are too old to support AVX. So to me, the tradeoff of dropping SSE4 doesn't seem worth it.
So to me, the tradeoff of dropping SSE4 doesn't seem worth it.
It's not just performance, it's maintainability too, and the fact that we can't test something we don't have. It doesn't make sense to hurt maintainability for like 2% of CPUs, that aren't fast enough to run moderate games to begin with.
As it stands, based on what I've seen so far, it's looking like making AVX (not AVX2) the minimum is going to be the path going forward.
The main benefit of moving to AVX/AVX2 is slightly better performance. But it seems like PCSX2's performance is already good enough. Ironically, the main group of people who would benefit from increased performance are those whose CPUs are too old to support AVX. So to me, the tradeoff of dropping SSE4 doesn't seem worth it.
What is the point of emulation? Using modern equipment to emulate an old device. In this process, the coders must be happy. Their work is selfless. As free users who benefit from it, we cannot use substandard old hardware as an excuse to force coders to do things they don't want to do. As for users with old hardware, as you said, the existing PCSX2 is already good enough, and they can use the old version to meet their needs until they move to modern hardware.
So to me, the tradeoff of dropping SSE4 doesn't seem worth it.
It's not just performance, it's maintainability too, and the fact that we can't test something we don't have. It doesn't make sense to hurt maintainability for like 2% of CPUs, that aren't fast enough to run moderate games to begin with.
As it stands, based on what I've seen so far, it's looking like making AVX (not AVX2) the minimum is going to be the path going forward.
While I think AVX2 is the better way to go, no matter what choice is made, it'll be a decision that becomes less and less controversial with every day that goes by, as SSE4.1 CPUs become a smaller and smaller proportion of CPUs.
Dropping SSE4 after releasing a stable 2.0 release and requiring AVX as the minimum going forward seems reasonable enough; even my old AMD FX-8350 supports AVX.
Maybe the users that are still on SSE4-only CPUs should know where the stable 2.0 release window is set.
So to me, the tradeoff of dropping SSE4 doesn't seem worth it.
It's not just performance, it's maintainability too, and the fact that we can't test something we don't have. It doesn't make sense to hurt maintainability for like 2% of CPUs, that aren't fast enough to run moderate games to begin with.
As it stands, based on what I've seen so far, it's looking like making AVX (not AVX2) the minimum is going to be the path going forward.
This, exactly. I'm speaking from a programmer's standpoint (though not one particularly experienced with emulation) and maintainability is absolutely critical when you're working with projects like these.
CPUs without AVX are few and far between these days, but that's not the problem how I see it; the problem is that those without AVX are for the most part incapable of running PCSX2 anyways.
That comboed with the fact that nobody in their right mind would be using, let alone working on the emulator with hardware that's from before 2011 makes it essentially a no brainer to drop it in favor of devoting time to a faster and more testable and maintainable solution.
If SSE4 is retained, it's most certainly going to rot to the point of being broken due to nobody being able to test it nor actively work on it. That alone is reason enough to phase it out in favor of AVX, which unlike SSE4, actually has PLENTY of CPUs on it that can actually run PCSX2, and as well as that, are common enough nowadays to the point where it's not a big deal to get a hold of if it's absolutely needed to obtain a machine with AVX but not AVX2. You probably have one somewhere lying around if you've got a laptop or something from before 2013, or in the early-mid end of it.
From what I've seen of AVX2 as well, thanks to just being an extension to AVX rather than being a completely different thing like SSE4 is to AVX, it would be theoretically possible that both could be maintained relatively in direct tandem, bringing speed improvements on what can use AVX2 while still supporting and having the same code for the most part on AVX. Everyone wins here.
Ditching SSE4 for AVX prepares for the future, makes the code easier to maintain in the future, while still keeping the older machines that have any shot at actually running PCSX2 (like my own) able to do so. I think that's a pretty neat idea.
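To make the "AVX and AVX2 in tandem" idea concrete, here is a minimal sketch of one routine with an AVX path and an AVX2 path chosen at runtime. It is not PCSX2's multi-ISA code: `AddI32` and its helpers are hypothetical names, and the per-function `target` attribute and `__builtin_cpu_supports` are GCC/Clang features, so other compilers would need a different mechanism.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define SKETCH_HAVE_X86_SIMD 1

// 128-bit path. Because the function is compiled for AVX, the compiler emits
// the VEX-encoded (3-operand) form of the 128-bit integer add.
__attribute__((target("avx")))
static std::size_t AddI32_Avx(const std::int32_t* a, const std::int32_t* b,
                              std::int32_t* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        const __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(out + i), _mm_add_epi32(va, vb));
    }
    return i;  // number of elements processed
}

// 256-bit path. Same loop shape, but the integer add needs AVX2.
__attribute__((target("avx2")))
static std::size_t AddI32_Avx2(const std::int32_t* a, const std::int32_t* b,
                               std::int32_t* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        const __m256i va = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(a + i));
        const __m256i vb = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(b + i));
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(out + i), _mm256_add_epi32(va, vb));
    }
    return i;  // number of elements processed
}
#endif

// Adds two int32 arrays element-wise, using the widest path the CPU supports
// and finishing any leftover elements with scalar code.
void AddI32(const std::int32_t* a, const std::int32_t* b, std::int32_t* out, std::size_t n) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    std::size_t done = 0;
#ifdef SKETCH_HAVE_X86_SIMD
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        done = AddI32_Avx2(a, b, out, n);
    else if (__builtin_cpu_supports("avx"))
        done = AddI32_Avx(a, b, out, n);
#endif
    for (; done < n; ++done)
        out[done] = a[done] + b[done];
}
```

The two SIMD functions differ only in vector width and intrinsic prefix, which is the sense in which the paths could share most of their structure. A real codebase would more likely compile whole translation units per ISA than annotate individual functions.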
I've created this as an issue instead of a discussion for visibility.
Based on internal discussion, some of us think it's time to drop SSE4 support from PCSX2. Reasons being:
I can't really think of many upsides to keeping it around. For Windows, I'd be surprised if many people are running Win10 on hardware that is both fast enough for PCSX2, and does not support AVX2, same for Linux. On the Mac side, our minimum is 11.0, which rules out anything without AVX2.
If you still think we should maintain SSE4 going forward, this is the place to make your case. FWIW, I'm not proposing dropping it yet, at the very least, it would be after the next release version is tagged, so there will always be that version to fall back to.
PS: This is less about 256-bit vectors, and more about the additional instructions added with AVX2, 3-arg form, etc. 256-bit vectors provide zero benefit for the majority of code, and NEON is only 128-bit for Apple Silicon anyway. Thus, an alternative could be maintaining multi-ISA support, but making AVX the minimum, instead of AVX2. That would keep compatibility with Intel Sandy/Ivy bridge, and AMD cat cores.
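As a rough illustration of the "AVX as the minimum, AVX2 optional" alternative, the check could look something like the sketch below. The names `IsaLevel`, `MeetsMinimum`, `SelectCodePath`, and `DetectIsaLevel` are hypothetical and not taken from PCSX2, and `__builtin_cpu_supports` is a GCC/Clang builtin, so this assumes one of those compilers on x86.

```cpp
// Hypothetical ISA tiers, ordered from oldest to newest.
enum class IsaLevel { None = 0, SSE41 = 1, AVX = 2, AVX2 = 3 };

// True when a CPU at tier `cpu` satisfies a build that requires `minimum`.
constexpr bool MeetsMinimum(IsaLevel cpu, IsaLevel minimum) {
    return static_cast<int>(cpu) >= static_cast<int>(minimum);
}

// Chooses the code path to run: the CPU's own tier when it meets the minimum,
// or None when the CPU is too old for this build.
constexpr IsaLevel SelectCodePath(IsaLevel cpu, IsaLevel minimum) {
    return MeetsMinimum(cpu, minimum) ? cpu : IsaLevel::None;
}

// Queries the host CPU. Assumes GCC or Clang on x86; anywhere else it
// conservatively reports None.
inline IsaLevel DetectIsaLevel() {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return IsaLevel::AVX2;
    if (__builtin_cpu_supports("avx")) return IsaLevel::AVX;
    if (__builtin_cpu_supports("sse4.1")) return IsaLevel::SSE41;
#endif
    return IsaLevel::None;
}
```

With the minimum set to `IsaLevel::AVX`, a Sandy or Ivy Bridge CPU (AVX without AVX2) would select the AVX path, a Haswell or newer CPU would select AVX2, and an SSE4-only CPU would be rejected, e.g. `SelectCodePath(DetectIsaLevel(), IsaLevel::AVX)`.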