Closed mirh closed 3 years ago
There is a way to dump blending setup
https://github.com/PCSX2/pcsx2/blob/master/plugins/GSdx/GSRendererOGL.cpp#L675
Replace this line with if 1
Normally, tracing can be enabled with debug_opengl = 1 on a dev/dbg build. As a bonus, it will validate OpenGL function calls.
However, I'm not sure tracing works on Windows. @FlatOutPS2 @ssakash @turtleli did any of you manage to make OpenGL tracing usable on Windows?
I fixed it last year (cbd2417833104d3de42f3ca69ce038a1ffc52fb6), I haven't used the tracing stuff since January/February though so I'm not aware of the current state (I assume it still works).
@mirh You can use BlueScreenView to get basic info about the crash, such as the bugcheck code.
There is a similar issue with GT4: HW OpenGL, Blending Unit Accuracy set to none.
I didn't get a bsod but the display driver stops working.
There's all kinds of artifacts on the screen. The screen flickers and the display driver is restarted. After the display driver restarts pcsx2 doesn't respond and it needs to be closed from processes. GPU load stays at 100% even after closing pcsx2. The only solution to fix the gpu load is a system restart. An event log is written "Display driver amdkmdap stopped responding and has successfully recovered."
@lightningterror This seems to be exactly the same as my problem with Star Ocean First Departure and some other games in PPSSPP; maybe both trigger the same code in the AMD driver.
EDIT: Can you check this by unpacking this file to the PCSX2 directory and testing again? (This is the old AMD OpenGL driver from the 15.7 driver package.) http://www36.zippyshare.com/v/0NmGmAOF/file.html
So they fixed the SSO/dual blending issue. Spent 6 months on tests. Finally released it, and boom, the first test explodes the computer.
Did someone open a report with AMD, saying that using the dual blending unit crashes the whole system?
@mirh By the way, the commit that you found just disables accurate blending on some equations. Initially `Cs*As + Cd`, `Cs*F + Cd`, `Cd - Cs*As`, and `Cd - Cs*F` were always partially done in software, i.e. the multiplication was done in SW and the addition/subtraction in HW.
For example, `Cs*As + Cd`:
Before, the shader output `Cs*As` and the blending unit was set to `Cfrag + Cd`.
Now the shader outputs `Cs` and the blending unit is set to `Cfrag * Afrag + Cd` (note that `Afrag` comes from the 2nd source).
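A minimal numeric sketch of why the two setups should produce the same result for `Cs*As + Cd`. It assumes all values lie in [0, 1] and ignores clamping of intermediate values and fixed-point quantization. The function names are hypothetical and this is not GSdx code; in OpenGL terms the old setup would roughly correspond to blend factors `(GL_ONE, GL_ONE)` and the new one to `(GL_SRC1_ALPHA, GL_ONE)`.

```python
def blend_unit(frag_color, src_factor, dst_color):
    """Toy fixed-function blend: out = Cfrag * src_factor + Cd, clamped to [0, 1]."""
    out = frag_color * src_factor + dst_color
    return min(max(out, 0.0), 1.0)


def old_path(cs, a_s, cd):
    """Single-source blending: the shader does the multiplication itself."""
    frag_color = cs * a_s                      # shader outputs Cs*As
    return blend_unit(frag_color, 1.0, cd)     # blend unit: Cfrag + Cd


def new_path(cs, a_s, cd):
    """Dual-source blending: the shader outputs Cs, and As goes to the 2nd source."""
    frag_color = cs                            # shader outputs Cs
    frag_alpha_src1 = a_s                      # Afrag, taken from the 2nd source
    return blend_unit(frag_color, frag_alpha_src1, cd)  # Cfrag * Afrag + Cd


if __name__ == "__main__":
    # Arbitrary sample values chosen for illustration only.
    print(old_path(0.5, 0.25, 0.125), new_path(0.5, 0.25, 0.125))
```

Both paths compute the same sum; the only difference is whether the multiplication happens in the shader or in the blending unit, which is the part the driver has to get right.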
> Did someone open a report with AMD, saying that using the dual blending unit crashes the whole system?
I first wanted to have a trace they could use to reproduce the issue before opening a report, but I have no time atm.
It might not be easy to get a trace. Did you try to replay your gs dump?
Anyway, they will need 6 months to release a fix (potentially it is already fixed......).
> Did you try to replay your gs dump?
I don't know how I can do it :p
@gregory38 This crash seems not to be related to SSO at all; it's something else, because PPSSPP does not use SSO and crashes in the same way.
I agree with you that the bug is related to dual-source blending. However, they fixed their code to support dual-source blending with SSO. There is a huge probability that they introduced another bug/regression in the meantime.
Actually, with the commit mirh bisected, you can be sure the issue is dual-source blending, because the commit replaces some blending operations that used single-source blending (old code) with dual-source blending (new code) when you disable accurate blending. The goal was to reduce the load on the GPU.
If only I could have a testcase... Does the "gs dump player" (whatever it is and however it works) require a BIOS? EDIT: uh, MFW I find the tools/GSDumpGUI folder. Inb4 I'll be the first guy happy for a BSOD.
The player is only an exe that loads the GSdx.so file, so no BIOS. Technically the gs dump contains game textures & vertices, but I think a couple of frames can be seen as fair use. Honestly, I'm not even sure you need a testcase; you can report that several projects are broken. Maybe someone will be clever enough to detect that test quality on dual source is bad.
@FlatOutPS2 how do you replay on Windows? How do you update the ini option? Is it actually possible?
Yes, yes, I just tested it. I guess it will be quite fine.
> Honestly, I'm not even sure you need a testcase; you can report that several projects are broken.
It's just I was thinking that if we needed 7 months for something with sources and all, a dumb "closed" test would have been even less useful.
I can't really open the thread to check the report. Says access is restricted. Guess I can't see the staff comments on this.
They still have to approve it prolly.
Hey, I was testing it with Looney Tunes Space Race, but in my case, with an R9 290X and driver version 16.7.3 beta, I don't have this bug.
Try my testcase, then report back. Of course not every game performs the same calls.
@mirh I tried it and the issues are the same as with GT4. Now we wait 6-7 months for a fix.
No, the 6-7 month delay is only to deliver the fix. They first need to find the bug and then a solution. At least you can use accurate blending to reduce the crash likelihood.
OT, but w/e: just for the record, for a month now CodeXL has supported cross-platform frame analysis (aka seeing which functions are spending the most CPU or GPU time).
@mirh Tested the bsod package you posted on the AMD site. The TDR looks the same as in PPSSPP, so this seems to be the same issue.
"I can confirm we determined this to be a driver issue. Our GL driver team is now working on a fix."
At least they are working on a fix.
AMD fixed this. It will be available in the newest drivers.
> AMD fixed this. It will be available in the newest drivers.
Great, now all we have to do is wait 3 months. :p
> Great, now all we have to do is wait 3 months. :p

@dwitczak from AMD:
> We have fixed this issue internally. The bug should no longer reproduce in the next driver release, or the one that follows.
Seems so, or even longer...
Technically, you are at least sure that it will be integrated in the last release of an AMD driver (because none will follow) :stuck_out_tongue: Whoever can guess the release version that will include the fix gets the privilege of reporting the next issue ;)
As I said in the Vulkan rpcs3 issue, they said _the_ next release. Not _a_ future release. Or perhaps I'm just overanalyzing, I dunno.
> As I said in the Vulkan rpcs3 issue, they said _the_ next release. Not _a_ future release. Or perhaps I'm just overanalyzing, I dunno.
The quote is "The bug _should_ no longer reproduce in the next driver release, _or_ the one that follows.". Those two highlighted words give some room for it to be postponed.
Next release or that one that follows, kay. Which is like a week or two.
That _should_ on the other hand may just express "courtesy" then.
So right now they have 16.9.2 as the official release and 16.11.3 as the less official one (non-whql). What would "next" or "the one which follows" be? 16.11.4? 17.x?
Any one should just count I believe. Also, I think there's no distinction between official and beta release, as long as build number increases.
trivia: non-whql isn't a thing anymore if you want your driver to work with W10 anniversary.
> So right now they have 16.9.2 as the official release and 16.11.3 as the less official one (non-whql). What would "next" or "the one which follows" be? 16.11.4? 17.x?
They don't count the hotfixes as new driver releases. Unless they streamlined the process very recently, it'll take a couple of months before we see this fix. We just need to hope they don't introduce another bug in the meantime...
The DX12 POPCNT fix was added in the 16.10.2 hotfix, so they can add the OpenGL fix in a hotfix too. I think the GL_ARB_separate_shader_objects fix was also first added in a hotfix.
They can add all kinds of fixes to hotfix releases, but when they say the next driver release, they don't mean it will be included in the next hotfix release.
So instead of what it's not, can anyone say which version they mean by "next"?
It's a surprise for everybody
In that case, my guess is that "next" refers to 17.x.x and "the one which follows" would be 18.x.x. How many years did it take them to move from 15 to 16?
Their driver names are the date, i.e. 16.11 means November 2016, so 18.xx will be in 2018.
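As a small illustration of that naming scheme, here is a sketch that reads the year and month out of a version string. It assumes the `YY.M.N` layout described in the comment above (two-digit year, then month, then a per-month counter); the function name is hypothetical.

```python
def driver_release_month(version):
    """Return (year, month) for a version string such as '16.11.3'.

    Assumes the first field is a two-digit year and the second is the month.
    """
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError("expected at least 'YY.M', got %r" % version)
    year = 2000 + int(parts[0])
    month = int(parts[1])
    if not 1 <= month <= 12:
        raise ValueError("month out of range in %r" % version)
    return year, month


if __name__ == "__main__":
    # Version numbers mentioned earlier in this thread.
    for v in ("16.9.2", "16.11.3"):
        print(v, driver_release_month(v))
```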
> I think the GL_ARB_separate_shader_objects fix was also first added in a hotfix.
Well, 6 months for a hotfix, that's more than hot ;)
But for a kernel crash they will likely release it faster. So let's wait a month.
This year AMD has had many more driver releases, even 4 beta releases per month. Last year there was (mostly) one beta release per month plus a stable one every 3 months.
Better than damn Nvidia; I've been stuck on v372.70 cause they can't be bothered to do basic stability testing for their drivers now.
New driver came out, no fix. :(
I guess we must wait 6 months for it to be implemented :/ But if I were to guess, I'd say the next one should be the one.
Maybe someone should ask them what they mean when they say "next one"?
> Better than damn Nvidia; I've been stuck on v372.70 cause they can't be bothered to do basic stability testing for their drivers now.
Sure the BSOD of AMD is the definition of stability.
> Maybe someone should ask them what they mean when they say "next one"?

IMHO, they don't even know when the next driver will be released (hence the "or the one that follows"). Various branches and QA make it hard to predict. Honestly, I hope they will release it at least this year.
> IMHO, they don't even know when the next driver will be released (hence the "or the one that follows"). Various branches and QA make it hard to predict. Honestly, I hope they will release it at least this year.
Maybe, and maybe not. We should at least ask.
> Sure the BSOD of AMD is the definition of stability.

NVIDIA drivers have issues too: https://www.techpowerup.com/227881/users-report-multiple-issues-with-geforce-375-86-whql-drivers And a slightly older and more serious one (bricked GPUs): http://wccftech.com/nvidia-users-beware-latest-drivers-damage-pc/ So yep, I prefer a BSOD to a bricked GPU...
> IMHO, they don't even know when the next driver will be released (hence the "or the one that follows"). Various branches and QA make it hard to predict. Honestly, I hope they will release it at least this year.
Well, I'd guess by next driver or the one after they mean a driver version in the next month or the month after. But if AMD is still typical AMD it'll be the month after the month after or the month after that. :p
@Nucleoprotein Nvidia issues don't make AMD drivers more stable ;)
@FlatOutPS2 yes, I agree with you; that's why I wrote a couple of days ago:
> But for a kernel crash they will likely release it faster. So let's wait a month.
Follows #1508 and hrydgard/ppsspp#8698. Gs dump is here.
Bisected to either 16c2baa0df2d7859619d51d3995b78f057a8e965 or 29c97a9bf21a985e1524e0b428ff97aa678adcc4. Happens in Ace Combat 5 after the "press start" screen, only after Blending Unit Accuracy has been set to none in OGL hw.
I'd just complain over at AMD, but I'd like to get a more straightforward testcase for them. It wouldn't be bad if somebody added the "Upstream | External" label.
List of AMD issues:
Links to AMD forum issue threads: https://community.amd.com/message/2748362 https://community.amd.com/message/2756964
Possible BSOD Citra workaround, merged from issue #2362 as Gregory requested so we don't forget about it.
Citra has added a workaround for the amdfail driver that fixes the crashing caused by SSO. The commit is located here: https://github.com/citra-emu/citra/pull/3499/commits/0cf6793622b01f3941fbc77fe04c3b68476004ca
Reddit post: https://www.reddit.com/r/emulation/comments/88vva4/citra_on_twitter_new_update_to_the_hardware/
Idea would be for this to be checked out and maybe implemented.
Some useful info