Closed cmitu closed 2 years ago
Oh no, I really thought that change would be compatible enough, it worked on the lowliest of Android devices...
Would have expected some clearer error in the log though.. Might need to fix some error reporting there.
What does it look like from the standalone build?
Doesn't show much, to be honest. Here's a log from the standalone emulator on ce2995f95273d06c913526553489ac8d203752d7:
It doesn't appear to be a shader compilation error then - I just tried to break the syntax of the shader and we do report lots of errors in that case.
We are using values that are larger than 1.0 in the rgb2hsv/hsv2rgb functions, maybe it has very limited color precision in the fragment shader, or something...
I think it's a driver bug (or something in the Broadcom GLES implementation) that's broken.
The application hangs, it can still be stopped via Ctrl + C
or kill
, but any subsequent start of an application (ppsspp
or other app) that calls the GLES stack will also hang - only a reboot 'clears' the GPU state.
There's not much visibility into the driver (since it's closed source), but if I look into the allocated GPU memory with sudo vcdbg reloc
, I can see the shader related storage is still present after the application is forcibly stopped, so something on the GPU side is crashing/breaking and ppsspp
just hangs waiting for the GPU calls to return.
Sigh. This is why we can't have nice things.
If you reverse just this part of that commit: https://github.com/hrydgard/ppsspp/commit/ce2995f95273d06c913526553489ac8d203752d7#diff-45024aa56af7d1759a7126a2d809fad122fe5c86c71fe358b93081df8b969364R285-R421
Does it work? If yes, undo, and we'll focus on this section: https://github.com/hrydgard/ppsspp/commit/ce2995f95273d06c913526553489ac8d203752d7#diff-45024aa56af7d1759a7126a2d809fad122fe5c86c71fe358b93081df8b969364R287-R319
Try replacing these lines:
void main() {
gl_Position = WorldViewProj * vec4(Position, 1.0);
vec3 hsv = rgb2hsv(Color0.xyz);
hsv.x += TintSaturation.x;
hsv.y *= TintSaturation.y;
oColor0 = vec4(hsv2rgb(hsv), Color0.w);
oTexCoord0 = TexCoord0;
})",
With:
void main() {
gl_Position = WorldViewProj * vec4(Position, 1.0);
oColor0 = Color0;
oTexCoord0 = TexCoord0;
})",
This will also still undo the change, but it keeps the (now unused) functions. If that avoids the crash, then we've narrowed it down to the functions being used, not just parsed/compiled.
Next would be to try replacing that same part with:
void main() {
gl_Position = WorldViewProj * vec4(Position, 1.0);
vec3 hsv = rgb2hsv(Color0.xyz);
hsv.x += TintSaturation.x;
hsv.y *= TintSaturation.y;
oColor0 = vec4(hsv.xyz, Color0.w);
oTexCoord0 = TexCoord0;
})",
This will either also crash, or it will look kinda weird and discolored. The latter is a positive result, and possibly narrows it down to hsv2rgb()
. Either way, the next thing to try is:
void main() {
gl_Position = WorldViewProj * vec4(Position, 1.0);
vec3 hsv = Color0.xyz;
hsv.x += TintSaturation.x;
hsv.y *= TintSaturation.y;
oColor0 = vec4(hsv2rgb(hsv), Color0.w);
oTexCoord0 = TexCoord0;
})",
This should also be discolored if it doesn't crash, but in a different way.
Knowing which (or both or none) of the above cases crash would narrow it down to specific lines and help get this fixed.
-[Unknown]
If you reverse just this part of that commit: ce2995f#diff-45024aa56af7d1759a7126a2d809fad122fe5c86c71fe358b93081df8b969364R285-R421
Does it work?
Yes, reverting to the previous version of the shader works.
Try replacing these lines:
void main() { gl_Position = WorldViewProj * vec4(Position, 1.0); vec3 hsv = rgb2hsv(Color0.xyz); hsv.x += TintSaturation.x; hsv.y *= TintSaturation.y; oColor0 = vec4(hsv2rgb(hsv), Color0.w); oTexCoord0 = TexCoord0; })",
With:
void main() { gl_Position = WorldViewProj * vec4(Position, 1.0); oColor0 = Color0; oTexCoord0 = TexCoord0; })",
This will also still undo the change, but it keeps the (now unused) functions. If that avoids the crash, then we've narrowed it down to the functions being used, not just parsed/compiled.
This also works. Having the functions defined, but unused works fine.
Next would be to try replacing that same part with:
void main() { gl_Position = WorldViewProj * vec4(Position, 1.0); vec3 hsv = rgb2hsv(Color0.xyz); hsv.x += TintSaturation.x; hsv.y *= TintSaturation.y; oColor0 = vec4(hsv.xyz, Color0.w); oTexCoord0 = TexCoord0; })",
This will either also crash, or it will look kinda weird and discolored. The latter is a positive result, and possibly narrows it down to
hsv2rgb()
.
This leads to the crash - looks like the culprit may be the rgb2hsv
shader function.
I wonder what line of that function might be causing the issues, could you try to change it to these and see which one crash and which does not (you only need to change the code in the GLSL_1xx
shader).
1)
vec3 rgb2hsv(vec3 c) {
vec4 K = vec4(0.0, -1.0 / 3.0, 2.0 / 3.0, -1.0);
return K.rgb;
}
2)
vec3 rgb2hsv(vec3 c) {
vec4 K = vec4(0.0, -1.0 / 3.0, 2.0 / 3.0, -1.0);
vec4 p = mix(vec4(c.bg, K.wz), vec4(c.gb, K.xy), step(c.b, c.g));
return p.rgb;
}
3)
vec3 rgb2hsv(vec3 c) {
vec4 K = vec4(0.0, -1.0 / 3.0, 2.0 / 3.0, -1.0);
vec4 p = mix(vec4(c.bg, K.wz), vec4(c.gb, K.xy), step(c.b, c.g));
vec4 q = mix(vec4(p.xyw, c.r), vec4(c.r, p.yzx), step(p.x, c.r));
float d = q.x - min(q.w, q.y);
return vec3(d);
}
4)
vec3 rgb2hsv(vec3 c) {
vec4 K = vec4(0.0, -1.0 / 3.0, 2.0 / 3.0, -1.0);
vec4 p = mix(vec4(c.bg, K.wz), vec4(c.gb, K.xy), step(c.b, c.g));
vec4 q = mix(vec4(p.xyw, c.r), vec4(c.r, p.yzx), step(p.x, c.r));
float d = q.x - min(q.w, q.y);
float e = 1.0e-10;
return vec3(e);
}
If 2 is the first one crashing I wonder if this work: 5)
vec3 rgb2hsv(vec3 c) {
vec4 K = vec4(0.0, -1.0 / 3.0, 2.0 / 3.0, -1.0);
vec4 p = mix(vec4(c.b, c.g, K.w, K.z), vec4(c.g, c.b, K.x, K.y), step(c.b, c.g));
vec4 q = mix(vec4(p.x, p.y. p.w, c.r), vec4(c.r, p.y, p.z. p.x), step(p.x, c.r));
float d = q.x - min(q.w, q.y);
float e = 1.0e-10;
return vec3(abs(q.z + (q.w - q.y) / (6.0 * d + e)), d / (q.x + e), q.x);
}
If 4 is the first one crashing try this (maybe even try 0.01
):
6)
vec3 rgb2hsv(vec3 c) {
vec4 K = vec4(0.0, -1.0 / 3.0, 2.0 / 3.0, -1.0);
vec4 p = mix(vec4(c.bg, K.wz), vec4(c.gb, K.xy), step(c.b, c.g));
vec4 q = mix(vec4(p.xyw, c.r), vec4(c.r, p.yzx), step(p.x, c.r));
float d = q.x - min(q.w, q.y);
float e = 0.0000000001;
return vec3(abs(q.z + (q.w - q.y) / (6.0 * d + e)), d / (q.x + e), q.x);
}
@iota97 thank you for pushing me further to test.
I've started with 1.
(which although didn't produce any image, didn't get the GPU stuck), but quickly got bitten by the original problem with 2.
test.
The only extra info I've got from the VC logs is an assert triggered in the firmware driver, but it's completely opaque:
147731.732: assert( *input_detail.op == ~0 ) failed; ../../../../../middleware/khronos/glsl/2708/glsl_allocator_4.c::commit_input_choice line 1384 rev 71bd310
vcdbg_ctx_get_dump_stack: dump_stack failed
----------------
It may be that the shader itself is ok, but there's some issues when it's called multiple times (?).
FWIW I've re-compiled ppsspp
on the the current RasPI OS release (bullseye), which deprecated the old/legacy Broadcom firmware based drivers and uses the Mesa VC4 driver (open source). The error is not present there, so only the users of the previous RaspiOS/Raspbian (buster) would be affected.
Would be great if you could make a few more test:
1) This should be as easy as it gets (if this crash I have no idea tbh)
vec3 rgb2hsv(vec3 c) {
vec4 K = vec4(0.0, -1.0 / 3.0, 2.0 / 3.0, -1.0);
vec4 p = vec4(c.b, c.g, K.w, K.z);
if (c.g >= c.b)
p = vec4(c.g, c.b, K.x, K.y);
vec4 q = vec4(p.x, p.y, p.w, c.r);
if (c.r >= p.x)
q = vec4(c.r, p.y, p.z, p.x);
float d = q.x - min(q.w, q.y);
float e = 1.0e-10;
return vec3(abs(q.z + (q.w - q.y) / (6.0 * d + e)), d / (q.x + e), q.x);
}
2) Same as before, less ugly (might be a bit slower than main, but could be used in the end).
vec3 rgb2hsv(vec3 c) {
vec4 K = vec4(0.0, -1.0 / 3.0, 2.0 / 3.0, -1.0);
vec4 p = (c.g < c.b) ? vec4(c.b, c.g, K.w, K.z) : vec4(c.g, c.b, K.x, K.y);
vec4 q = (c.r < p.x) ? vec4(p.x, p.y, p.w, c.r) : vec4(c.r, p.y, p.z, p.x);
float d = q.x - min(q.w, q.y);
float e = 1.0e-10;
return vec3(abs(q.z + (q.w - q.y) / (6.0 * d + e)), d / (q.x + e), q.x);
}
3) If this crash the problem is with mix/step, if not is with the vec4 constructions using others vec (I made a silly typo in 5.
on previous post, this is a bit different, sorry...)
vec3 rgb2hsv(vec3 c) {
vec4 K = vec4(0.0, -1.0 / 3.0, 2.0 / 3.0, -1.0);
vec4 p = mix(vec4(c.b, c.g, K.w, K.z), vec4(c.g, c.b, K.x, K.y), step(c.b, c.g));
vec4 q = mix(vec4(p.x, p.y, p.w, c.r), vec4(c.r, p.y, p.z, p.x), step(p.x, c.r));
float d = q.x - min(q.w, q.y);
float e = 1.0e-10;
return vec3(abs(q.z + (q.w - q.y) / (6.0 * d + e)), d / (q.x + e), q.x);
}
Would be great if you could make a few more test:
Unfortunately, none of the 3 variants seems to make a difference - same behavior with ppsspp
getting stuck because the GPU stall.
In the worst case, we could disable this feature when using a Broadcom driver and use shaders without the hue/tint stuff.
What if you use this version? https://github.com/sebastien-lempens/webgl-boxing-gym/blob/3ea075847dcf0de9fcb099931db91e9ca40a52ef/assets/shaders/color/space/rgb2hsv.glsl#L10
Or this version: https://github.com/djulien/FriendsWithGpu-WS281X-DEV/blob/1a0b8396dc49837f306c71ae335ca344148147ba/demos/shared/vertex.glsl#L198-L210 You'll also need these lines above it: https://github.com/djulien/FriendsWithGpu-WS281X-DEV/blob/1a0b8396dc49837f306c71ae335ca344148147ba/demos/shared/shared.glsl#L75 https://github.com/djulien/FriendsWithGpu-WS281X-DEV/blob/1a0b8396dc49837f306c71ae335ca344148147ba/demos/shared/shared.glsl#L102 https://github.com/djulien/FriendsWithGpu-WS281X-DEV/blob/1a0b8396dc49837f306c71ae335ca344148147ba/demos/shared/shared.glsl#L73 https://github.com/djulien/FriendsWithGpu-WS281X-DEV/blob/1a0b8396dc49837f306c71ae335ca344148147ba/demos/shared/shared.glsl#L86
I didn't look at licensing, just theorizing that it'd be pretty weird for us to be the first people to try to convert RGB to HSV within GLSL on a Broadcom GPU.
-[Unknown]
@unknownbrackets seems like the 2nd version doesn't crash.
First variant is very similar to what is currently used and I think I did try it during the 1st set of tests (the original from http://lolengine.net/blog/2013/07/27/rgb-to-hsv-in-glsl doesn't seem to be online anymore, though it's referenced in https://stackoverflow.com/questions/15095909/from-rgb-to-hsv-in-opengl-glsl by its author).
Thanks for going again over this - 'worst' case scenario would have been to just add some info to the next release notes advising users of the old Broacom GLES drivers of the issue. Since the new RaspiOS uses the MESA VC4 drivers, they'd be less and less users using those drivers and thus experiencing this issue.
I guess using a form of the 2nd variant you referenced may help some Pi3 users that are still using RaspiOS/Raspbian Buster and similar Pi Linux distros.
@cmitu Got a used rasp 3b on order, to be able to test this. Can you give me some links to an exact RaspiOS version that this will reproduce on?
Raspbian (Buster) version, from https://downloads.raspberrypi.org/raspios_oldstable_lite_armhf_latest.
OK, I struggled getting a desktop up on Lite so installed the full Buster image on an SD card. Then I followed our own instructions to the letter, things build but fail to link with the following, which surprised me:
[100%] Linking CXX executable PPSSPPSDL
/usr/bin/ld: lib/libCommon.a(VulkanRenderManager.cpp.o): in function `VKRGraphicsPipeline::Create(VulkanContext*)':
VulkanRenderManager.cpp:(.text+0x88): undefined reference to `__atomic_store_8'
/usr/bin/ld: VulkanRenderManager.cpp:(.text+0xbc): undefined reference to `__atomic_store_8'
/usr/bin/ld: VulkanRenderManager.cpp:(.text+0x100): undefined reference to `__atomic_store_8'
/usr/bin/ld: lib/libCommon.a(VulkanRenderManager.cpp.o): in function `VKRComputePipeline::Create(VulkanContext*)':
VulkanRenderManager.cpp:(.text+0x194): undefined reference to `__atomic_store_8'
/usr/bin/ld: VulkanRenderManager.cpp:(.text+0x1c8): undefined reference to `__atomic_store_8'
/usr/bin/ld: lib/libCore.a(PipelineManagerVulkan.cpp.o): in function `PipelineManagerVulkan::Clear()':
PipelineManagerVulkan.cpp:(.text+0x6ac): undefined reference to `__atomic_load_8'
/usr/bin/ld: lib/libCore.a(PipelineManagerVulkan.cpp.o): in function `PipelineManagerVulkan::GetOrCreatePipeline(VulkanRenderManager*, unsigned long long, unsigned long long, VulkanPipelineRasterStateKey const&, DecVtxFormat const*, VulkanVertexShader*, VulkanFragmentShader*, bool)':
How did you get past that @cmitu ?
OK that was easy, just had to link with -latomic . Will update the CMakeLists.txt.
But, it does work now, albeit slowly. Not able to reproduce the shader error :/
I might make another attempt with your linked image later, but it should be the same, right? How can I check the driver version?
Oh haha never mind, turns out I'm using llvmpipe, a software renderer. No wonder it's slow. Now the question is how I get the real OpenGL driver to work...
The real OpenGL driver works too. I enabled it by running raspi-config and picking Advanced/GL Driver/G3 GL (Full KMS).
I'm getting some interesting output though, we are picking OpenGL instead of OpenGL ES it seems:
ERROR: EGL Error EGL_BAD_ACCESS detected in file /home/pi/ppsspp/SDL/SDLGLGraphicsContext.cpp at line 269 (0x3002)
EGL ERROR: Unable to make GLES context current.
EGL_Init() failed
OpenGL 2.0 or higher.
So seems we tried GLES but fell "back" to desktop OpenGL.
In tools / System Information in PPSSPP, the driver shows up as:
VENDOR_BROADCOM VC4 V3D 2.1
So @cmitu , to debug this, I need exact instructions from you, how you reproduced it using that image I guess. What GPU driver you chose, are you running on a desktop, all the details from an empty image to repro. I'm ready with my raspberry 3b.
Thanks in advance!
My build steps:
git clone --depth 1 --recursive https://github.com/hrydgard/ppsspp
mkdir -p ppsspp/build && cd ppsspp/build
cmake -DCMAKE_TOOLCHAIN_FILE=cmake/Toolchains/raspberry.armv7.cmake -DARM_NO_VULKAN=ON ..
You'd need to the usual dependencies installed (libsdl2-dev libsnappy-dev libzip-dev zlib1g-dev libraspberrypi-dev
).
Check that - after building - PPSSPPSDL
is linked with the VC4 legacy libs:
$ ldd PPSSPPSDL | grep -i gles
# should output:
libbrcmGLESv2.so => /opt/vc/lib/libbrcmGLESv2.so (0x76d1b000)
A few other things:
PPSSPPSDL
from the console (i.e. no desktop installed), running from a SSH session works also.libsdl2
library I use (2.0.10) is built with RPI video driver support - I think it's enabled in the RasPI OS 'Buster' package, but you can build SDL2 with --enable-video-rpi
to make sure it's included.raspi-config
, this enables the MESA VC4 driver (which doesn't have this issue) and renders the legacy VC4 BRCM GLES drivers unusable. You can comment out the dtoverlay=vc4-kms-v3d
int /boot/config.txt
line to disable it or just use raspi-config
.sudo vcdb reloc
- it should show a lot of memory allocations when it does. If you intent to modify/compile to retest is better to reboot before running PPSSPPSDL
again to make sure the allocations are no longer present.I remember I had the same issue during linking (missing the atomic
lib ref), but I think I've used LDFLAGS
to set it. Building from the latest master doesn't show the error anymore.
Thanks for the steps and details! I will try this tomorrow. Yeah, I've fixed the atomic thing.
Oh yeah, I also have a Raspberry Pi 4 setup now, in case there are issues with those :)
OK, I tried the above on a fresh non-desktop image. The LDD check passes, but it hangs hard when running, as reported.
And indeed, if I hack out the HSV color shift, things do run, although for some reason text seems to be broken.
Anyway, thanks - that's enough of a repro to get started debugging. Though I somewhat kinda question the value of supporting such backwards drivers in the first place... And super frustrating to test since the raspberry pi just hangs when it doesn't like the shader.
Also not sure where to get the vcdb tool
It just gets more and more baffling, the slightest change takes it from running to crashing. And every time it crashes, reboot. Not much pattern in what changes it will accept, either, some complicated stuff is fine but absolutely trivial stuff is not. Maybe some exception during shader execution due to some values, but would be highly unusual.
I'm giving up for today, and I'll probably end up just providing a version of the shader that doesn't support tint for this driver. Very annoying and totally bizarre behavior :(
Also not sure where to get the vcdb tool
Sorry, that's my fault, the correct command is sudo vcdbg reloc
(part of the libraspberrypi-bin
package).
I'm giving up for today, and I'll probably end up just providing a version of the shader that doesn't support tint for this driver. Very annoying and totally bizarre behavior :(
I wouldn't mind just documenting this behaviour for the new release and leave the shader as is - the new RaspiOS version doesn't have this bizarre bug and it should work better (GLES-wise) than the old/legacy driver anyway. We could add a patch for old Raspbian Buster users in RetroPie to disable the shader, until we'll migrate to the Bullseye release.
Thank you for the support.
Are there any statistics or metrics on how many people are running which versions? How easy is it to update?
I'd worry if a bunch of people are unknowingly going to run into this issue when they update PPSSPP and maybe have trouble updating the OS. But if it's easy or people have already started upgrading, then maybe it's a problem that's fixing itself.
-[Unknown]
I can only speak about how things are work right now in RetroPie.
Are there any statistics or metrics on how many people are running which versions?
We (RetroPie) only support Buster for the time being. We pinned the version we distribute to v1.12.3 now, so users will not be impacted by a new PPSSPP version until we also validate and change the version we distribute. Any upgrade done by users right now will get them v1.12.3, but we don't know which (older) versions users may have.
How easy is it to update?
Speaking about Raspbian/RaspiOS, they don't officially support a rolling upgrade, so users will need to upgrade manually with a new image. Not sure about other RPI supporting distros (Ubuntu/Arch/Manjaro).
For RetroPie (which is built upon the Lite version of RaspiOS), updating PPSSPP
is done using the RetroPie menus and from the version we pin, so they should get a compatible/validated version depending on which OS/GPU drivers they use.
HTH
OK. Sounds like I'm gonna do the workaround then.
Still this absolutely baffles me, if the shader compiler is that unstable, how are any games running at all..
Now it doesn't crash anymore. Targeting further work to the next version, though might end up doing some before anyway.
Issue is resolved after the commits from #15615.
Apologies for taking so long to close the issue and thank you again for your support.
Game or games this happens in
None
What area of the game / PPSSPP
Starting a game, there's nothing displayed on the screen and the rendering isn't initialised.
A few RetroPie users reported an issue on the RPI 3B model [1], using Raspbian Buster and the (old/legacy) Broadcom Videocore GLES drivers. The issue (black screen/freeze and emulator not starting) happens with both the standalone PPSSPP emulator or the libretro core.
Tried a
git bisect
, which indicated ce2995f95273d06c913526553489ac8d203752d7 is the culprit.[1] https://retropie.org.uk/forum/topic/32423
What should happen
PPSSPP should start.
Logs
Verbose log from the libretro core with ce2995f95273d06c913526553489ac8d203752d7:
Platform
Raspberry Pi
Mobile phone model or graphics card
Raspberry Pi 3B, using a Broadcom VideoCore IV (VC4)
PPSSPP version affected
v1.12.3-947-gce2995f95 (commit ce2995f95273d)
Last working version
Commit e88ddbb42a8a8a869991
Graphics backend (3D API)
OpenGL / GLES
Checklist