dankcushions opened this issue 7 years ago
I have several devices where GLES 2.0 still works correctly. My gut guess would be that it's the YUV shader changes that are breaking the RPI. A second possibility would be the shader "performance improvement" that we had a little while back.
Dunno if this adds any insight - but when loading, a message "INIT NOISE TEXTURES" appears, which did not appear before these glitches started happening.
Also - the issue started no earlier than this month (Nov 2017).
Quite recent. Have you tried to enable legacy blending?
@dankcushions do you have access to the GLideN64 log file? Do you see any errors there?
Have you tried to enable legacy blending?
I already had it enabled.
do you have access to the GLideN64 log file? Do you see any errors there?
I will check these out tonight!
A user on the RPi forum reports the following regarding the log file in association with this issue:
"the glide log shows these two lines constantly repeating:
Async color buffer copies are not supported on GLES2
LOD emulation not possible on this device"
I don't have my Raspberry Pi anymore but I'll try and get the GLES2 version running on my laptop to see if I can reproduce this tomorrow
@cptbananas those 2 warnings are not related to this issue; it's just warning about a couple of features that aren't supported on the Raspberry Pi
@loganmc10 I have tried it on a couple of GLES 2.0 devices and I didn't have this issue. This may be RPI specific - maybe specific to its GPU drivers.
Probably the case, the GPU drivers on the Raspberry Pi are garbage. On Linux I can actually disable certain extensions, so I'll try and "simulate" the RPI as closely as possible, but it may be a driver bug
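(On Mesa this can be done with the MESA_EXTENSION_OVERRIDE environment variable - e.g. launching with MESA_EXTENSION_OVERRIDE="-GL_OES_get_program_binary" should hide program-binary support; the exact list of extensions to strip to match the Pi is a guess on my part.)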
I have no problem with the vc4 Mesa driver. So either the firmware blob has a problem or a recent change is not compatible with VC hacks/macros. I will bisect if I have some time.
Ah OK, good to know - does RetroPie have plans to start using the Mesa driver?
As an option, but not yet.
If anyone wants to try to bisect this, here are 6 commits to check:
This seems to be the offending commit causing the plugin to fail on Pi: c5c0fbeb6fe3ea6b0de8c6353910dc6bdb038ec5
That's a good find. That would imply that the issue should be fixed if you disable shader storage. Have you tried that?
I've just checked mupen64plus.cfg, and EnableShadersStorage was already set to "False". If you were referring to something else, then I apologize.
Yeah, that's what I meant. It may take some debugging work on an actual device to see what's wrong.
It seems to work fine in other GLES 2.0 devices that don't support shader storage.
Looking at the bad commit, I see a possible inconsistency in the shader cache behaviour - it tries to load/save the cache depending only on whether the feature is supported by the hardware, without checking whether the current configuration actually wants the feature used. This patch fixes the segfault and doesn't show any graphical distortion when applied to c5c0fbeb6fe3ea6b0de8c6353910dc6bdb038ec5:
diff --git a/src/Graphics/OpenGLContext/GLSL/glsl_ShaderStorage.cpp b/src/Graphics/OpenGLContext/GLSL/glsl_ShaderStorage.cpp
index 821bd3b..58869b8 100644
--- a/src/Graphics/OpenGLContext/GLSL/glsl_ShaderStorage.cpp
+++ b/src/Graphics/OpenGLContext/GLSL/glsl_ShaderStorage.cpp
@@ -114,7 +114,7 @@ bool ShaderStorage::saveShadersStorage(const graphics::Combiners & _combiners) c
if (!_saveCombinerKeys(_combiners))
return false;
- if (!gfxContext.isSupported(graphics::SpecialFeatures::ShaderProgramBinary))
+ if (!CombinerInfo::get().isShaderCacheSupported())
// Shaders storage is not supported, but we saved combiners keys.
return true;
@@ -263,7 +263,7 @@ bool ShaderStorage::_loadFromCombinerKeys(graphics::Combiners & _combiners)
if (opengl::Utils::isGLError())
return false;
- if (gfxContext.isSupported(graphics::SpecialFeatures::ShaderProgramBinary))
+ if (CombinerInfo::get().isShaderCacheSupported())
// Restore shaders storage
return saveShadersStorage(_combiners);
@@ -273,7 +273,7 @@ bool ShaderStorage::_loadFromCombinerKeys(graphics::Combiners & _combiners)
bool ShaderStorage::loadShadersStorage(graphics::Combiners & _combiners)
{
- if (!gfxContext.isSupported(graphics::SpecialFeatures::ShaderProgramBinary))
+ if (!CombinerInfo::get().isShaderCacheSupported())
// Shaders storage is not supported, load from combiners keys.
return _loadFromCombinerKeys(_combiners);
However, applying this against the latest master commit doesn't resolve the issue of missing textures.
Update: the above issue with the shader cache may not be directly related to this issue (but perhaps needs to be fixed also?). That commit introduced a segmentation fault, not the texture corruption.
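For clarity, the intent of the patch above is that isShaderCacheSupported() combines the hardware capability with the user's setting. A minimal sketch of what I mean (the body below is a guess at the helper, not the actual GLideN64 code):

bool CombinerInfo::isShaderCacheSupported() const
{
	// Only touch the binary shader cache when the driver exposes
	// program binaries AND the user enabled shader storage in the
	// config (field name hypothetical):
	return gfxContext.isSupported(graphics::SpecialFeatures::ShaderProgramBinary)
		&& config.generalEmulation.enableShadersStorage != 0;
}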
This commit is actually what introduced the corrupted textures: 77ffe61. More debugging is needed, as it can't be cleanly reverted.
77ffe61 can't be what introduced the corrupted textures. It splits the 3-bit field gDPInfo::OtherMode::textureConvert into three 1-bit fields. gDPInfo::OtherMode::textureConvert was not used for rendering, and this commit does not change that situation.
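In struct terms, the split looks roughly like this (field names taken from later in this thread; exact bit order illustrative):

// before 77ffe61: one 3-bit field
unsigned int textureConvert : 3;
// after 77ffe61: three 1-bit fields
unsigned int convert_one : 1;
unsigned int bi_lerp1 : 1;
unsigned int bi_lerp0 : 1;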
The next commit, adc2b547, does use these bits. @fzurita tested it on GLES2 devices, and the commit was merged to master. adc2b547 introduces new shader code. This code includes conversions from float to int and back, which probably causes the problem on RPI. On the other hand, this new YUV_Convert code is rarely used; most games should never call it. You may try to force-disable all calls to it: in glsl_CombinerProgramBuilder.cpp, in ShaderFragmentReadTex0, change
" if (uBiLerp[0] != 0) \n"
" readtex0 = readTex(uTex0, vTexCoord0, uFbMonochrome[0], uFbFixedAlpha[0]); \n"
" else \n"
" readtex0 = YUV_Convert(uTex0, vTexCoord0, uTextureConvert[0], uTextureFormat[0], readtex0); \n"
to
" readtex0 = readTex(uTex0, vTexCoord0, uFbMonochrome[0], uFbFixedAlpha[0]); \n"
Do the same for ShaderFragmentReadTex1.
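For reference, the float-to-int-and-back pattern in question has this shape (an illustrative fragment in the style of the shader strings; variable names illustrative, but the vec4(iconvert)/255.0 line appears in the diffs below):

" mediump ivec4 iconvert = ivec4(color * 255.0); \n"  // float -> int truncation
" name = vec4(iconvert)/255.0; \n"  // int -> float; assumes the round trip is exact, which a GLES2 driver may not guarantee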
@gonetz
You're right - adc2b547e881d792da31575c8e1e502558b0f9e4 is the actual commit causing the problem. I misidentified it due to the shader cache from the previous build being loaded each time.
Unfortunately, your proposed change was not enough - there's no difference in the texture corruption after applying it, and I made sure to delete the shader cache before testing.
diff --git a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
index ae8c2e7..19d314d 100644
--- a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
+++ b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
@@ -1167,10 +1167,7 @@ public:
m_part =
" nCurrentTile = 0; \n"
" lowp vec4 readtex0; \n"
- " if (uBiLerp[0] != 0) \n"
- " readtex0 = readTex(uTex0, vTexCoord0, uFbMonochrome[0], uFbFixedAlpha[0]); \n"
- " else \n"
- " readtex0 = YUV_Convert(uTex0, vTexCoord0, uTextureConvert[0], uTextureFormat[0], readtex0); \n"
+ " readtex0 = readTex(uTex0, vTexCoord0, uFbMonochrome[0], uFbFixedAlpha[0]); \n"
;
} else {
if (config.video.multisampling > 0) {
@@ -1205,10 +1202,7 @@ public:
m_part =
" nCurrentTile = 1; \n"
" lowp vec4 readtex1; \n"
- " if (uBiLerp[1] != 0) \n"
- " readtex1 = readTex(uTex1, vTexCoord1, uFbMonochrome[1], uFbFixedAlpha[1]); \n"
- " else \n"
- " readtex1 = YUV_Convert(uTex1, vTexCoord1, uTextureConvert[1], uTextureFormat[1], readtex0); \n"
+ " readtex1 = readTex(uTex1, vTexCoord1, uFbMonochrome[1], uFbFixedAlpha[1]); \n"
;
} else {
if (config.video.multisampling > 0) {
I tried applying this change both at the head of this commit and against current master - it doesn't help in either case.
@psyke83 Hmm. The readTex function was also changed by this commit. Previously it used either hardware bilinear filtering or the filter3point shader; later I added a shader for bilinear filtering. Maybe that causes the issue. Try reverting readTex to its old state: in class ShaderReadtex : public ShaderPart
m_part = "lowp vec4 readTex(in sampler2D tex, in mediump vec2 texCoord, in lowp int fbMonochrome, in lowp int fbFixedAlpha) \n"
"{ \n"
" lowp vec4 texColor = texture2D(tex, texCoord); \n"
" if (fbMonochrome == 1) texColor = vec4(texColor.r); \n"
" else if (fbMonochrome == 2) \n"
" texColor.rgb = vec3(dot(vec3(0.2126, 0.7152, 0.0722), texColor.rgb)); \n"
" if (fbFixedAlpha == 1) texColor.a = 0.825; \n"
" return texColor; \n"
"} \n"
;
If TextureFilter is really the problem, you will get textures, but only unfiltered ones, because hardware bilinear filtering is disabled.
Unfortunately, I see no visual difference when readTex is changed - with or without the previous change that forces readTex instead of YUV_Convert.
P.S. I noticed the attribution in the comments of that commit to twinaphex; that's interesting, as I think that lr-mupen64plus may have similar graphical corruption when the gliden64 option is selected in the core options. I don't have it installed at the moment to confirm for certain, though.
@psyke83 OK.
If it helps - you need to carefully check what was changed in Textures.cpp and revert those changes step by step until you find what breaks texturing.
I won't have time to return to testing for a few hours, but so far, this is what gets textures to display when applied against adc2b54 or master:
diff --git a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
index 69ec065..39f00a3 100644
--- a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
+++ b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
@@ -947,7 +947,7 @@ public:
" name = vec4(iconvert)/255.0; \\\n"
" } \n"
;
- if (config.texture.bilinearMode == BILINEAR_3POINT) {
+ if (1) {
// 3 point texture filtering.
// Original author: ArthurCarvalho
// GLSL implementation: twinaphex, mupen64plus-libretro project.
@@ -1102,10 +1102,7 @@ public:
m_part =
" nCurrentTile = 0; \n"
" lowp vec4 readtex0; \n"
- " if (uBiLerp[0] != 0) \n"
" readtex0 = readTex(uTex0, vTexCoord0, uFbMonochrome[0], uFbFixedAlpha[0]); \n"
- " else \n"
- " readtex0 = YUV_Convert(uTex0, vTexCoord0, uTextureConvert[0], uTextureFormat[0], readtex0); \n"
;
} else {
if (config.video.multisampling > 0) {
@@ -1140,10 +1137,7 @@ public:
m_part =
" nCurrentTile = 1; \n"
" lowp vec4 readtex1; \n"
- " if (uBiLerp[1] != 0) \n"
" readtex1 = readTex(uTex1, vTexCoord1, uFbMonochrome[1], uFbFixedAlpha[1]); \n"
- " else \n"
- " readtex1 = YUV_Convert(uTex1, vTexCoord1, uTextureConvert[1], uTextureFormat[1], readtex0); \n"
;
} else {
if (config.video.multisampling > 0) {
@@ -1537,7 +1531,7 @@ public:
" return vec4(iconvert)/255.0; \n"
" } \n"
;
- if (config.texture.bilinearMode == BILINEAR_3POINT) {
+ if (1) {
m_part +=
"uniform mediump vec2 uTextureSize[2]; \n"
// 3 point texture filtering.
@@ -1591,9 +1585,7 @@ public:
m_part +=
"lowp vec4 readTex(in sampler2D tex, in mediump vec2 texCoord, in lowp int fbMonochrome, in lowp int fbFixedAlpha) \n"
"{ \n"
- " lowp vec4 texColor; \n"
- " if (uTextureFilterMode == 0) texColor = texture2D(tex, texCoord); \n"
- " else texColor = TextureFilter(tex, texCoord); \n"
+ " lowp vec4 texColor = texture2D(tex, texCoord); \n"
" if (fbMonochrome == 1) texColor = vec4(texColor.r); \n"
" else if (fbMonochrome == 2) \n"
" texColor.rgb = vec3(dot(vec3(0.2126, 0.7152, 0.0722), texColor.rgb)); \n"
Using TextureFilter in readTex() as you earlier suggested didn't work, and for some reason, the standard bilinear filtering code path needed to be forced, or else the texture problem persisted.
P.S. YUV_Convert does cause issues, but for the test case (Super Mario 64), it only seems to affect a few minor elements: the star transition effect, a missing "PRESS START", and incorrect shadow rendering on characters. The majority of textures do display if that code is left in.
Are you able to view GLSL compiler errors generated by the driver?
They should get printed out into gliden64.log I believe
Ah, I figured it printed debug to stdout. These lines are repeated in my log:
$ cat /home/pi/.local/share/mupen64plus/gliden64.log | sort | uniq
Async color buffer copies are not supported on GLES2
LOD emulation not possible on this device
OpenGL Error: invalid operation (502)Async color buffer copies are not supported on GLES2
ShaderCacheSupported?Async color buffer copies are not supported on GLES2
That should only get printed once each time you start up the plugin. GLideN64 doesn't clear the log file, so each time you start the plugin, an additional log will be appended.
After further experimenting, the texture issue can be worked around by setting 3-point bilinear filtering. Additionally, I'm seeing a segmentation fault if a cached combiner key file is loaded, so that has to be disabled:
bilinearMode = False
EnableShadersStorage = False
I didn't change anything from master, and the YUV_Convert path either isn't getting used with 3-point filtering, or the issues I noticed previously were fixed in commits after adc2b54. Maybe I'm crazy, but performance seems reduced compared to older builds that didn't have the texture issue.
I'd be happy to test any other suggestions to help get the standard bilinear filtering working properly (if possible).
"Maybe I'm crazy, but performance seems reduced compared to older builds that didn't have the texture issue." I changed those settings in my config file, and I just tested Majora's Mask and Ocarina of Time, and both games drop down to 1 FPS after the N64 startup screen. Is that just because of disabling shader storage, and isn't a real issue?
OK, YUV_Convert seems to be the culprit for Majora's Mask.
On latest master + my configuration above, it gets stuck on a black screen after the N64 logo (not even 1FPS). The following seems to resolve the problem, so YUV_Convert is definitely causing trouble.
diff --git a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
index ae8c2e7..9f61a39 100644
--- a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
+++ b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
@@ -1167,10 +1167,7 @@ public:
m_part =
" nCurrentTile = 0; \n"
" lowp vec4 readtex0;
\n"
- " if (uBiLerp[0] != 0) \n"
" readtex0 = readTex(uTex0, vTexCoord0, uFbMonochrome[0], uFbFixedAlpha[0]); \n"
- " else \n"
- " readtex0 = YUV_Convert(uTex0, vTexCoord0, uTextureConvert[0], uTextureFormat[0], readtex0); \n"
;
} else {
if (config.video.multisampling > 0) {
@@ -1205,10 +1202,7 @@ public:
m_part =
" nCurrentTile = 1; \n"
" lowp vec4 readtex1;
\n"
- " if (uBiLerp[1] != 0) \n"
" readtex1 = readTex(uTex1, vTexCoord1, uFbMonochrome[1], uFbFixedAlpha[1]); \n"
- " else \n"
- " readtex1 = YUV_Convert(uTex1, vTexCoord1, uTextureConvert[1], uTextureFormat[1], readtex0); \n"
;
} else {
if (config.video.multisampling > 0) {
Performance in Super Mario seems to improve, too.
I'm thinking that this is the culprit for this as well: #1676
fzurita "I'm thinking that this is the culprit for this as well:
1676"
On that note, on my Pi 3, Goldeneye's textures aren't filtered, but every other game I've tried since changing the config settings seems to be correctly filtered.
This looks like a bug in the compiler. The YUV_Convert shader is called only when the uBiLerp[0] or uBiLerp[1] uniform is zero. That is a very rare case: YUV textures are used by only a few games, and the Zeldas do not use them, so YUV_Convert must never be called for these games. I tested under a debugger; uBiLerp always contains only 1. I see the following possibility: bi_lerp0 and bi_lerp1 are bits in the gDP.otherMode bitfield, and convert_one is also a bitfield member. Probably the compiler makes a mistake when it passes them as integers to the uBiLerp.set() method. You may try the following:
const int bi_lerp0 = gDP.otherMode.bi_lerp0;
const int bi_lerp1 = gDP.otherMode.bi_lerp1;
uBiLerp.set(bi_lerp0, bi_lerp1, _force);
uTextureFormat.set(gSP.textureTile[0]->format, gSP.textureTile[1]->format, _force);
const int convert_one = gDP.otherMode.convert_one;
uTextureConvert.set(0, convert_one, _force);
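If it helps to see the suspicion in isolation, here is a hypothetical minimal repro of the pattern (setUniform stands in for uBiLerp.set()):

#include <cstdio>

struct OtherMode {
	unsigned int convert_one : 1;
	unsigned int bi_lerp1 : 1;
	unsigned int bi_lerp0 : 1;
};

static void setUniform(int v0, int v1) { std::printf("%d %d\n", v0, v1); }

int main()
{
	OtherMode om = {};
	om.bi_lerp0 = om.bi_lerp1 = 1;
	// Suspect form: setUniform(om.bi_lerp0, om.bi_lerp1);
	// Copying through a plain int forces a clean bitfield extraction
	// before the call, which is what the change above does:
	const int bi_lerp0 = om.bi_lerp0;
	const int bi_lerp1 = om.bi_lerp1;
	setUniform(bi_lerp0, bi_lerp1); // expected output: "1 1"
	return 0;
}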
Majora's Mask still gets stuck on a black screen with your suggested change. In case I screwed something up, here's the diff:
diff --git a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramUniformFactory.cpp b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramUniformFactory.cpp
index 46682ad..c670566 100644
--- a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramUniformFactory.cpp
+++ b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramUniformFactory.cpp
@@ -467,9 +467,12 @@ public:
if ((gSP.objRendermode&G_OBJRM_BILERP) != 0)
textureFilter |= 2;
uTextureFilterMode.set(textureFilter, _force);
- uBiLerp.set(gDP.otherMode.bi_lerp0, gDP.otherMode.bi_lerp1, _force);
+ const int bi_lerp0 = gDP.otherMode.bi_lerp0;
+ const int bi_lerp1 = gDP.otherMode.bi_lerp1;
+ uBiLerp.set(bi_lerp0, bi_lerp1, _force);
uTextureFormat.set(gSP.textureTile[0]->format, gSP.textureTile[1]->format, _force);
- uTextureConvert.set(0, gDP.otherMode.convert_one, _force);
+ const int convert_one = gDP.otherMode.convert_one;
+ uTextureConvert.set(0, convert_one, _force);
if (gDP.otherMode.bi_lerp0 == 0 || gDP.otherMode.bi_lerp1 == 0)
uConvertParams.set(gDP.convert.k0, gDP.convert.k1, gDP.convert.k2, gDP.convert.k3, _force);
}
I also tried changing the next line but it made no difference:
if (bi_lerp0 == 0 || bi_lerp1 == 0)
I just tested GLES2 on Linux and this is what I get:
shader_compile error: 0:128(20): preprocessor error: Error: macro TEX_OFFSET invoked with 3 arguments (expected 1)
0:129(20): preprocessor error: Error: macro TEX_OFFSET invoked with 3 arguments (expected 1)
0:131(20): preprocessor error: Error: macro TEX_OFFSET invoked with 3 arguments (expected 1)
0:132(20): preprocessor error: Error: macro TEX_OFFSET invoked with 3 arguments (expected 1)
This might fix your error (if not the bug itself)?
diff --git a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
index ae8c2e7..4fca73c 100644
--- a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
+++ b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
@@ -1629,7 +1629,7 @@ public:
m_part +=
// bilinear filtering.
"uniform mediump vec2 uTextureSize[2]; \n"
- "#define TEX_OFFSET(off) texture2D(tex, texCoord - (off)/texSize) \n"
+ "#define TEX_OFFSET(off, tex, texCoord) texture2D(tex, texCoord - (off)/texSize) \n"
"lowp vec4 TextureFilter(in sampler2D tex, in mediump vec2 texCoord) \n"
"{ \n"
" mediump vec2 texSize; \n"
I'll be able to test in a few hours.
Good news - the above patch fixes textures for standard bilinear filtering. Unfortunately, the black screen issue caused by YUV_Convert in Majora's Mask isn't fixed.
@loganmc10 @psyke83: the preprocessor error with the TEX_OFFSET macro is fixed: 6845116856
Yep, GLES2 on Linux looks good now - I have no issues with Majora's Mask either.
Thanks!
Your patch fixed standard bilinear filtering, but on Pi 3, Majora's Mask is still freezing after the logo due to the YUV_Convert, and the shader cache still causes segfaults.
What's interesting is that if I force YUV_Convert on, Majora's Mask works (albeit with incorrect colours, but that's expected), so the mixing of readTex and YUV_Convert must be causing issues? Also, this line looks odd:
" readtex1 = YUV_Convert(uTex1, vTexCoord1, uTextureConvert[1], uTextureFormat[1], readtex0); \n"
Should the last argument not be readtex1? Invocations for the YUVCONVERT_TEX macro do the same for tex1, so maybe I'm wrong.
@Metalwario64 check #1676. I just found that Adreno silently ignores my filtering shader and returns an unfiltered texture. Since current master uses only programmed filters, not the hardware one, devices with Adreno have no texture filtering regardless of config settings. I guess you have a similar situation on Pi 3. I don't know how to fix the shaders to make them work with Adreno and Pi.
@psyke83
Your patch fixed standard bilinear filtering, but on Pi 3, Majora's Mask is still freezing after the logo due to the YUV_Convert, and the shader cache still causes segfaults.
First - disable the shader cache for now. Second - Majora's Mask does not use YUV textures, so there must be no calls to YUV_Convert after the logo. I tested this with a debugger. Put a breakpoint or log message in UTextureFetchMode::update at this place:
if (gDP.otherMode.bi_lerp0 == 0 || gDP.otherMode.bi_lerp1 == 0)
uConvertParams.set(gDP.convert.k0, gDP.convert.k1, gDP.convert.k2, gDP.convert.k3, _force);
This if must never trigger; YUV_Convert is called only when gDP.otherMode.bi_lerp0 or gDP.otherMode.bi_lerp1 is zero.
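For example, a quick trace there could look like this (assuming the plugin's LOG macro - substitute whatever logging is available):

if (gDP.otherMode.bi_lerp0 == 0 || gDP.otherMode.bi_lerp1 == 0) {
	// This should never print during Majora's Mask gameplay:
	LOG(LOG_ERROR, "YUV path hit: bi_lerp0=%d bi_lerp1=%d\n",
		gDP.otherMode.bi_lerp0, gDP.otherMode.bi_lerp1);
	uConvertParams.set(gDP.convert.k0, gDP.convert.k1, gDP.convert.k2, gDP.convert.k3, _force);
}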
Should the last argument not be readtex1?
No. If gDP.otherMode.convert_one is set, YUV_Convert converts tex0, not tex1. Thus, readtex0 is passed to YUV_Convert for that case.
@psyke83 also, please check that RDRAMSize is set to 8MB after the RomOpen call. If it is set to 4MB you will get all kinds of issues.
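A quick sanity check (hypothetical placement, right after RomOpen returns, again assuming the LOG macro):

LOG(LOG_ERROR, "RDRAMSize = 0x%08x\n", RDRAMSize); // should correspond to 8MB, not 4MB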
@gonetz,
I'm not disputing your claim that Majora's Mask isn't using YUV textures. To be sure, I verified that gDP.otherMode.bi_lerp0 and gDP.otherMode.bi_lerp1 are never set to zero, but the game still freezes nevertheless.
What I'm saying is that the shader seems to be breaking due to the if/else condition on these lines:
" if (uBiLerp[0] != 0) \n"
" readtex0 = readTex(uTex0, vTexCoord0, uFbMonochrome[0], uFbFixedAlpha[0]); \n"
" else \n"
" readtex0 = YUV_Convert(uTex0, vTexCoord0, uTextureConvert[0], uTextureFormat[0], readtex0); \n"
" if (uBiLerp[1] != 0) \n"
" readtex1 = readTex(uTex1, vTexCoord1, uFbMonochrome[1], uFbFixedAlpha[1]); \n"
" else \n"
" readtex1 = YUV_Convert(uTex1, vTexCoord1, uTextureConvert[1], uTextureFormat[1], readtex0); \n"
It's possible to resolve the hang by modifying those lines - either removing the if/else entirely (as in my earlier diff) or forcing the YUV_Convert path unconditionally.
So the hang is not caused by readTex or YUV_Convert directly. Instead, it seems like the if/else evaluation of uBiLerp in the shader code is causing trouble. Maybe there's a bad implementation in the Broadcom shader compiler causing a syntax error?
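If the branch itself is what trips the compiler, a branchless select might confirm it - an untested sketch (it costs an extra texture fetch and assumes uBiLerp holds only 0 or 1, so this is a diagnostic, not a fix):

" lowp vec4 rt0 = readTex(uTex0, vTexCoord0, uFbMonochrome[0], uFbFixedAlpha[0]); \n"
" lowp vec4 yuv0 = YUV_Convert(uTex0, vTexCoord0, uTextureConvert[0], uTextureFormat[0], rt0); \n"
" readtex0 = mix(yuv0, rt0, float(uBiLerp[0])); \n"  // uBiLerp[0] != 0 selects the readTex result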
P.S. I've got DisableExtraMem = False set, so RDRAMSize should be OK...
It's possible to move the logic outside the shader. The YUV setting for the texture can then become an additional key for the shader.
@gonetz I have an idea. We can get rid of all the logic in shaders that doesn't have to be done at the per-pixel level. All that logic could be done externally, and we could have more shader keys.
That would mean that shaders would become simpler, so there's less chance of bad drivers having issues.
This couldn't be done for all shader logic though. For example, I don't think a depth test could be done outside the shader.
If you are too busy, I could give that a shot.
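A rough sketch of the idea (all names hypothetical):

// Bake per-draw flags into the program key so each compiled
// variant is branch-free:
enum ShaderKeyFlags : unsigned {
	KEY_YUV_CONVERT_T0 = 1u << 0, // emit YUV_Convert for tile 0
	KEY_YUV_CONVERT_T1 = 1u << 1, // emit YUV_Convert for tile 1
};

unsigned buildKeyFlags()
{
	unsigned flags = 0;
	if (gDP.otherMode.bi_lerp0 == 0) flags |= KEY_YUV_CONVERT_T0;
	if (gDP.otherMode.bi_lerp1 == 0) flags |= KEY_YUV_CONVERT_T1;
	return flags;
}

// Program lookup then uses (combinerKey, flags), and the generated
// fragment source contains either the readTex() call or the
// YUV_Convert() call for each tile - never an if/else on a uniform.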
I presume it's to do with the shader cache changes, but I haven't had a chance to bisect yet. Is this RPI specific, or all GLES2?