gonetz / GLideN64

A new generation, open-source graphics plugin for N64 emulators.
Other
780 stars 181 forks source link

No textures and depth issues on Raspberry Pi 3 (GLES 2) #1665

Open dankcushions opened 7 years ago

dankcushions commented 7 years ago

I presume it's to do with the shader cache changes but haven't had a chance to bisect yet. Is this RPI specific or all GLES2?

fzurita commented 7 years ago

I have several devices where GLES 2.0 still works correctly. My gut guess would be that it's the YUV shader changes that are breaking the RPI. Second possibility would be the shader "performance improvement" that we had a little while back.

cptbananas commented 7 years ago

Dunno if this adds any insight - but when loading, a message "INIT NOISE TEXTURES" appears, which did not appear before these glitches started happening.

also - issue started no earlier than this month (nov 2017)

gonetz commented 7 years ago

Quite recent. Have you tried to enable legacy blending?

fzurita commented 7 years ago

@dankcushions do you have access to the GLideN64 log file? Do you see any errors there?

dankcushions commented 7 years ago

Have you tried to enable legacy blending?

i already had it enabled.

do you have access to the GLideN64 log file? Do you see any errors there?

i will check these out tonight!

cptbananas commented 7 years ago

A user on rpi forum asserts the following regarding log file in association with this issue:

"the glide log shows these two lines constantly repeating: Async color buffer copies are not supported on GLES2 LOD emulation not possible on this device"

loganmc10 commented 7 years ago

I don't have my Raspberry Pi anymore but I'll try and get the GLES2 version running on my laptop to see if I can reproduce this tomorrow

@cptbananas those 2 warnings are not related to this issue, it is just warning about a couple features that aren't supported on the Raspberry Pi

fzurita commented 7 years ago

@loganmc10 I have tried it in a couple GLES 2.0 devices and I didn't have this issue. This may be RPI specific, maybe their GPU drivers specific.

loganmc10 commented 7 years ago

Probably the case, the GPU drivers on the Raspberry Pi are garbage. On Linux I can actually disable certain extensions so I'll try and "simulate" the RPI as close as possible, but it may be a driver bug

gizmo98 commented 7 years ago

I have no problem with vc4 mesa driver. So the vc4 firmware blob has a problem or a recent change is not compatible with VC hacks/macros. I will bisect if i have some time.

loganmc10 commented 7 years ago

Ah ok good to know, does RetroPie have plans to start using the Mesa driver?

joolswills commented 7 years ago

As an option, but not yet.

johnmanko commented 7 years ago

https://github.com/RetroPie/RetroPie-Setup/issues/2207

fzurita commented 7 years ago

If anyone wants to try to bisect this, here are 6 commits to check:

psyke83 commented 7 years ago

This seems be be the offending commit causing the plugin to fail on Pi: c5c0fbeb6fe3ea6b0de8c6353910dc6bdb038ec5

fzurita commented 7 years ago

That's a good find. That would imply that the issue should be fixed if you disable shader storage. Have you tried that?

Metalwario64 commented 7 years ago

I've just checked mupen64plus.cfg, and EnableShadersStorage was already set to "False". If you were referring to something else, then I apologize.

fzurita commented 7 years ago

Yeah, that's what I meant. It may take some debugging work an actual device to see what's wrong.

It seems to work fine in other GLES 2.0 devices that don't support shader storage.

psyke83 commented 7 years ago

Looking at bad commit, I see a possible inconsistency in shader cache behaviour - it tries to load/save the cache depending on whether the feature is supported, without looking at whether we want the feature to be used via the current configuration. This patch fixes the segfault and doesn't show any graphical distortion when applied to c5c0fbeb6fe3ea6b0de8c6353910dc6bdb038ec5:

diff --git a/src/Graphics/OpenGLContext/GLSL/glsl_ShaderStorage.cpp b/src/Graphics/OpenGLContext/GLSL/glsl_ShaderStorage.cpp
index 821bd3b..58869b8 100644
--- a/src/Graphics/OpenGLContext/GLSL/glsl_ShaderStorage.cpp
+++ b/src/Graphics/OpenGLContext/GLSL/glsl_ShaderStorage.cpp
@@ -114,7 +114,7 @@ bool ShaderStorage::saveShadersStorage(const graphics::Combiners & _combiners) c
        if (!_saveCombinerKeys(_combiners))
                return false;

-       if (!gfxContext.isSupported(graphics::SpecialFeatures::ShaderProgramBinary))
+       if (!CombinerInfo::get().isShaderCacheSupported())
                // Shaders storage is not supported, but we saved combiners keys.
                return true;

@@ -263,7 +263,7 @@ bool ShaderStorage::_loadFromCombinerKeys(graphics::Combiners & _combiners)
        if (opengl::Utils::isGLError())
                return false;

-       if (gfxContext.isSupported(graphics::SpecialFeatures::ShaderProgramBinary))
+       if (CombinerInfo::get().isShaderCacheSupported())
                // Restore shaders storage
                return saveShadersStorage(_combiners);

@@ -273,7 +273,7 @@ bool ShaderStorage::_loadFromCombinerKeys(graphics::Combiners & _combiners)

 bool ShaderStorage::loadShadersStorage(graphics::Combiners & _combiners)
 {
-       if (!gfxContext.isSupported(graphics::SpecialFeatures::ShaderProgramBinary))
+       if (!CombinerInfo::get().isShaderCacheSupported())
                // Shaders storage is not supported, load from combiners keys.
                return _loadFromCombinerKeys(_combiners);

However, applying this against the latest master commit doesn't resolve the issue of missing textures.

psyke83 commented 7 years ago

Update: the above issue with the shader cache may not be directly related to this issue (but perhaps needs to be fixed also?). That commit introduced a segmentation fault.

This commit is actually what introduced corrupted textures: 77ffe61 More debugging is needed, as it can't be cleanly reverted.

gonetz commented 7 years ago

77ffe61 Can't introduce corrupted textures. It splits gDPInfo::OtherMode::textureConvert 3bit field on 3 1bit fields. gDPInfo::OtherMode::textureConvert was not used for rendering, and this commit does not change the situation.

Next commit, adc2b547 do use these bits. @fzurita tested in on GLES2 devices, and this commit was included to master. adc2b547 introduces new shader code. This code includes conversions from float to int and back and probably it causes problem on RPI. From the other side, this new code YUV_Convert is rarely used. Most of games should never call it. You may try to force disable any calls of it: in glsl_CombinerProgramBuilder.cpp, in ShaderFragmentReadTex0 change

"  if (uBiLerp[0] != 0)                                                             \n"
"    readtex0 = readTex(uTex0, vTexCoord0, uFbMonochrome[0], uFbFixedAlpha[0]);     \n"
"  else                                                                     \n"
"    readtex0 = YUV_Convert(uTex0, vTexCoord0, uTextureConvert[0], uTextureFormat[0], readtex0);    \n"

by " readtex0 = readTex(uTex0, vTexCoord0, uFbMonochrome[0], uFbFixedAlpha[0]); \n"

Do the same for ShaderFragmentReadTex1

psyke83 commented 7 years ago

@gonetz

You're right - adc2b547e881d792da31575c8e1e502558b0f9e4 is the actual commit causing the problem. I misidentified it due to the shader cache from the previous build being loaded each time.

Unfortunately, your proposed change was not enough to fix the texture corruption - there's no change to the texture corruption after applying this change - and I made sure to delete the shader cache before testing.

diff --git a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
index ae8c2e7..19d314d 100644
--- a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
+++ b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
@@ -1167,10 +1167,7 @@ public:
                        m_part =
                                "  nCurrentTile = 0; \n"
                                "  lowp vec4 readtex0;                                                                                                                          \n"
-                               "  if (uBiLerp[0] != 0)                                                                                                                         \n"
-                               "    readtex0 = readTex(uTex0, vTexCoord0, uFbMonochrome[0], uFbFixedAlpha[0]);         \n"
-                               "  else                                                                                                                                                         \n"
-                               "    readtex0 = YUV_Convert(uTex0, vTexCoord0, uTextureConvert[0], uTextureFormat[0], readtex0);        \n"
+                               " readtex0 = readTex(uTex0, vTexCoord0, uFbMonochrome[0], uFbFixedAlpha[0]); \n"
                                ;
                } else {
                        if (config.video.multisampling > 0) {
@@ -1205,10 +1202,7 @@ public:
                        m_part =
                                "  nCurrentTile = 1; \n"
                                "  lowp vec4 readtex1;                                                                                                                          \n"
-                               "  if (uBiLerp[1] != 0)                                                                                                                         \n"
-                               "    readtex1 = readTex(uTex1, vTexCoord1, uFbMonochrome[1], uFbFixedAlpha[1]);         \n"
-                               "  else                                                                                                                                                         \n"
-                               "    readtex1 = YUV_Convert(uTex1, vTexCoord1, uTextureConvert[1], uTextureFormat[1], readtex0);        \n"
+                               " readtex1 = readTex(uTex1, vTexCoord1, uFbMonochrome[1], uFbFixedAlpha[1]); \n"
                                ;
                } else {
                        if (config.video.multisampling > 0) {

I tried applying this change against the head of this commit, and current master - doesn't help in either case.

gonetz commented 7 years ago

@psyke83 Hmm. readTex function changed too by this commit. Previously it used either hardware bilinear filtering or shader filter3point. Later I added shader for bilinear filtering. May be it causes the issue. Try to revert readTex to old state: in class ShaderReadtex : public ShaderPart

m_part =    "lowp vec4 readTex(in sampler2D tex, in mediump vec2 texCoord, in lowp int fbMonochrome, in lowp int fbFixedAlpha)  \n"
"{                                                                          \n"
"  lowp vec4 texColor = texture2D(tex, texCoord);                           \n"
"  if (fbMonochrome == 1) texColor = vec4(texColor.r);                      \n"
"  else if (fbMonochrome == 2)                                              \n"
"    texColor.rgb = vec3(dot(vec3(0.2126, 0.7152, 0.0722), texColor.rgb));  \n"
"  if (fbFixedAlpha == 1) texColor.a = 0.825;                               \n"
"  return texColor;                                                         \n"
"}                                                                          \n"
;

If TextureFilter is really the problem, you will get textures, but only unfiltered, because hardware biliner filtering is disabled.

psyke83 commented 7 years ago

Unfortunately, I see no visual difference when readTex is changed - with or without the previous change that forces readTex in instead of of YUV_Convert.

P.S. I noticed the attribution in the comments of that commit to twinaphex; that's interesting, as I think that lr-mupen64plus may have similar graphical corruption when the gliden64 option is selected in the core options. I don't have it installed this moment to confirm for certain though.

gonetz commented 7 years ago

@psyke83 Ok

If it will help - need to carefuly check what was changed in Textures.cpp and return these changes step by step until you will find what breaks texturing.

psyke83 commented 7 years ago

I won't have time to return to testing for a few hours, but so far, this is what gets textures to display when applied against adc2b54 or master:

diff --git a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
index 69ec065..39f00a3 100644
--- a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
+++ b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
@@ -947,7 +947,7 @@ public:
                "  name = vec4(iconvert)/255.0;                                     \\\n"
                "  }                                                                \n"
                ;
-           if (config.texture.bilinearMode == BILINEAR_3POINT) {
+           if (1) {
                // 3 point texture filtering.
                // Original author: ArthurCarvalho
                // GLSL implementation: twinaphex, mupen64plus-libretro project.
@@ -1102,10 +1102,7 @@ public:
            m_part =
                "  nCurrentTile = 0; \n"
                "  lowp vec4 readtex0;                                                              \n"
-               "  if (uBiLerp[0] != 0)                                                             \n"
                "    readtex0 = readTex(uTex0, vTexCoord0, uFbMonochrome[0], uFbFixedAlpha[0]);     \n"
-               "  else                                                                             \n"
-               "    readtex0 = YUV_Convert(uTex0, vTexCoord0, uTextureConvert[0], uTextureFormat[0], readtex0);    \n"
                ;
        } else {
            if (config.video.multisampling > 0) {
@@ -1140,10 +1137,7 @@ public:
            m_part =
                "  nCurrentTile = 1; \n"
                "  lowp vec4 readtex1;                                                              \n"
-               "  if (uBiLerp[1] != 0)                                                             \n"
                "    readtex1 = readTex(uTex1, vTexCoord1, uFbMonochrome[1], uFbFixedAlpha[1]);     \n"
-               "  else                                                                             \n"
-               "    readtex1 = YUV_Convert(uTex1, vTexCoord1, uTextureConvert[1], uTextureFormat[1], readtex0);    \n"
                ;
        } else {
            if (config.video.multisampling > 0) {
@@ -1537,7 +1531,7 @@ public:
                "  return vec4(iconvert)/255.0;                                     \n"
                "  }                                                                \n"
            ;
-           if (config.texture.bilinearMode == BILINEAR_3POINT) {
+           if (1) {
                m_part +=
                    "uniform mediump vec2 uTextureSize[2];                                      \n"
                    // 3 point texture filtering.
@@ -1591,9 +1585,7 @@ public:
            m_part +=
                    "lowp vec4 readTex(in sampler2D tex, in mediump vec2 texCoord, in lowp int fbMonochrome, in lowp int fbFixedAlpha)  \n"
                    "{                                                                          \n"
-                   "  lowp vec4 texColor;                                                      \n"
-                   "  if (uTextureFilterMode == 0) texColor = texture2D(tex, texCoord);        \n"
-                   "  else texColor = TextureFilter(tex, texCoord);                            \n"
+                   "  lowp vec4 texColor = texture2D(tex, texCoord);                           \n"
                    "  if (fbMonochrome == 1) texColor = vec4(texColor.r);                      \n"
                    "  else if (fbMonochrome == 2)                                              \n"
                    "    texColor.rgb = vec3(dot(vec3(0.2126, 0.7152, 0.0722), texColor.rgb));  \n"

Using TextureFilter in readTex() as you earlier suggested didn't work, and for some reason, the standard bilinear filtering code path needed to be forced, or else the texture problem persisted.

P.S. YUV_Convert does cause issues, but for the test case (Super Mario 64), it only seems to affect a few minor elements such as the star transition effect, missing "PRESS START", incorrect shadow rendering on characters. The majority of textures do display if that code is left in.

fzurita commented 7 years ago

Are you able to view GLSL compiler errors generated by the driver?

loganmc10 commented 7 years ago

They should get printed out into gliden64.log I believe

psyke83 commented 7 years ago

Ah, I figured it printed debug to stdout. These lines are repeated in my log:

$ cat /home/pi/.local/share/mupen64plus/gliden64.log | sort | uniq
Async color buffer copies are not supported on GLES2
LOD emulation not possible on this device
OpenGL Error: invalid operation (502)Async color buffer copies are not supported on GLES2
ShaderCacheSupported?Async color buffer copies are not supported on GLES2
fzurita commented 7 years ago

That should only get printed once per each time you start up the plugin. GLideN64 doesn't clear the log file, so each time you start the plugin, an additional log will be added.

psyke83 commented 7 years ago

After further experimenting, the texture issue can be worked around by setting 3-point bilinear filtering. Additionally, I'm seeing segmentation fault if a cached combiner key file is loaded, so that has to be disabled:

bilinearMode = False
EnableShadersStorage = False

I didn't change anything from master, and the YUV_Convert path either isn't getting used with 3-point filtering, or the issues I noticed previously were fixed in commits after adc2b54. Maybe I'm crazy, but performance seems reduced compared to older builds that didn't have the texture issue.

I'd be happy to test any other suggestions to help get the standard bilinear filtering working properly (if possible).

Metalwario64 commented 7 years ago

"Maybe I'm crazy, but performance seems reduced compared to older builds that didn't have the texture issue." I changed those settings in my config file, and I just tested Majora's Mask and Ocarina of Time, and both games drop down to 1 FPS after the N64 startup screen. Is that just because of disabling shader storage, and isn't a real issue?

psyke83 commented 7 years ago

OK, YUV_Convert seems to be the culprit for Majora's Mask.

On latest master + my configuration above, it gets stuck on a black screen after the N64 logo (not even 1FPS). The following seems to resolve the problem, so YUV_Convert is definitely causing trouble.

diff --git a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
index ae8c2e7..9f61a39 100644
--- a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
+++ b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
@@ -1167,10 +1167,7 @@ public:
                        m_part =
                                "  nCurrentTile = 0; \n"
                                "  lowp vec4 readtex0;
        \n"
-                               "  if (uBiLerp[0] != 0)                                                                                                                         \n"
                                "    readtex0 = readTex(uTex0, vTexCoord0, uFbMonochrome[0], uFbFixedAlpha[0]);         \n"
-                               "  else                                                                                                                                                         \n"
-                               "    readtex0 = YUV_Convert(uTex0, vTexCoord0, uTextureConvert[0], uTextureFormat[0], readtex0);        \n"
                                ;
                } else {
                        if (config.video.multisampling > 0) {
@@ -1205,10 +1202,7 @@ public:
                        m_part =
                                "  nCurrentTile = 1; \n"
                                "  lowp vec4 readtex1;
        \n"
-                               "  if (uBiLerp[1] != 0)                                                                                                                         \n"
                                "    readtex1 = readTex(uTex1, vTexCoord1, uFbMonochrome[1], uFbFixedAlpha[1]);         \n"
-                               "  else                                                                                                                                                         \n"
-                               "    readtex1 = YUV_Convert(uTex1, vTexCoord1, uTextureConvert[1], uTextureFormat[1], readtex0);        \n"
                                ;
                } else {
                        if (config.video.multisampling > 0) {

Performance in Super Mario seems to improve, too.

fzurita commented 7 years ago

I'm thinking that this is the culprit for this as well:

https://github.com/gonetz/GLideN64/issues/1676

Metalwario64 commented 7 years ago

fzurita "I'm thinking that this is the culprit for this as well:

1676"

On that note, in my Pi 3, Goldeneye's textures aren't filtered, but every other game I've tried since changing the config settings seem to be correctly filtered. goldenguy

gonetz commented 7 years ago

This looks like a bug in compiler. YUV_Convert shader called only when uBiLerp[0] or uBiLerp[1] uniforms are zero. It is very rare case. YUV textures used by a few games. Zeldas do not use them. Thus YUV_Convert must never be called for these games. I tested under debugger, uBiLerp always contains only 1. I see the following possibility: bi_lerp0 and bi_lerp1 are bits in gDP.otherMode bitfield. convert_one also 3 bit integer. Probably, compiler does mistake when passes them as integer to uBiLerp.set() method. You may try to do the following:

const int bi_lerp0 = gDP.otherMode.bi_lerp0;
const int bi_lerp1 = gDP.otherMode.bi_lerp1;
uBiLerp.set(bi_lerp0, bi_lerp1, _force);
uTextureFormat.set(gSP.textureTile[0]->format, gSP.textureTile[1]->format, _force);
const int convert_one = gDP.otherMode.convert_one;
uTextureConvert.set(0, convert_one, _force);
psyke83 commented 7 years ago

Majora's Mask still gets stuck on a black screen with your suggested change. In case I screwed something up, the diff:

diff --git a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramUniformFactory.cpp b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramUniformFactory.cpp
index 46682ad..c670566 100644
--- a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramUniformFactory.cpp
+++ b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramUniformFactory.cpp
@@ -467,9 +467,12 @@ public:
                if ((gSP.objRendermode&G_OBJRM_BILERP) != 0)
                        textureFilter |= 2;
                uTextureFilterMode.set(textureFilter, _force);
-               uBiLerp.set(gDP.otherMode.bi_lerp0, gDP.otherMode.bi_lerp1, _force);
+               const int bi_lerp0 = gDP.otherMode.bi_lerp0;
+               const int bi_lerp1 = gDP.otherMode.bi_lerp1;
+               uBiLerp.set(bi_lerp0, bi_lerp1, _force);
                uTextureFormat.set(gSP.textureTile[0]->format, gSP.textureTile[1]->format, _force);
-               uTextureConvert.set(0, gDP.otherMode.convert_one, _force);
+               const int convert_one = gDP.otherMode.convert_one;
+               uTextureConvert.set(0, convert_one, _force);
                if (gDP.otherMode.bi_lerp0 == 0 || gDP.otherMode.bi_lerp1 == 0)
                        uConvertParams.set(gDP.convert.k0, gDP.convert.k1, gDP.convert.k2, gDP.convert.k3, _force);
        }

I also tried changing the next line but it made no difference:

if (bi_lerp0 == 0 || bi_lerp1 == 0)
loganmc10 commented 6 years ago

I just tested GLES2 on Linux and this is what I get:

shader_compile error: 0:128(20): preprocessor error: Error: macro TEX_OFFSET invoked with 3 arguments (expected 1)

0:129(20): preprocessor error: Error: macro TEX_OFFSET invoked with 3 arguments (expected 1)

0:131(20): preprocessor error: Error: macro TEX_OFFSET invoked with 3 arguments (expected 1)

0:132(20): preprocessor error: Error: macro TEX_OFFSET invoked with 3 arguments (expected 1)
psyke83 commented 6 years ago

This might fix your error (if not the bug itself)?

diff --git a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
index ae8c2e7..4fca73c 100644
--- a/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
+++ b/src/Graphics/OpenGLContext/GLSL/glsl_CombinerProgramBuilder.cpp
@@ -1629,7 +1629,7 @@ public:
                                m_part +=
                                        // bilinear filtering.
                                        "uniform mediump vec2 uTextureSize[2];                                                                          \n"
-                                       "#define TEX_OFFSET(off) texture2D(tex, texCoord - (off)/texSize)                       \n"
+                                       "#define TEX_OFFSET(off, tex, texCoord) texture2D(tex, texCoord - (off)/texSize)                        \n"
                                        "lowp vec4 TextureFilter(in sampler2D tex, in mediump vec2 texCoord)            \n"
                                        "{                                                                                                                                                      \n"
                                        "  mediump vec2 texSize;                                                                                                        \n"

Will be able to test in a few hours.

psyke83 commented 6 years ago

Good news - the above patch fixes textures for standard bilinear filtering. Unfortunately, the black screen issue caused by YUV_Convert in Majora's Mask isn't fixed.

gonetz commented 6 years ago

@loganmc10 @psyke83 : preprocessor error with macro TEX_OFFSET is fixed: 6845116856

loganmc10 commented 6 years ago

Yep GLES2 on Linux looks good now, I have no issues with Majora's Mask either

gonetz commented 6 years ago

Thanks!

psyke83 commented 6 years ago

Your patch fixed standard bilinear filtering, but on Pi 3, Majora's Mask is still freezing after the logo due to the YUV_Convert, and the shader cache still causes segfaults.

What's interesting is that if I force YUV_Convert on, Majora's Mask works (albeit with incorrect colours, but that's expected), so the mixing of readTex and YUV_Convert must be causing issues? Also, this line looks odd:

"    readtex1 = YUV_Convert(uTex1, vTexCoord1, uTextureConvert[1], uTextureFormat[1], readtex0);        \n"

Should the last argument not be readtex1? Invocations for the YUVCONVERT_TEX macro do the same for tex1, so maybe I'm wrong.

gonetz commented 6 years ago

@Metalwario64 check #1676 I just found that Adreno silently ignored my filtering shader and returns unfiltered texture. Since current master uses only programmed filters, not hardware one - devices with Adreno have no texture filtering regardless of config settings. I guess, you have similar situation on Pi 3. I don't know how to fix the shaders to make them work with Adreno and Pi.

gonetz commented 6 years ago

@psyke83

Your patch fixed standard bilinear filtering, but on Pi 3, Majora's Mask is still freezing after the logo due to the YUV_Convert, and the shader cache still causes segfaults.

First - disable shader cache for now. Second - Majora's Mask does not use YUV texture, there must be no calls to YUV_Convert after the logo. I tested this with debugger. Put breakpoint or log message to UTextureFetchMode::update in this place:

    if (gDP.otherMode.bi_lerp0 == 0 || gDP.otherMode.bi_lerp1 == 0)
        uConvertParams.set(gDP.convert.k0, gDP.convert.k1, gDP.convert.k2, gDP.convert.k3, _force);

This if must never trigger. YUV_Convert called only when gDP.otherMode.bi_lerp0 or gDP.otherMode.bi_lerp1 is zero.

Should the last argument not be readtex1?

No. If gDP.otherMode.convert_one is set, YUV_Convert converts tex0, not tex1. Thus, readtex0 is passed to YUV_Convert for that case.

gonetz commented 6 years ago

@psyke83 also, please check that RDRAMSize is set to 8MB after RomOpen call. If it is set to 4MB you will get all kind of issues.

psyke83 commented 6 years ago

@gonetz,

I'm not disputing your claim that Majora's Mask isn't using YUV textures. To be sure, I verified that gDP.otherMode.bi_lerp0/gDP.otherMode.bi_lerp1 is never set to zero, but the game is still freezing nevertheless.

What I'm saying is that the shader seems to be breaking due to the if / else condition on these lines:

"  if (uBiLerp[0] != 0)                                                                                                                \n"
"    readtex0 = readTex(uTex0, vTexCoord0, uFbMonochrome[0], uFbFixedAlpha[0]);         \n"
"  else                                                                                                                                \n"
"    readtex0 = YUV_Convert(uTex0, vTexCoord0, uTextureConvert[0], uTextureFormat[0], readtex0);        \n"
"  if (uBiLerp[1] != 0)                                                                                                                \n"
"    readtex1 = readTex(uTex1, vTexCoord1, uFbMonochrome[1], uFbFixedAlpha[1]);         \n"
"  else                                                                                                                                \n"
"    readtex1 = YUV_Convert(uTex1, vTexCoord1, uTextureConvert[1], uTextureFormat[1], readtex0);        \n"

It's possible to resolve the hang by modifying those lines by:

So the hang is not caused by readTex or YUV_Convert directly. Instead, it seems like the if/else evaluation of uBiLerp in the shader code is causing trouble. Maybe there's a bad implementation in the Broadcom shader causing a syntax error?

P.S. I've got DisableExtraMem = False set, so it RDRAMSize should be OK...

fzurita commented 6 years ago

It's possible to move the logic to being outside the shader. The YUV setting for the texture can then become an additional key to the shader.

fzurita commented 6 years ago

@gonetz I have an idea. We can get get rid of all logic in shaders that doesn't have to be done on a per pixel level in the shaders. All that logic could be done externally and we can have more shader keys.

That would mean that shaders would become simpler, so less chance of bad drivers having issues.

This couldn't be done for all shader logic though. For example, I don't think a depth test could be done outside the shader.

If you are too busy, I could give that a shot.