BParks21 commented 9 years ago

Standing too far from the gates causes the textures on the gates to freak out. Getting close enough to them corrects the glitch. screenshot 37 screenshot 38

gonetz commented 9 years ago

Please test with this binary: https://drive.google.com/file/d/0B0YqMPjGo3B2WWNmd01xcUJ5X2M/view?usp=sharing

If not fixed, please give me savestate from this place.

baptiste0602 commented 9 years ago

Still happens. Savestate: https://drive.google.com/file/d/0ByOA3_oOu6O9UUYzZzgxVmdPU0U/view?usp=sharing

gonetz commented 9 years ago

Thanks!

AmbientMalice commented 8 years ago

I think it's been posted elsewhere, but z64gl has this same issue.

AmbientMalice commented 8 years ago

This is how it looks now. (Blending branch.)

gliden64_donkey_kong_64_000

gliden64_donkey_kong_64_001

olivieryuyu commented 7 years ago

it is fixed, can it be closed?

gonetz commented 7 years ago

Is the top screenshot from @AmbientMalice correct? How does it look with angrylion's plugin?

ghost commented 7 years ago

Top screen is wrong. I just tested and it still happens.

It looks like the horizontal bar texture mipmaps wrongly.

gonetz commented 7 years ago

Ok, thanks.

oddMLan commented 4 years ago

Still-current savestate, since the one above got deleted: Donkey Kong 64 (U).pj.zip

ghost commented 4 years ago

I've been checking this issue a little bit, it isn't a lod issue but a blender issue.

It only uses 1 texture (CRC -> DONKEY KONG 64#2FF393C9#0#2_all). As you can see, most of the texture is transparent, but this transparent seems to be encoded as #000000 according to my editor, which means black color with 0 alpha.

The problem is in the blender. I think that when you're close to the gate, blending isn't forced (else part in blender shader). Once the player is farther than a certain distance, blending is forced and the wrong picture starts to show. The texture blends with the framebuffer (I think) to make a fog effect. However, the transparent pixel's rgb is black and somehow that is mixed with the background.

I tried many things, but I haven't been able to produce the correct picture.

ghost commented 4 years ago

I have analysed the situation further. Apparently I was wrong in certain aspects.

The RDP is in 2 cycle mode.

In cycle one the blend equation is combiner.rgb * combiner.a + memory.rgb * (1.0- combiner.a) so the texture will keep its pixels for the opaque part and use framebuffer (background) colors for the transparent side.
In cycle two the blend equation is blend1.rgb * fog_reg.a + memory.rgb * (1.0-fog_reg.a). The transparent part has memory rgb so the mix will result in memory.rgb. For the opaque part, the combiner.rgb mixes with memory.rgb based on fog register (which I assume it is updated by the CPU as you move away).

For this approach to work memory.rgb has to be updated. I believe that the problem is that both the horizontal and vertical bars use the same framebuffer as memory.rgb and therefore one occludes the other.

@gonetz How can I render primitives one by one to check if that solves the problem. Also, why is muxPM[1] = vec4(0.0); in the second cycle blend?

gonetz commented 4 years ago

N64 blender is in fact a part of color combiner. It can blend output of color combiner with fog color, blend color and of course with memory color. It can't be emulated with standard OpenGL blender, which only can blend output color with memory color. Thus, I invented the blender shader, which does all the blending work but the final blending with memory color, which is impossible to do in shaders without some poorly working extensions. Final blending is still performed by standard OpenGL blender. This is why muxPM[1] = vec4(0.0); in the second cycle blend. This scheme works fine for 99.9% of combiners. It does not work only when memory color is used in the first equation of 2-cycle blender. Unfortunately, as you found, that is our case. Such cases still need some special hacks.

gonetz commented 4 years ago

How can I render primitives one by one to check if that solves the problem

In HLE mode you can do it like this:

void gSPFlushTriangles()
{
//  if ((gSP.geometryMode & G_SHADING_SMOOTH) == 0) 
    {
        dwnd().getDrawer().drawTriangles();
        return;
    }
...

ghost commented 4 years ago

I realized that if you plug the equation for blender1 in blender2 in the previous case, you still get a linear combination of muxes p and m.

Equation 1: pix.rgb * pix.a + mem.rgb * (1-pix.a)
Equation 2: blend1.rgb * fog.a + mem.rgb * (1-fog.a)
Combined: (pix.rgb * pix.a + mem.rgb * (1-pix.a)) * fog.a + mem.rgb * (1-fog.a)

The latter can be simplified as

pix.rgb * (pix.a * fog.a) + mem.rgb * (1-pix.a*fog.a)

which is a linear combination of colors. Therefore, it is doable with OpenGL's blender, provided that source alpha = pix.a * fog.a is passed.
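For reference, the simplification can be written out step by step. The sketch below takes the second-cycle weights from the earlier description of the blend mode (fog.a and 1 - fog.a) and uses the shorthand p = pix.rgb, a = pix.a, m = mem.rgb, f = fog.a, which is introduced here only for brevity.

```latex
\begin{aligned}
\text{out} &= \bigl(p\,a + m\,(1 - a)\bigr)\,f + m\,(1 - f) \\
           &= p\,a f + m\,\bigl(f - a f + 1 - f\bigr) \\
           &= p\,(a f) + m\,(1 - a f)
\end{aligned}
```

The memory weight collapses to 1 - af because the f terms cancel, which is why a single source alpha of pix.a * fog.a is enough.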

So I thought, is it possible to isolate memory.rgb and generalize the current blending approach to consider the cases where memory rgb is used in cycle 1?

For simplicity, let's assume that mux p of the second cycle contains the blended color of the first cycle. We would get a combined output color equal to (p1 * a1 + m1 * b1) * a2 + m2 * b2, which can be written as p1 * (a1 * a2) + m1 * (b1 * a2) + m2 * b2

This can be expressed as $\sum c_i \cdot \alpha_i + m\sum \alpha_j$ where c_i are the non memory colors, alpha_i their coefficients, m the memory colors and alpha_j their coefficients.

So if we are able to check which colors among p1, m1, m2 are memory colors (probably m1 and/or m2) and separate the equation, it could be possible to blend using the OpenGL blend equation.

The idea would be to define

Source color: $\sum c_i \cdot \alpha_i$ , i runs through the indices where c_i isn't the memory color.
Source factor: 1.0
Destination color: memory color
Destination factor: $\sum \alpha_j$ , where j runs through the indices where c_j is the memory color.

I wrote some WIP code to implement the idea and Donkey Kong renders correctly, so it seems doable. I'm publishing a sneak peek. Most of the logic can probably be taken out of the shader. https://github.com/standard-two-simplex/GLideN64/tree/blender_mem_read_first_cycle

However, many things are wrong yet. A couple of questions @gonetz.

Some of the issues might be because clampcolor.a no longer contains pixel alpha. I probably need to separate destination factor and pixel alpha. Which variable does OpenGL read the source factor from? fragColor.a?
I didn't consider the cases where mux b contains memory alpha. Is this currently supported?
I'm not very sure how the code in GraphicsDrawer::SetBlendMode() is working. Is srcFactor=1 and dstFactor changes depending on the input?

ghost commented 4 years ago

I realize that because master defaults memory.rgb to zeros, my code computes the same rgb as master (except for a bug I made in one of the mux selections). Thus, the only difference is the computation of destination factor and blend mode parameters of OpenGL blending.

Therefore, changes are minimal and can probably be rewritten in a nicer way.

ghost commented 4 years ago

I corrected two major mistakes and added some cleanup. The picture is looking better now.

Code looks similar to master; the change is that the destination factor is computed in shaders rather than in setBlendMode. Donkey Kong's bars look good and the Bobomb explosion when you send the ball out in Mario Tennis looks correct (which I didn't even know was wrong). I suspect many games use a similar blend mode.

I need to clean up and check for regressions. Tony Hawk's doesn't work without the hack, which I was expecting it would.

gonetz commented 4 years ago

Sorry for not answering you. I'm busy and can't catch up with you. Your idea sounds cool. I'll check it as soon as I finish with my urgent tasks.

ghost commented 4 years ago

Don't worry, there is no hurry.

gonetz commented 4 years ago

I'm not very sure how the code in GraphicsDrawer::SetBlendMode() is working. Is srcFactor=1 and dstFactor changes depending on the input?

I already forgot all this blending stuff. As I remember, SetBlendMode() analyzes blender inputs to calculate the destination factor. General blending equation is a * p + b * m, where p and m can be memory color. Memory color input is replaced by zero in vertex shader. So, for example, if m is memory color, blender shader calculates a * p + b * 0, that is a * p. It is our source color. To complete the blending, we need to use source factor srcFactor = blend::ONE and provide correct destination factor to blend source color with memory color.

I didn't consider the cases where mux b contains memory alpha. Is this currently supported?

Yes. Memory alpha is a valid input for mux b.

switch (muxB) {
...
            case 1:
                dstFactor = blend::DST_ALPHA;
                break;

Some of the issues might be because clampcolor.a no longer contains pixel alpha. I probably need to separate destination factor and pixel alpha. Which variable does OpenGL read the source factor from? fragColor.a?

Sorry, I don't understand the question. Please explain.

ghost commented 4 years ago

Sorry, I don't understand the question. Please explain.

What I mean is, how does OpenGL know which variable to use when GL_SRC_ALPHA or GL_ONE_MINUS_SRC_ALPHA is chosen?

I modified clampedColor.a so if this variable is used anywhere else, it will give incorrect picture.

gonetz commented 4 years ago

What I mean is, how does OpenGL know which variable to use when GL_SRC_ALPHA or GL_ONE_MINUS_SRC_ALPHA is chosen?

As I see, GL_SRC_ALPHA or GL_ONE_MINUS_SRC_ALPHA can't be used as source color factor. The blender shader does all modifications of the source color, so blend factor for source is usually blend::ONE. The only exception is 'memory alpha' factor. It can't be used in the shader, so _setBlendMode() sets it as source factor when necessary, for example:

            if (gDP.otherMode.c2_m2a == 0 && gDP.otherMode.c2_m2b == 1) {
                // c_in * a_mem
                srcFactor = blend::DST_ALPHA;
            }

ghost commented 4 years ago

I realized I had removed game specific hacks wrongly. Now Tony Hawk's Pro Skater 2 looks correct. I added some cleanup and rebased to master before lle changes for testing.

In theory, all blend modes except muxb == memory alpha are emulated. I'm not sure how to emulate such cases in a generic way, taking into account the many possibilities (one or two cycles, using different colors together with memory alpha...). In that case it defaults to 1.0 so background color might be more dominant than intended in the mix.

I don't know how often games use this option. It would be good to know where it is used to try to consider those specific cases.

oddMLan commented 4 years ago

@standard-two-simplex if your solution fixes Tony Hawk 2 could you check if it also fixes #1488 (commit https://github.com/gonetz/GLideN64/commit/3b9f16e8ddce81be8121ce0a29d743807b6d7a41) without the hack?

gonetz commented 4 years ago

I realized I had removed game specific hacks wrongly. Now Tony Hawk's Pro Skater 2 looks correct. I added some cleanup and rebased to master before lle changes for testing.

Great!

I don't know how often use this option. It would be good to know where it is used to try to consider those specific cases.

I also don't know where muxb == memory alpha mode is used. I don't want regressions here, so I suggest to use my current blending method for this case.

ghost commented 4 years ago

@oddMLan Yes, A Bug's Life seems correct too. I see Mia Ham Soccer and Blast Corps also have game specific code, but I'm not sure what to look for.

ghost commented 4 years ago

I also don't know where muxb == memory alpha mode is used. I don't want regressions here, so I suggest to use my current blending method for this case.

Done. Better safe than sorry.

What I mean is, how does OpenGL know which variable to use when GL_SRC_ALPHA or GL_ONE_MINUS_SRC_ALPHA is chosen?

I think I didn't make myself very clear about this. The problem is the following:

In 4f3ed29b85d45ddbd548a7a528cfbd0264222dca, I set clampColor.a = fd2 and then pick dstFactor = SRC_ALPHA, which I think results in fd2 because the shader is outputting that. Almost everything works correctly but some things are missing (Bobombs in Mario Tennis for example).
In 405ee3384d09b484bf43ddc3f3c8938f8e585533, I set clampColor.a = 1.0 - fd2 and then pick dstFactor = ONE_MINUS_SRC_ALPHA, which still resolves as fd2 because fd2 = (1.0 - (1.0 - fd2)). Here Bobombs also look correct.

In theory, both options should blend equally. However, 1.0-fd2 is closer to original pixel alpha, so I was thinking perhaps alpha compare or another piece of code that uses clampColor.a might interfere.

What I wanted to know was what the output variable of the fragment shader is, and if it is possible to assign fd2 to the output later in the fragment shader when the original alpha value is no longer needed.

gonetz commented 4 years ago

What I wanted to know was what the output variable of the fragment shader is, and if it is possible to assign fd2 to the output later in the fragment shader when the original alpha value is no longer needed.

Let's check the generated fragment shader for a 2-cycle combiner. Only the main() part:

1. Initialization


...
OUT lowp vec4 fragColor;    
...
void main() 
{            
highp float fragDepth = writeDepth(); 
lowp vec4 vec_color, combined_color;      
lowp float alpha1, alpha2;                
lowp vec3 color1, color2, input_color;    
lowp vec4 shadeColor = uScreenSpaceTriangle == 0 ? vShadeColor : vShadeColorNoperspective;    
#define WRAP(x, low, high) mod((x)-(low), (high)-(low)) + (low) 
lowp mat4 muxPM = mat4(vec4(0.0), vec4(0.0), uBlendColor, uFogColor); 
lowp vec4 muxA = vec4(0.0, uFogColor.a, shadeColor.a, 0.0);               
lowp vec4 muxB = vec4(0.0, 1.0, 1.0, 0.0);                                
texCoord0 = clampWrapMirror(vTexCoord0, uTexClamp0, uTexWrap0, uTexMirror0, uTexScale0);  
texCoord1 = clampWrapMirror(vTexCoord1, uTexClamp1, uTexWrap1, uTexMirror1, uTexScale1);  
lowp vec4 readtex0, readtex1;


So, the output variable of the fragment shader is fragColor.

2. 1st cycle of the color combiner:

lowp float lod_frac = mipmap(readtex0, readtex1);
input_color = shadeColor.rgb;
vec_color = vec4(input_color, shadeColor.a);
alpha1 = mix(readtex0.a,readtex1.a,lod_frac);
alpha1 = WRAP(alpha1, -0.51,1.51);
if (uEnableAlphaTest != 0) {
  lowp float alphaTestValue = (uAlphaCompareMode == 3) ? snoise() : uAlphaTestValue;
  lowp float alphaValue;
  if ((uAlphaCvgSel != 0) && (uCvgXAlpha == 0)) {
    alphaValue = 0.125;
  } else {
    alphaValue = clamp(alpha1, 0.0, 1.0);
  }
  if (alphaValue < alphaTestValue) discard;
}
color1 = mix(readtex0.rgb,readtex1.rgb,vec3(lod_frac));
color1 = WRAP(color1, -0.51, 1.51);
combined_color = vec4(color1, alpha1);


Note: the alpha test is performed after the first cycle. If it fails, the fragment is discarded.

3. 2nd cycle of the color combiner

alpha2 = (combined_color.a) * vec_color.a;
if (uCvgXAlpha != 0 && alpha2 < 0.125) discard;
color2 = (combined_color.rgb) * vec_color.rgb;
lowp vec4 cmbRes = vec4(color2, alpha2);
lowp vec4 wrappedColor = WRAP(cmbRes, -0.51, 1.51);
lowp vec4 clampedColor = clamp(wrappedColor, 0.0, 1.0);

4. Dithering

if (uColorDitherMode == 2) {
colorDither(snoiseRGB(), clampedColor.rgb);
}
if (uAlphaDitherMode == 2) {
alphaDither(snoiseA(), clampedColor.a);
}


5. 1st cycle of the blending

#define MUXA(pos) dot(muxA, STVEC(pos))
#define MUXB(pos) dot(muxB, STVEC(pos))
#define MUXPM(pos) muxPM*(STVEC(pos))
muxPM[0] = clampedColor;
if (uForceBlendCycle1 != 0) {
  muxA[0] = clampedColor.a;
  muxB[0] = 1.0 - MUXA(uBlendMux1[1]);
  lowp vec4 blend1 = MUXPM(uBlendMux1[0]) * MUXA(uBlendMux1[1]) + MUXPM(uBlendMux1[2]) * MUXB(uBlendMux1[3]);
  clampedColor.rgb = clamp(blend1.rgb, 0.0, 1.0);
} else clampedColor.rgb = (MUXPM(uBlendMux1[0])).rgb;


6. 2nd cycle of the blending

muxPM[0] = clampedColor;
muxPM[1] = vec4(0.0);
if (uForceBlendCycle2 != 0) {
  muxA[0] = clampedColor.a;
  muxB[0] = 1.0 - MUXA(uBlendMux2[1]);
  lowp vec4 blend2 = MUXPM(uBlendMux2[0]) * MUXA(uBlendMux2[1]) + MUXPM(uBlendMux2[2]) * MUXB(uBlendMux2[3]);
  clampedColor.rgb = clamp(blend2.rgb, 0.0, 1.0);
} else clampedColor.rgb = (MUXPM(uBlendMux2[0])).rgb;
fragColor = clampedColor;


Note: the result is stored in fragColor after the blending.

7. Manipulation with fragment depth

if (uRenderTarget != 0) {
  if (uRenderTarget > 1) {
    ivec2 coord = ivec2(gl_FragCoord.xy);
    if (fragDepth >= texelFetch(uDepthTex, coord, 0).r) discard;
  }
  fragDepth = fragColor.r;
}
gl_FragDepth = fragDepth;
} // end of main()



Full shader attached:
[FragmentShader.txt](https://github.com/gonetz/GLideN64/files/4585904/FragmentShader.txt)

ghost commented 4 years ago

Thanks. It is very clear how it works now. Two questions.

I am unable to find gliden64.log files both with debug and release builds. I think I have write permission issues. Is the directory it is written to always the same as the plugin directory? Or can it be changed by the parameters passed to mupen64plus?
Is it possible to easily dump a file like the one you attached above?

gonetz commented 4 years ago

Is the directory it is written to always the same as the plugin directory?

Yes

Or can it be changed by the parameters passed to mupen64plus?

You may change sources and write the log wherever you need.

Is it possible to easily dump a file like the one you attached above?

Easily? I just set a breakpoint on the line const GLchar * strShaderData = strFragmentShader.data(); in src\Graphics\OpenGLContext\GLSL\glsl_CombinerProgramBuilder.cpp and then dump the content of strShaderData. There is also Utils::logErrorShader, which dumps the shader into the log file:

    if (!Utils::checkShaderCompileStatus(fragmentShader))
      Utils::logErrorShader(GL_FRAGMENT_SHADER, strFragmentShader);

You may dump all shaders unconditionally.

ghost commented 4 years ago

I figured out why 4f3ed29b85d45ddbd548a7a528cfbd0264222dca doesn't show framebuffer effects.

Like you pointed out, the blender is the last stage where pixel alpha is used. Thus, fragColor.a is set after blending and not modified or used further in the shader.

So if blending happens equally, why do some objects not show up? Because they use framebuffer based textures. The alpha value is used not only for blending, but also for writing it to the framebuffer. This alpha channel isn't used to render the framebuffer to the screen, but it is used to create textures from auxiliary buffers. Therefore, textures created from framebuffer objects use this value as their alpha.

Currently the following values are written to the alpha channel of the framebuffer.

Master: alpha output from the color combiner, coming from clampedColor.a
My implementation at 405ee33: opposite of the weight of memory color, computed as 1.0 - fd2
My implementation at 4f3ed29: the weight of memory color, computed as fd2.
A real N64: not sure at all, the N64 programming manual isn't very clear. I would suspect it is one of coverage, combined alpha or coverage times alpha from this and this.

In 4f3ed29, fd2 mostly resolves to 1.0 - combined alpha, therefore most of the framebuffer effects are invisible. In 405ee33, 1.0 - fd2 mostly resolves to combined alpha, so there aren't many issues (I haven't found any). However, when the blender mode is a little crafty, it should resolve into something else, so it is a potential source for issues.

This means that, in order to be safe, I need to output a source color fragColor.rgb and a destination factor fd2 for the blend equation, and combined alpha clampedColor.a to store in the framebuffer, which is more than the standard output of the fragment shader.

Fortunately, OpenGL has the resources to do this (dual source blending and glBlendFuncSeparate() in here), but I'm really bad at setting up these.

olivieryuyu commented 4 years ago

I would refer to this as well.

#1158

ghost commented 4 years ago

I wrote an initial implementation. It should work if the extension is supported. One curious thing is that the explosion effects in Mario Tennis no longer work. The blended rgb value is the same (in theory), so it must be due to the alpha channel being different.

I would refer to this as well.

It gives some clues, but I would need something more concrete. Probably angrylion's sources are the best bet, but sometimes they are difficult to read.

ghost commented 4 years ago

@gonetz Is it possible to access booleans in struct GLInfo in opengl_GLInfo.h? I would need this to tidy up the use of the dual source extension.

gonetz commented 4 years ago

I don't understand the question. What do you mean by "access booleans in struct GLInfo"? All members of that structure are public and accessible. If you need to add another flag to that structure, you can do it of course.

ghost commented 4 years ago

I don't understand the question.

Sorry, I didn't explain myself. I need to access them from GraphicsDrawer::setBlendMode(), but I don't have a glInfo pointer as in most glsl_CombinerProgramBuilder.cpp functions.

ghost commented 4 years ago

I cleaned up a little bit and renamed the branch into https://github.com/standard-two-simplex/GLideN64/tree/blender_changes

I need to do

if (_glinfo.blend_func_extended)
    dstFactor = blend::SRC1_ALPHA;
else
    dstFactor = blend::ONE_MINUS_SRC_ALPHA;

but I don't know how to call blend_func_extended from here.

I think that, other than that, color blending changes are finished.

A real N64: not sure at all, the N64 programming manual isn't very clear. I would suspect it is one of coverage, combined alpha or coverage times alpha from this and this.

I checked angrylion's code about this. I believe that the code in finalize_spanalpha() is responsible for alpha blending, which is not emulated in GLideN64. I will open a separate issue about it.

gonetz commented 4 years ago

Fixed now. @standard-two-simplex thanks!

gonetz / GLideN64

Donkey Kong 64: Gate textures change from a distance. #585
