flibitijibibo / FNA-MGHistory

FNA - Accuracy-focused XNA4 reimplementation for open platforms
http://fna-xna.github.io/

[Driver] Unexplained Intermittent Graphical Bug ... #294

Closed AxiomVerge closed 9 years ago

AxiomVerge commented 9 years ago

I get the following bug intermittently.

[Image: bug]

It seems to happen most often when switching window size, which I do with

DeviceManager.PreferredBackBufferWidth = windowWidth;
DeviceManager.PreferredBackBufferHeight = windowHeight;
DeviceManager.ApplyChanges();

For instance, in the above image, I didn't see the artifacts at 1x or 2x, but did at 3x and up. Sometimes they never appear. Sometimes they appear only at 4x and up. My renderer works by drawing the game at 480 x 270, applying bloom, rendering the UI to a separate render target, then adding that target atop the bloomed scene. The only time the scaling comes into play is the last step, where I render the 480 x 270 scene to the backbuffer, which looks like this:

// clear backbuffer to get black bars
mDeviceManager.GraphicsDevice.Clear(ClearOptions.Target, Color.Black, 1.0f, 0);

// draw a quad to get the draw buffer to the back buffer
mSpriteBatch.Begin(SpriteSortMode.Immediate, BlendState.Opaque, SamplerState.PointClamp, DepthStencilState.None, RasterizerState.CullNone);
mSpriteBatch.Draw(sourceRenderTarget, dst, Color.White);
mSpriteBatch.End();
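
The snippet above does not show how `dst` is computed. A minimal sketch, assuming integer scaling (1x, 2x, 3x, ...) with the image centered and black bars filling the rest; the `Letterbox.Fit` helper and its parameter names are hypothetical and not taken from the game's code:

```csharp
using System;

static class Letterbox
{
    // Hypothetical helper: returns the position and size of the largest
    // integer-scaled copy of the source that fits in the back buffer,
    // centered so the leftover area shows up as black bars.
    public static (int X, int Y, int Width, int Height) Fit(
        int srcWidth, int srcHeight, int backWidth, int backHeight)
    {
        // Largest whole-number scale that fits both dimensions; never below 1x,
        // so a back buffer smaller than the source yields an overflowing rectangle.
        int scale = Math.Max(1, Math.Min(backWidth / srcWidth, backHeight / srcHeight));
        int width = srcWidth * scale;
        int height = srcHeight * scale;

        // Center the scaled image inside the back buffer.
        return ((backWidth - width) / 2, (backHeight - height) / 2, width, height);
    }
}
```

For example, `Letterbox.Fit(480, 270, 1600, 1000)` picks a 3x scale and gives a 1440 x 810 rectangle at (80, 95), which could then be turned into the destination `Rectangle` passed to `SpriteBatch.Draw`.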

However, the artifacts scale with the source 480 x 270 image, so they must be part of that. I discovered by accident that if I clear my 480 x 270 scene before rendering it, the purple color changes to whatever color I cleared with, leading me to think maybe there is a stencil issue? I've only ever used DepthStencilState.None, so this is a difficult thing for me to check.

I'm kind of expecting this to be some obtuse shader param order or vertex format issue (most of my shader issues getting it to work with MonoGame PS4 had to do with that, as it was much less forgiving than Microsoft's HLSL at first). At any rate, this is the most info I've gleaned so far.

flibitijibibo commented 9 years ago

apitrace may be able to shed some light on where the artifacts are being drawn and what data was in the buffers before that draw call:

http://apitrace.github.io/

Test cases help as well (or if needed, just a copy of the game I can trace around with).

AxiomVerge commented 9 years ago

Okay, I ran the apitrace a few times and eventually was able to catch the bug in progress.

Interestingly, while I don't see the bug in the apitrace playback, when I hit ctrl+T, I can see a thumbnail that looks like a corrupted texture in the 2nd to last glClear() of the frame.

In the code, I determined that this is the clear that happens when I set the rendertarget to null (i.e. the backbuffer), and have presentation parameters set to DiscardContents:

GraphicsDevice.cs line 686:

if (PresentationParameters.RenderTargetUsage == RenderTargetUsage.DiscardContents)
{
    Clear(DiscardColor);
}

I tried to go into the Clear() code and see if there was anything unusual about it, but couldn't find anything. This is what apitrace says:

5175 @0 glClear(mask = GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT | GL_COLOR_BUFFER_BIT)

Either skipping past the clear in the debugger or just setting my presentation parameters to something other than DiscardContents seems to resolve the issue.

Here's a link to the trace; I used Frame 21 to actually find the call.

https://www.dropbox.com/s/5e0jnlrqa6k631a/AxiomVerge.trace?dl=0

I can tell by the render target thumbnail that it looks messed up, but clicking on the thumbnail causes the thumbnail viewer window to crash and "caught an unhandled exception" to appear in the QApiTrace error window. Weirdly, you can see that I immediately clear it afterwards with black, and this appears to work in apitrace, but when running the executable, it is like those purple blocks are "unwritable" or something and appear after the rest.

I doubt this is it, but I once had a similar problem in DirectX where PIX wouldn't reproduce the buggy output of a game; it turned out that one of the underlying libraries didn't fence its draw calls and they wound up stomping over one another with some very weird results (like triangles being drawn between the wrong vertices, etc.). But since PIX only records the API calls and doesn't replicate the game's fencing logic (it presumably uses its own), it would display perfectly when you tried to debug the issue.

flibitijibibo commented 9 years ago

Looked at the trace on my setup and sure enough, it seems to look okay! But I noticed that there are some AMD GL extensions in the list... is this an AMD graphics card? For some reason the glGetString calls aren't in there so I can't tell exactly what's being run, but it sounds like an AMD card with a possibly old driver (given that it didn't seem to have EXT_swap_control_tear either).

The main difference between our clear call and your clear call is that we call depth/stencil clears even though it's a color-only buffer. So you might actually be right - this could be some weird stencil issue as a result of that.

This ended up being a good excuse to do some cleanup in SetRenderTargets, so I did that and also added a check for the DepthFormat to see if we should be clearing depth/stencil:

https://github.com/flibitijibibo/FNA/commit/0c5654f4e36cd8801ed7de44d35c4aced6a1f1c7

This might fix it, but with AMD you can never be 100% sure until it's working in front of you.

flibitijibibo commented 9 years ago

Did some more work on this; XNA4 actually checks the current target depth format, so we will too. This should retain the fix I made, and it turns out it's more accurate anyway.

https://github.com/flibitijibibo/FNA/commit/a5f12fe3355012ffdaa0d54507e53ce96aa9e37e
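
The idea described in these two comments (only clear depth/stencil when the current target actually has such a buffer) can be sketched roughly as follows. This is not the code from the linked commits: the enums are local stand-ins that mirror the XNA4 names so the example compiles on its own, and `DiscardClear.OptionsFor` is a hypothetical helper.

```csharp
using System;

// Local stand-ins mirroring the XNA4 enum names; not FNA's actual types.
enum DepthFormat { None, Depth16, Depth24, Depth24Stencil8 }

[Flags]
enum ClearOptions { Target = 1, DepthBuffer = 2, Stencil = 4 }

static class DiscardClear
{
    // Hypothetical helper: pick which buffers to clear from the depth
    // format of the target that is currently bound.
    public static ClearOptions OptionsFor(DepthFormat format)
    {
        // The color buffer is always cleared.
        ClearOptions options = ClearOptions.Target;

        // Only touch depth if the target has a depth buffer at all.
        if (format != DepthFormat.None)
        {
            options |= ClearOptions.DepthBuffer;
        }

        // Only touch stencil if the format actually carries stencil bits.
        if (format == DepthFormat.Depth24Stencil8)
        {
            options |= ClearOptions.Stencil;
        }

        return options;
    }
}
```

With this shape, a color-only target would be cleared with `ClearOptions.Target` alone, so the driver never sees a depth or stencil clear for a buffer that does not exist.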

AxiomVerge commented 9 years ago

Sorry for the delay, yesterday was PR overflow.

My driver is indeed old; I have an old Vaio laptop with an ATI card from 2010 that only accepts the special vaio-flavored drivers from Sony, which haven't been updated since then.

I synced your changes; these alone did not fix it. However, I stepped through the clear and discovered that my PresentationParameters.DepthStencilFormat was set to Depth24. I'd actually never known this setting existed, but it makes sense, because without it 3D games would fail to work by default. So I set it to None and now it works. I presume it should have worked the other way were it not for my bad drivers (just the depth buffer would be blank), but this way I'm being more explicit about it.
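
The comment above does not say how the format was changed. One common way to do it in XNA4-style code is through `GraphicsDeviceManager.PreferredDepthStencilFormat`, set before the device is created. A minimal sketch, assuming the project references XNA4 or FNA; the `ExampleGame` class name is hypothetical:

```csharp
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Graphics;

// Hypothetical game class, for illustration only.
public class ExampleGame : Game
{
    private readonly GraphicsDeviceManager graphics;

    public ExampleGame()
    {
        graphics = new GraphicsDeviceManager(this);

        // A 2D game that only ever uses DepthStencilState.None does not
        // need a depth/stencil buffer on the back buffer, so request none.
        graphics.PreferredDepthStencilFormat = DepthFormat.None;
    }
}
```

Setting this in the constructor means the value is already in place when the graphics device is first created.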

flibitijibibo commented 9 years ago

Huh. Well, glad it works in some form; wonder why it freaks out over a glClear(GL_DEPTH_BUFFER_BIT)... It could be that the faux-backbuffer's depth/stencil buffer is a texture rather than a renderbuffer, but that could be applicable to any render target as well.

On the subject of optimization: since you do the scaling internally in the XNA4 game, you might be able to optimize things for your FNA version by toggling this define:

https://github.com/flibitijibibo/FNA/blob/master/src/Graphics/OpenGLDevice.cs#L11
https://github.com/flibitijibibo/FNA/blob/master/src/Graphics/OpenGLDevice_GL.cs#L11

Basically as long as your fullscreen resolution is always the desktop resolution, this will skip a scaled framebuffer blit without any visual changes. The device backbuffer might even work with Depth24 as well, if my guess is correct at all.

Will close this along with a [Driver] tag, just in case this emerges from the depths again.