gonetz / GLideN64

A new generation, open-source graphics plugin for N64 emulators.
Other
775 stars 180 forks source link

Various regressions since PR2.0 with N64 depth compare enabled [AMD Issue?] #1418

Closed oddMLan closed 7 years ago

oddMLan commented 7 years ago

imagen

Just start a new game and watch the intro. Using latest hotfix from #1364

This is a recent regression, it doesnt happen with build e8c47c0 (public release 2.0)

gonetz commented 7 years ago

I can't reproduce it. The game works for me as usual. May be you use non-standard RSP plugin?

oddMLan commented 7 years ago

Tested with both the RSP that comes with the emulator and with cxd4's RSP. The problem doesn't seem to go away. Only tested with PJ64 2.3.x though, could it be a core regression? Public Release 2.0 doesn't seem to have any problems.

I'm thinking one of the functions refactored to use OpenGL freely might be having trouble with my graphics card. AMD Radeon R7 250 series

oddMLan commented 7 years ago

It doesn't happen with Mupen64plus build. Closing

oddMLan commented 7 years ago

Oddly, it doesn't happen in LLE with Project64... hmm

gonetz commented 7 years ago

I'm thinking one of the functions refactored to use OpenGL freely might be having trouble with my graphics card. AMD Radeon R7 250 series

Hardly. Polygons on your screen shot look very incorrect. Either ucode detected incorrect (wrong rom version) or problem with rsp.

oddMLan commented 7 years ago

Nice catch. Culprit was the PJ64's rsp after all. I'll make an issue for them. I got confused since it seemed to be wrong with cxd4's rsp too? what even. now it's not the wrong result anymore

Nevermind, I forgot to toggle "send DLs to gfx plugin". HLE still looks wrong with cxd4 Also, HLE runs slower than LLE.

oddMLan commented 7 years ago

Problem is N64 depth compare. Reopening Had to remove GlideN64.custom to realize what was wrong.

Reproduced with Mupen64plus + hle rsp + N64 depth compare enabled imagen it doesn't happen with LLE rsp

this is problematic since N64 depth compare is needed for the shadows to cast properly and it's set in GLideN64.custom.ini

oddMLan commented 7 years ago

N64 depth compare works properly in this game with Public Release 2.0. Confirming this is a regression in N64 depth compare

gonetz commented 7 years ago

Please check mupen64plus build.

oddMLan commented 7 years ago

Of what? Public release 2.0?

I tested mupen64plus above with build ccb914e and this time with N64 depth compare enabled (when I said it was correct before, I didn't have it enabled because I didn't have the custom ini with the setting) and it gave the same wrong result.

gonetz commented 7 years ago

Of what? Public release 2.0?

Latest master. I don't see problems with N64 depth compare in that game.

gonetz commented 7 years ago

Also, please check for gliden64.log file. It may contain records about problems with shaders compilation.

oddMLan commented 7 years ago

imagen

all I have in the log is OpenGL Error: invalid enumerant (500)

oddMLan commented 7 years ago

Could commit 7c51f94 make a difference? I don't have a build for that, or am able to build it. I wasn't able to build the latest qt version.

gonetz commented 7 years ago

Is shader cache disabled?

gonetz commented 7 years ago

Could commit 7c51f9 make a difference?

No.

oddMLan commented 7 years ago

Is shader cache disabled?

Yes. I made sure to delete every trace of cache before testing, also.

gonetz commented 7 years ago

I have no more ideas. I'll check it on another PC later.

olivieryuyu commented 7 years ago

I confirm the bug

gonetz commented 7 years ago

I checked with another card, no problems with master build. Default settings + N64 depth compare.

oddMLan commented 7 years ago

Hm... if @olivieryuyu can reproduce... what else could it be? A driver bug? Gonetz, did you test with an AMD card?

Could a debug build of GLideN64 be more verbose about errors?

gonetz commented 7 years ago

Gonetz, did you test with an AMD card?

Win10 AMD, Linux NVidia.

May be it is a special place in the intro? I checked only the first demo, in subway.

gonetz commented 7 years ago

Logan made a fix, which can fix the problem: 07a84a9973176 Test build (disable shader storage before run) : https://drive.google.com/file/d/0B0YqMPjGo3B2VTlmVWt0b0hGSFU/view?usp=sharing

oddMLan commented 7 years ago

No difference for me :( Yes, I made sure to disable shader storage

what about you @olivieryuyu ?

May be it is a special place in the intro?

I just let the men walk in and when the camera shows the whole place it freaks out. If you press start to skip the intro the problem doesn't become apparent inmediately but try moving around and it will eventually freak out too.

AmbientMalice commented 7 years ago

I'm on Nvidia (GTX 750Ti), and having no issues. Maybe this is an AMD issue?

gonetz commented 7 years ago

@oddMLan Have you tried to run other games with N64 depth compare enabled?

oddMLan commented 7 years ago

I've ran about every other game with N64 depth compare seeing no issues. But I had a hunch the issue seems to trigger only when there's a high amount of polygons on screen (Factor 5 games like TWINE are used to do that).

So I thought I could probably run one of the most demanding poly-count wise... guess that seems to be it imagen

... So far I've been only able to see this just on the part when Conker walks into the bar. Nowhere else

gonetz commented 7 years ago

Does Release 2.0 work ok there?

oddMLan commented 7 years ago

Yes it does. It's super slow, but it works accurately

EDIT: Also with N64 depth compare I get like.... no 3D graphics in Majora's Mask. No spinning N64 logo. No spinning mask. Just skybox. In public release 2.0 it works just fine with N64 depth compare... OOT seems to run fine, however wtf is going on

gonetz commented 7 years ago

Hmm. N64 depth compare works ok for some games, and have serious problems with others. Release 2.0 works ok with all these games. I have one guess, please check this build:

https://drive.google.com/file/d/0B0YqMPjGo3B2U2toSFVBYlJPQ0E/view?usp=sharing

oddMLan commented 7 years ago

imagen

imagen

TWINE and Conker fixed, good job!!

oddMLan commented 7 years ago

imagen imagen

However, Majora's Mask won't budge 😟

gonetz commented 7 years ago

It is not a good job. It is just a good guess. I disabled drawer, which uses VBO. Now I need to guess, why VBO drawer causes that problem.

Also, it is interesting, why unbuffered drawer works incorrect with Majora's Mask

oddMLan commented 7 years ago

Maybe it is just a driver issue with my AMD card and @olivieryuyu 's GFX card Can anyone test an Android build to see if the same happens in these scenarios? @fzurita ?

EDIT: Even though you disabled that, Majora's mask it's still glitched for me (made sure driver storage was disabled). That seems to be a different issue, do you get correct graphics in Majora's Mask with N64 depth compare?

gonetz commented 7 years ago

@oddMLan This is an attempt to fix the issue with buffered drawer:

https://drive.google.com/file/d/0B0YqMPjGo3B2bnE4U2paSF9xVFU/view?usp=sharing

Please check it with TWINE and Conker

oddMLan commented 7 years ago

Nooope. Still facing the same messed up polygons 😞

oddMLan commented 7 years ago

Would it help to check my GFX card for any particular OpenGL extension support?

gonetz commented 7 years ago

No. Plugin already does all checks. N64 depth compare enabled only when necessary extensions are supported, otherwise standard depth compare is used. Btw, please be sure that AA is disabled. However, AA can't cause such issues. Something wrong with synchronization between shaders invocations. I put glMemoryBarrier everywhere, but it does not help in your case. I also use AMD card, but don't have that problem. I have no more ideas :(

oddMLan commented 7 years ago

I had everything as close as default settings, so nothing besides N64 depth compare and Shader storage was toggled. I feel bad now for wasting your time, does this mean we've stumbled upon another Heisenbug? I should start testing the VI refactor branch more, though surprisingly I barely see any regressions there. I guess that is worthy of a "good job" compliment 😆

oddMLan commented 7 years ago

Despite all this, none of these polygon freakouts seem to be happening in LLE mode (except for the Majora's Mask issue)... why is that?

olivieryuyu commented 7 years ago

I have a nvidia card gtx 980.

oddMLan commented 7 years ago

Damn! So it's not vendor-specific. Does it looks better or worse than my screenshots, for you?

gonetz commented 7 years ago

I feel bad now for wasting your time, does this mean we've stumbled upon another Heisenbug?

Sort of. At least we found the part of code, which is responsible for that issue. I don't understand what's wrong though. I tested it on gtx 980 too, but only on Linux - works fine. Driver issue? The same issue for NVidia and AMD?

I guess that is worthy of a "good job" compliment

Thanks :) I worked on that branch almost month before make it public.

none of these polygon freakouts seem to be happening in LLE mode (except for the Majora's Mask issue)... why is that?

Can't say for sure. The difference between HLE and LLE on drawer level is that LLE uses glDrawArrays, while HLE uses glDrawElementsBaseVertex. Looks like elements buffer is not updated properly, which resulted in wrong order of vertices to draw and thus that mess.

loganmc10 commented 7 years ago

Can't say for sure. The difference between HLE and LLE on drawer level is that LLE uses glDrawArrays, while HLE uses glDrawElementsBaseVertex. Looks like elements buffer is not updated properly, which resulted in wrong order of vertices to draw and thus that mess.

I would say it's probably a driver bug, except that glDrawElementsBaseVertex is used without N64 Depth Compare as well, so if there was something wrong with executing that command, you would think the issue would be present all the time.

loganmc10 commented 7 years ago

@oddMLan

all I have in the log is OpenGL Error: invalid enumerant (500)

That's not insignificant, there shouldn't be any GL errors. The error can be tracked down using 2 methods:

loganmc10 commented 7 years ago

I was able to reproduce the issue with Windows + Nvidia (mupen64plus). It only happens for a brief second during the intro, at the point where @oddMLan took the screenshot, I don't see any GL errors generated.

fzurita commented 7 years ago

I see that we are using GL_MAP_UNSYNCHRONIZED_BIT in a few places with glMapBufferRange. I think this could be a little dangerous.

loganmc10 commented 7 years ago

GL_MAP_UNSYNCHRONIZED_BIT is only used if ARB_buffer_storage isn't available, which is quite uncommon on desktops nowadays. But ARB_buffer_storage does basically the same thing as GL_MAP_UNSYNCHRONIZED_BIT, there is no protection against overwriting useful data.

At the top of BufferedDrawer.cpp is:

const u32 BufferedDrawer::m_bufMaxSize = 4194304;

Which sets the size of the buffers obviously, once it hits that number, it starts over at 0. I'll increase that number and see if it solves the problem.

loganmc10 commented 7 years ago

Indeed, it seems that changing

const u32 BufferedDrawer::m_bufMaxSize = 4194304;

to

const u32 BufferedDrawer::m_bufMaxSize = 4194304 * 2;

in opengl_BufferedDrawer.cpp solves the problem for me. Perhaps someone else could verify that @oddMLan @olivieryuyu

If that is indeed the case, there are 2 solutions: increase the max buffer size, or implement fence syncing to ensure you don't overwrite useful data. Increasing the buffer size will obviously use more video memory, and fence syncing will likely increase the load on the GPU, but I don't know how much of a difference that makes.

oddMLan commented 7 years ago

@loganmc10 when replaying the apitrace, I get this

D:\Users\Kim\Desktop\M64Py\apitrace>apitrace replay m64py.trace 1849: message: major api error 1001: glEnable parameter <cap> has an invalid enum '0x19262' (GL_INVALID_ENUM) 1849 @1 glEnable(cap = GL_RASTER_POSITION_UNCLIPPED_IBM) 1849: warning: glGetError(glEnable) = GL_INVALID_ENUM Rendered 1212 frames in 3.29663 secs, average of 367.648 fps

I get no errors with PJ64. So probably Mupen64plus (or even M64Py) is at fault here.