icculus / mojoshader

Use Direct3D shaders with other 3D rendering APIs.
https://icculus.org/mojoshader/
zlib License

SPIR-V Testing Thread #15

Closed flibitijibibo closed 4 years ago

flibitijibibo commented 5 years ago

Post SPIR-V test results here. Use this archive to test. Some games may be super old and will not work even with stock FNA releases, so don't expect much from games that haven't been updated in 4+ years. Repo containing SPIR-V changes is here.

Environment variables of interest:

Template/Example Report:

OS: Fedora 30 x86_64
Graphics: NVIDIA GTX 770, version 430.26
Games Tested:

Breakages (with logs):


Platinum:

Gold:

Known not to work:

Untested:

TheSpydog commented 5 years ago

OS: Windows 10 Pro x86_64
Graphics: NVIDIA GeForce GTX 1060, version 431.60
Games Tested:

flibitijibibo commented 5 years ago

FEZ Windows is probably out of date, even on beta... Linux is the only working build for now.

Simply Chess is probably in the same boat as Murder Miners, will have to compare the asm to be sure.

NVIDIA debug mode does produce perf warnings for shader recompiles, so that’s normal. Release versions skip the debug context flag.

TheSpydog commented 5 years ago

OS: Windows 10 Pro x86_64
Graphics: Intel HD Graphics 620, version 26.20.100.7000
Games Tested:

Breakages:

flibitijibibo commented 5 years ago

Chasm I believe now depends on the accurate behavior, disabling spirv will verify this. If not I’ll check with James...

As for the other two... looks like we’ve got some gooduns, @krolli!

krolli commented 5 years ago

I pushed the vpos thing, which should help Chasm (its CRT effect was among those changed by this fix). While doing that, I also noticed vface was just wrong (if generated, it didn't pass validation). Because of some dubious logic for determining whether an input is vpos or vface, it went unnoticed until I looked into vpos.

I'll take a look at Simply Chess next, since it's free. What about the others? We can either do RenderDoc captures, or I can just grab them from Steam as I get to them.

flibitijibibo commented 5 years ago

Let's try for RenderDoc captures by default, but depending on how busted things are rdoc might crash, so that'll be when we have to dig up the game for real...

Will update with the VPOS fixes later today.

TheSpydog commented 5 years ago

Just tested with the latest.

The Chasm CRT Effect doesn't have a flipped viewport anymore, but now it's got a different problem. This is me toggling between the off/on settings (image: chasm_intel_new_spirv).

EDIT: I've attached a zip containing two captures, one without the CRT effect and one with it. (I'm not 100% sure if these contain all the necessary data, since all the captured textures are showing up as solid black on my PC. Might be a bug with RenderDoc?) These were captured with the latest nightly build, 8/18.

chasm_spirv_captures.zip

flibitijibibo commented 5 years ago

fnaSPIRV.tar.bz2 has been updated with the latest, sorry that's late...

krolli commented 5 years ago

Hmm, I went through much of the glsl and spirv version of crt shader (though it's pretty damn long and mind-numbingly tedious to compare) and I didn't notice anything different. Guessing from what it looks like, the y coordinate for sampling of render target (is that what it's doing?) is either outside of 0-1 range, or it's same value for all fragments in a column (possibly 0). Maybe the vposFlip uniform is not set correctly.

Unfortunately, those captures seem to have zeroes everywhere, including uniforms. Can you try with stable release of RenderDoc? It certainly seems like a bug and I would be rather suspicious of nightly builds.

krolli commented 5 years ago

Looking at the capture you sent me a while ago (which has resource data as well), it seems that render target sampler has wrap mode set to CLAMP_TO_EDGE. It definitely seems to be something to do with Y adjustment of vpos. My primary suspect is definitely vposFlip uniform, probably not set at all.
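
To make the suspicion concrete, here is a minimal Python sketch of what an unset flip uniform would do. It assumes the adjustment is applied as `y * flip.x + flip.y`, which is an assumption about the emitted shader and not something confirmed in this thread; `apply_vpos_flip` and the 720-pixel height are hypothetical.

```python
def apply_vpos_flip(frag_y, flip):
    """Adjust a fragment's window-space y with a (scale, offset) pair.

    Assumed form of the adjustment: y * flip.x + flip.y.
    """
    scale, offset = flip
    return frag_y * scale + offset


height = 720.0                 # hypothetical render target height
rows = [0.5, 100.5, 719.5]     # sample fragment y coordinates

identity = [apply_vpos_flip(y, (1.0, 0.0)) for y in rows]      # no flip
flipped = [apply_vpos_flip(y, (-1.0, height)) for y in rows]   # upside-down
unset = [apply_vpos_flip(y, (0.0, 0.0)) for y in rows]         # uniform never set
```

With a zeroed uniform every row collapses to the same y, which would sample one line of the render target for the whole column.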

krolli commented 5 years ago

@TheSpydog I found a possible cause of the CRT effect going haywire and pushed the fix. The location for the vpos uniform was not offset to account for locations taken up by the vertex shader, so it was probably filled with incorrect values. Hope it works this time.
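
A rough Python sketch of the offsetting idea, not the actual emitter code. It assumes each uniform occupies exactly one location, and the uniform names in the example call are purely illustrative; `assign_uniform_locations` is a hypothetical helper.

```python
def assign_uniform_locations(vertex_uniforms, pixel_uniforms):
    """Give every uniform in a linked program a unique location.

    Pixel shader locations start after the ones taken by the vertex
    shader. Simplification: one location per uniform.
    """
    locations = {}
    for i, name in enumerate(vertex_uniforms):
        locations[("vs", name)] = i
    base = len(vertex_uniforms)  # first free location after the vertex stage
    for i, name in enumerate(pixel_uniforms):
        locations[("ps", name)] = base + i
    return locations


# Illustrative names only.
locs = assign_uniform_locations(
    ["vs_uniforms_vec4", "vpFlip"],
    ["ps_uniforms_vec4", "vposFlip"],
)
```

Without the `base` offset, the pixel shader's uniforms would reuse locations 0 and 1 and receive whatever was uploaded for the vertex stage.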

krolli commented 5 years ago

Also, when I wrote a simple test that used the shader, I got "GL_INVALID_OPERATION error generated. Wrong component type or count." Maybe this is the same issue as you had on Simply Chess when you clicked on something.

TheSpydog commented 5 years ago

Cool, just tested and Chasm is back to being merely upside-down in crt mode. So that's progress at least. Also coaxed a little more info out of Simply Chess. When I click on a chess piece, this pops up:

Assertion failed!
Program: mojoshader.DLL
mojoshader_opengl.c
Line 614
Expression: table->samplers[idx].offset

krolli commented 5 years ago

Pushed fix for the assert.

As for CRT, I am at a loss. Strange thing is, the capture you sent me a while ago (before vpos) which was supposed to have the upside-down problem, shows up correctly in RenderDoc for me. At least, the final result is correct. Most of the rendering is actually done upside-down and only at the end it all gets flipped back.

chasm-mid chasm-end

flibitijibibo commented 5 years ago

That makes some sense given the wacky coordinate system stuff, but I swear we sync'd this up with the latest vpos fix... if FNATEST_DISABLE_SPIRV=1 produces the correct image it's in the emitter, otherwise it's in Chasm (I'll go bother James about it).

krolli commented 5 years ago

Those screenshots are from month-or-so-old captures. Not sure what it looks like right now, since newer captures pretend all resources are just zeroes.

BTW, any idea whether Simply Chess is supposed to work with FNA_OPENGL_FORCE_CORE_PROFILE? I keep getting GL error even with GLSL.

System.InvalidOperationException
  HResult=0x80131509
  Message=GL_INVALID_OPERATION error generated. Invalid VAO/VBO/pointer usage.
    Source: GL_DEBUG_SOURCE_API
    Type: GL_DEBUG_TYPE_ERROR
    Severity: GL_DEBUG_SEVERITY_HIGH
  Source=FNA

flibitijibibo commented 5 years ago

Looks like they use client arrays, so that one's out for Core Profile testing unfortunately :/

TheSpydog commented 5 years ago

FNATEST_DISABLE_SPIRV=1 does indeed produce the correct image for Chasm CRT. I've attached another capture (using RenderDoc 1.4 stable). Maybe this time it will show something interesting?

chasm_spirv.zip

krolli commented 5 years ago

Sadly, nothing. I can see source textures, but all uniforms are just zeros, which (probably because of broken transform) causes all draw calls to collapse into single point and render nothing. I see a bunch of errors and warnings in log (many times repeated):

RDOC 013552: [06:50:53]        gl_common.cpp( 706) - Warning - FBOs are shared on this implementation
RDOC 013552: [06:51:00]        gl_postvs.cpp( 439) - Error   - Failed to fix-up. Link error making xfb vs program: Link info
RDOC 013552: [06:51:00]        gl_postvs.cpp( 439) - Error   - ---------
RDOC 013552: [06:51:00]        gl_postvs.cpp( 439) - Error   - error: Varying (named vs_o0) specified but not present in the program object.
RDOC 013552: [06:51:35]        gl_replay.cpp( 339) - Warning - Attempting to read off the end of the buffer (12 196608). Will be clamped (196608)
RDOC 013552: [06:51:35]        gl_replay.cpp( 339) - Warning - Attempting to read off the end of the buffer (16 196608). Will be clamped (196608)

I don't really know whether this is something related to spirv emitter, or some other bug causing RenderDoc to fail to capture uniforms. I can only humbly ask for help from @baldurk.

baldurk commented 5 years ago

That's a fun one. I think @krolli you're using RenderDoc v1.4 which didn't properly handle SPIR-V GL shaders in some situations like fetching mesh output, because it relied on driver reflection and GL transform feedback instead of patching the SPIR-V by hand. I made a change a little while ago to fix that.

The uniforms are 0 because the capture was made on a recent build of RenderDoc where unfortunately I broke SPIR-V reflection of loose global uniforms (i.e. not in a UBO). I've pushed a fix that should get it working again, but you'll need to recapture and replay on the next nightly build to get everything working now.

krolli commented 5 years ago

Well, that would explain a lot. I guess we have another round of capturing ahead of us. :)

TheSpydog commented 5 years ago

Here's the Chasm CRT and Celeste dash effects captured with the latest nightly build:

chasm_celeste_captures.zip

krolli commented 5 years ago

So, Chasm still looks correct on my end, though it seems rather dark (image: chasm).

Can you open the capture on PC where it's wrong and see whether final result is flipped upside down? Also check whether first color pass ends at EID 52 with the screen upside-down or correct. For me, it is upside-down and gets flipped back in third color pass (at EID 167). All draw calls in capture have vpFlip uniform in vertex shader set to -1, except EID 99 and 167, where it is set to 1. Also, only EID 81 (CRT effect I assume) uses ps_vPosFlip which is set to vec2(1.0, 0.0).

Celeste, on the other hand, looks pretty broken for me as well, though in a different way than your gif earlier. It would help a lot to have somewhat similar capture from glsl run of the game for comparison.

There is a bunch of API use errors about vertex buffers not being bound at EIDs 16, 21, 26 and 44 which then seem to not do anything. I don't think this is related to spirv emitter (code for binding VBs is the same as glsl), but whether it is RenderDoc bug or Celeste I don't know.

Then, there is some unusual stuff happening at EID 145 and 223. In both cases, there are two textures bound to the pixel shader, but the Pipeline State tab only shows one. Also, the call stream in API Inspector shows that two different textures were bound, but Texture Viewer shows the same thing in both slots. I wonder whether RenderDoc's confusion stems from some error in the emitted spirv, or it's a bug in RenderDoc itself. EID 145 seems to be doing distortion using what looks like a normal map bound in slot 1. Assuming there is some issue with this binding and the shader instead computes distortion using whatever is bound in slot 0, it might explain the weird mess this produces.

celeste-distort

I will try to figure out what might be causing the problem, but @TheSpydog, if you could do a capture from Celeste that uses GLSL emitter for comparison, it would help a lot.

TheSpydog commented 5 years ago

Interesting. On my PC it is upside-down at EID 52, flips to its correct orientation at EID 81, and then gets flipped again (back to being upside-down) at EID 167. Also worth noting that at EID 81, ps_vposFlip is (0.0, 0.0), not (1, 0). And vpFlip is showing up as 0.00 for all EIDs.

chasm_spirv_vpflip

For Celeste, I noticed that the graphics glitch still happens even with FNATEST_DISABLE_SPIRV=1. But it doesn't occur in the latest stock fnalibs distribution, so I don't know what's up there. Anyway, here's a capture of the correct glsl version:

celeste_glsl_dash.zip

baldurk commented 5 years ago

The vertex buffer errors are due to use of deprecated compatibility profile functionality - before those draws with the errors you can see calls to glVertexAttribPointer with no buffer and an offset of 0xFFFFFFFFDEADBEEF. That's what RenderDoc serialises when you use client memory vertex pointers (i.e. no vertex buffer, just drawing straight from CPU memory pointers).

The texture binding issue is a little weirder. It seems like your shaders do not specify the texture binding in the SPIR-V as I'd expect, and instead set a location and rely on glUniform1i or something like that to set the binding based on a fixed location. This seems to work but I'm really not sure if it's valid or intended; the spec leaves this as kind of a grey area:

  1. How do we change the location of samplers and images?

    RESOLVED. You don't. Well you can using Uniform1i as usual, but you have no way of knowing which location corresponds to which sampler or image variable as GetUniformLocation won't work without a named variable. Best to just treat them as immutable bindings which are specified in the shader source or binary.

Which seems to imply that it might be legal to do this, but at the same time casts doubt on whether it's intended to be able to annotate samplers with a location.

At the moment RenderDoc ignores setting sampler uniform values for SPIR-V shaders. If I let the code set them for all shaders then the capture loads looking more correct. I'm really hesitant to do that though, in case it breaks on another implementation (I'm using nvidia, which is notoriously forgiving with GL).

krolli commented 5 years ago

Are you using your own build of FNA or one from archive at the top? I don't think FNATEST_DISABLE_SPIRV is in those sources.

Anyway, I noticed something that might have been the cause of RenderDoc showing only one texture. Who knows, it may have caused other issues as well. I pushed the change so if you have time, could you try it out and see if it helps Celeste? Even if the game is still broken, a RenderDoc capture from this version might be in better shape.

krolli commented 5 years ago

Ah, github and its page refreshes ... :(

I was looking at the same thing and I figured adding bindings is simple for me so I did it. Just as you, I am a bit unclear on this whole thing. I am pretty sure location is mandatory for inputs, outputs and uniform variables, but samplers are a bit weird in that regard. You definitely can't query them by name (as names can be stripped from spirv) and judging by my previous experience, not specifying location will just cause driver to assign same location to everything.

Simple fix I pushed just now should assign unique bindings and locations to samplers within pixel shader, but there is a risk of conflicting bindings between vertex and pixel shaders. I don't think Celeste uses vertex textures so it should be fine for testing it and if it works, I will just make bindings unique across whole program (same as locations are right now).
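
To show the conflict risk, here is a small Python sketch comparing per-stage numbering with program-wide numbering. Both helpers are hypothetical and only model the bookkeeping, not the actual SPIR-V decorations.

```python
def per_stage_bindings(vertex_samplers, pixel_samplers):
    """Number samplers from 0 within each stage (can collide across stages)."""
    bindings = {}
    for stage, samplers in (("vs", vertex_samplers), ("ps", pixel_samplers)):
        for binding, index in enumerate(samplers):
            bindings[(stage, index)] = binding
    return bindings


def program_wide_bindings(vertex_samplers, pixel_samplers):
    """Number samplers with one counter shared by the whole program."""
    bindings = {}
    next_binding = 0
    for stage, samplers in (("vs", vertex_samplers), ("ps", pixel_samplers)):
        for index in samplers:
            bindings[(stage, index)] = next_binding
            next_binding += 1
    return bindings


# One vertex texture and two pixel textures (sampler register indices).
staged = per_stage_bindings([0], [0, 1])
unique = program_wide_bindings([0], [0, 1])
```

In the per-stage case the vertex sampler and the first pixel sampler both end up on binding 0; a shared counter avoids that, at the cost of bindings that depend on which shaders are linked together.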

Hmm, glVertexAttribPointer issue sounds like something that should be dealt with by enabling FNA_OPENGL_FORCE_CORE_PROFILE. Didn't know RenderDoc even works with the other FNA OpenGL device. It definitely doesn't for me, when trying to debug Simply Chess. :(

flibitijibibo commented 5 years ago

FNA doesn’t do anything special for client arrays, since I usually expect ARB_debug_output to throw the error for Core. A temp hack could be added to make a staging buffer for client array input, but it might get ugly...

TheSpydog commented 5 years ago

Tried Celeste with the latest. Doesn't seem to make a difference. Attached an updated capture.

celeste_spirv_dash_updated.zip

krolli commented 5 years ago

@flibitijibibo Nah, I didn't mean to suggest doing anything about it. I don't know that much about how GL devices are implemented in FNA and what is exposed by original XNA API, so this was just my misunderstanding.

@TheSpydog Well, capture is looking correct when I view it in RenderDoc (barring issue with client arrays), so I guess that's at least some improvement. Distortion effect is there too, although capture was almost at the end of it so it is just barely visible. When you open the capture, can you see some point at which it breaks down? I wish I had more hardware to test on than damn idiot-proof nvidia ...

TheSpydog commented 5 years ago

Agh, thought I had more of the distortion effect in the capture, sorry about that. EID 193 is where it goes screwy.

krolli commented 5 years ago

I found out SINCOS opcode had swapped sine and cosine values. It also used OpUndef to fill in values that (in theory) shouldn't matter. I have seen OpUndef cause driver crashes so I replaced it with zeros just to be sure. Since SINCOS is used in distort effect, it might help with the problem, so feel free to try it when you have time.
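
For reference, a tiny Python sketch of the intended result layout. It assumes the Direct3D convention that the x component receives the cosine and y the sine; `sincos` is a hypothetical stand-in for the emitted code, and the zero fill mirrors the OpUndef replacement described above.

```python
import math


def sincos(angle):
    """Return the four destination components of a SINCOS-style op.

    x holds the cosine and y the sine (assumed D3D convention). z and w
    are not meaningful, so they are zero-filled rather than left undefined.
    """
    return (math.cos(angle), math.sin(angle), 0.0, 0.0)


at_zero = sincos(0.0)
at_quarter_turn = sincos(math.pi / 2)
```

A swapped version would return sine in x, so an angle of 0 would yield x = 0 where 1 is expected, which is enough to scramble any rotation or distortion built on it.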

TheSpydog commented 5 years ago

That fixed the issue! Great work. :smile:

krolli commented 5 years ago

Great, glad to hear that. Still more bugs to be squashed though. :)

krolli commented 5 years ago

@baldurk I'm going over the ARB_gl_spirv again looking for anything regarding bindings and locations of uniforms and I found this (in section "7.4.spv, SPIR-V Shader Interface Matching"):

Uniform and shader storage block variables must also be decorated with a Binding.

I think it's safe to say that this was a mistake on our side and you have it correct.

edit: Ok, I guess we've hit exactly the case that question you quoted considers impossible, or at least unlikely. I was confused a bit by wording, since it said "change the location of" but it means "change the binding of". They argue that there is no way of knowing the location required to change binding without querying it by name, but in our case we do have a way of knowing the location, since we assign it ourselves. However, it also seems that glslangValidator (at least version 7.11.3113 that I have) doesn't require location on samplers, only binding. It may be that drivers will silently generate unique location for each sampler in program when linking it, then assign binding to that location based on what is found in decorations, but still allow user to change the binding value if they somehow got the location right.

TheSpydog commented 5 years ago

All right, after a lot of testing this weekend I can confirm that Reus, Dust: AET, Wizorb, Capsized, and TowerFall all seem to work without issue on my Intel HD Graphics machine.

I also tested Apotheon, which boots but just displays a black screen with some version text at the top. @krolli I sent you a DM on Discord with the captures and the shader files.

krolli commented 5 years ago

Well, this looks like another case of "works on my machine" capture. Can you take a look and see if you spot anything that goes wrong? I can clearly see what results should look like at EID 125 (end of Color Pass #5) just without lighting yet. After that there are a few more passes that apply lighting and some postprocesses. Both GLSL and SPIRV captures look the same.

Also ran all the shaders through testparse and none of them produced any error or warning.

TheSpydog commented 5 years ago

Everything looks fine up through EID 125. Color Pass 6 is where it seems to go wrong. I also get two issues in RenderDoc:

  1. EID 149: Incorrect API Use | No vertex buffer bound to attribute vs_v0 (buffer slot 0) at draw! This can be caused by deleting a buffer early, before all draws using it have been made
  2. Same as above but with vs_v1 (buffer slot 1) instead.

flibitijibibo commented 5 years ago

Apotheon’s a SpriteBatch game, so it should be on a (basically global) VBO. Why it would think it’s disposed I’m not sure.

Cryptark is pretty similar in nature so traces of both will probably act the same.

flibitijibibo commented 5 years ago

fnaSPIRV.tar.bz2 has been updated with the latest FNA/FAudio and today's SPIR-V work.

TheSpydog commented 5 years ago

Whoops, accidentally closed the issue! What I meant to say is that I made a couple discoveries about Apotheon…

1) The warning messages about "no vertex buffer bound" are present for me in the glsl capture as well, so they're probably not relevant.
2) I noticed this when comparing the spirv and glsl captures:

apotheon

(spirv on the left, glsl on the right.) Could the swapped inputs be at fault somehow?

krolli commented 5 years ago

They certainly could. I wanted to get a better look at how sampler binding works in mojoshader/FNA to see if the default binding I put into SPIRV could be improved. Weirdly, it looks like ps_s0 and ps_s1 have same textures bound. It might be the order in which they are reported from reflection APIs that doesn't match between the two. I think RenderDoc uses various OpenGL calls for reflection with GLSL, but does its own parsing with SPIRV.

flibitijibibo commented 5 years ago

I would expect some other titles to be affected by that if it was the emitter (anything with BloomCombine/BloomExtract/GaussianBlur should use 2 samplers, for example). A possible test would be a shader that uses 3+ samplers all at once and see how RenderDoc orders the inputs.

flibitijibibo commented 5 years ago

Updated fnaSPIRV.tar.bz2 one more time, this time it's a clean master build, which has some new changes. In addition to FORCE_CORE_PROFILE defaulting to 4.6 (this was already done in the test binary), there's now an official environment variable for shader profiles, FNA_GRAPHICS_MOJOSHADER_PROFILE. Set it to spirv or glsl120 (OpenGLDevice defaults to glsl120, ModernGLDevice defaults to spirv if MojoShader has it).
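
The selection rule described above can be sketched roughly like this in Python. This is only a model of the described behavior, not FNA's code; `pick_shader_profile` is a hypothetical helper, and the case where ModernGLDevice has no SPIR-V support is left open because it is not stated here.

```python
import os


def pick_shader_profile(device, spirv_available, env=None):
    """Pick a MojoShader profile name following the rule described above."""
    env = os.environ if env is None else env
    override = env.get("FNA_GRAPHICS_MOJOSHADER_PROFILE")
    if override:
        return override              # explicit choice wins, e.g. spirv or glsl120
    if device == "OpenGLDevice":
        return "glsl120"
    if device == "ModernGLDevice" and spirv_available:
        return "spirv"
    return None                      # not specified in this thread


default_gl = pick_shader_profile("OpenGLDevice", True, env={})
default_modern = pick_shader_profile("ModernGLDevice", True, env={})
forced = pick_shader_profile(
    "ModernGLDevice", True, env={"FNA_GRAPHICS_MOJOSHADER_PROFILE": "glsl120"}
)
```

Forcing glsl120 on an otherwise SPIR-V run is a quick way to tell whether a breakage comes from the emitter or from somewhere else.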

flibitijibibo commented 5 years ago

Escape Goat 2 now works! Not sure what fixed it but it seems fine now.

Murder Miners has one shader not compiling, HologramBlockEffect. It appears to just be using NORMAL0 as the attribute that freaks it out. If you take a sample shader and arbitrarily change an input to be NORMAL0 it should fail to emit in spirv mode.

EDIT: Seems like we just need to support arbitrary attributes, rather than just the builtins.

flibitijibibo commented 5 years ago

Tried out Flotilla. It boots, but stars are missing (PointSize) and performance is in the single digits! I'm also seeing lots of performance warnings that aren't present in the glsl120 output. It may just be a case where an Exception is getting thrown and it's causing lots of CPU perf issues unrelated to what the SPIR-V output is.

flibitijibibo commented 5 years ago

Super dumb question: I only just noticed that in spirv the vposFlip name is actually “ps_vPosFlip”. If we change that to “vposFlip” does that fix Chasm?

(For some reason I’m now wondering if that flip should be in the FLIP_RENDERTARGET define with the unmodified builtin value as the else case...)

krolli commented 5 years ago

Spirv names shouldn't matter as far as I know. It is valid for these to be stripped from shader, so we shouldn't use any function that queries things by name. That said, changing it is simple to try and I actually thought the glsl name had ps_ prefix.

krolli commented 5 years ago

Rename is in, along with some simple support for NORMAL semantic. Murder Miners should now work.

I had a look at SM_30 attributes and it may be a bit more work to do proper support for it. As I understand it, basically any non-built-in semantic can have index up to 7. Currently, there is direct mapping semantic+index -> location, but assuming 8 indices for almost all of them would blow up the number of locations with most of them unused. I could assign them dynamically, but then values will depend on the order in which they appear in `dcl` instructions, which doesn't have to match between vertex and pixel shader. I guess this will also have to be patched during linking, same as uniforms. :(
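
A short Python sketch of the trade-off, under the assumption of 8 indices (0 to 7) per semantic. The semantic slot numbers, helper names and `dcl` orders below are made up for illustration.

```python
MAX_INDEX = 8  # indices 0..7 per semantic, as described above


def direct_location(semantic_slot, index):
    """Fixed mapping: every semantic reserves MAX_INDEX locations."""
    return semantic_slot * MAX_INDEX + index


def dynamic_locations(dcl_order):
    """Assign locations in the order attributes are declared."""
    return {attr: loc for loc, attr in enumerate(dcl_order)}


# Fixed mapping: the third semantic slot with index 7 already needs location 23.
sparse = direct_location(2, 7)

# Dynamic mapping: the same attributes declared in a different order per stage.
vs_out = dynamic_locations([("TEXCOORD", 0), ("NORMAL", 0), ("COLOR", 0)])
ps_in = dynamic_locations([("COLOR", 0), ("TEXCOORD", 0), ("NORMAL", 0)])
mismatched = [attr for attr in vs_out if vs_out[attr] != ps_in[attr]]
```

The fixed mapping always agrees between stages but wastes locations; the dynamic one is compact but here every attribute lands on a different location in each stage, hence the need to patch at link time.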

Flotilla is interesting. I checked the code and PointSize built-in appears to be handled, though maybe not correctly. I just found one Psychonauts shader that uses it and I think the problem is that spirv declares it as vec4 (as almost every other register) while it should only be float. I will take a look at it later and let you know when it's in.

TheSpydog commented 5 years ago

No luck with Chasm unfortunately. Murder Miners does indeed boot now, but the main menu background is missing. It's just black in the background on spirv. Looks like an issue with color pass 1 from my brief investigation. Attaching glsl/spirv captures for comparison.

murderminers.zip