GPUOpen-Drivers / AMDVLK

AMD Open Source Driver For Vulkan
MIT License
1.7k stars 160 forks source link

Artifacts when using `gl_ViewIndex` in a non-multiview pass #307

Closed glebov-andrey closed 1 year ago

glebov-andrey commented 1 year ago

Hi, I believe I've found a bug in the handling of the ViewIndex SPIR-V builtin when multiview is disabled.

My application conditionally uses multiview rendering for VR. Shaders use gl_ViewIndex unconditionally, assuming that when multiview is disabled, its value is 0. This is guaranteed by the spec. just above here:

If multiview is enabled in the render pass, this value will be one of the bits set in the view mask of the subpass the pipeline is compiled against. If multiview is not enabled in the render pass, this value will be zero.

When running the application, specifically with AMDVLK drivers on Linux, some render passes produce artifacts with multiview disabled, but not with it enabled (with single-layer framebuffers in both cases). gl_ViewIndex is used in vertex, geometry and fragment shaders to index into uniform buffers and sampler2DArrays. When I replace gl_ViewIndex with 0 in the geometry and fragment shaders, the artifacts disappear. Note: Since AMDVLK does not support multiviewGeometryShader, the pass which uses a geometry shader gets disabled when multiview is enabled.

I've checked all of this with an up-to-date version of AMDVLK built from source (with assertions and ASAN). Validation layers don't complain about anything. RADV and the Windows driver with the same GPU are Ok, and so is NVIDIA. The artifacts look like random patches of geometry rendered across the viewport in random patterns or stripes. When viewing a frame capture in RenderDoc, the artifacts persist but are random from replay to replay. The artifacts appear over regions which failed the depth test.

My best guess, based on skimming over the driver sources, is that LLPC doesn't replace the ViewIndex builtin with 0 for geometry and fragment shaders. I found the following code which does this for vertex and mesh shaders (in another file), but nothing similar for other stages: https://github.com/GPUOpen-Drivers/llpc/blob/4ec1290558efeb1c90bacc21c3b4ecd0cc8244ee/lgc/patch/PatchInOutImportExport.cpp#L2090-L2095 Of cource, this could be handled some other way which I haven't found.

The other reason this seems like a reasonable explanation, is the assembly for one of the fragment shaders. Firstly, the assembly of the fragment shader with gl_ViewIndex is identical, whether or not multiview is enabled, while the vertex shader changes significantly (specifically around load offsets). Secondly, the assembly changes when I replace gl_ViewIndex with 0.

Here is the assembly output of one of the fragment shaders. The only thing it does, is loads a value from a uniform buffer at an offset based on gl_ViewIndex and writes that to an attachment. And this is enough to cause artifacts.

_amdgpu_ps_main: 
    s_mov_b32 m0, s4                                                                                    // 000000001500: BEFC0304                                                                           
    s_mov_b32 s8, s2                                                                                    // 000000001504: BE880302                                                                           
    v_interp_mov_f32_e32 v0, p0, attr0.x                                                                // 000000001508: C8020002                                                                           
    s_mov_b32 s11, 0x21014fac                                                                           // 00000000150C: BE8B03FF 21014FAC                                                                  
    s_mov_b32 s10, -1                                                                                   // 000000001514: BE8A03C1                                                                           
    s_and_b32 s9, s3, 0xffff                                                                            // 000000001518: 8709FF03 0000FFFF                                                                  
    v_lshlrev_b32_e32 v0, 9, v0                                                                         // 000000001520: 34000089                                                                           
    buffer_load_dwordx3 v[0:2], v0, s[8:11], 0 offen offset:48                                          // 000000001524: E03C1030 80020000                                                                  
    s_waitcnt vmcnt(0)                                                                                  // 00000000152C: BF8C3F70                                                                           
    v_cvt_pkrtz_f16_f32_e32 v0, v0, v1                                                                  // 000000001530: 5E000300                                                                           
    v_cvt_pkrtz_f16_f32_e64 v1, v2, s0                                                                  // 000000001534: D52F0001 00000102                                                                  
    exp mrt0 v0, v0, v1, v1 done compr vm                                                               // 00000000153C: F8001C0F 00000100                                                                  
    s_endpgm

For reference:

If the issue turns out to be more complicated, then I can prepare a simplified reproducer and a GFXReconstruct capture of it, or something.

amdrexu commented 1 year ago

Thank you. We will take a look at this issue. Currently, we use a generic output to export view index to fragment shader. This is unnecessary. View index could be obtained in fragment shader directly via system value input (a user data SGPR). We will rework this.

glebov-andrey commented 1 year ago

Currently, we use a generic output to export view index to fragment shader.

Ok, in that case the fragment shader assembly makes sense. But it would then suggest that the vertex shader is either writing the wrong values to that output, or writing them to the wrong location.

Is there any more information I could provide that would help?

amdrexu commented 1 year ago

Is there any more information I could provide that would help?

If you can attach a simplified pipeline (including your VS+GS+FS) in this ticket, that would be fine to us.

glebov-andrey commented 1 year ago

Do you mean the GLSL source or SPIR-V? And would a GFXReconstruct capture be better for completeness?

amdrexu commented 1 year ago

Do you mean the GLSL source or SPIR-V?

Yes, the simplified GLSL sources of VS+GS+FS would be enough for us to investigate it.

glebov-andrey commented 1 year ago
// .vert
layout(location = 0) in vec2 vertex;
void main() { gl_Position = vec4(vertex, 1.0, 1.0); }

//.frag
#extension GL_EXT_multiview : require
layout(location = 0) out vec4 normal;
void main() { normal.xyz = vec3(1.0, 0.0, float(gl_ViewIndex)); }

Shaders compiled with glslc v2022.3 with -O -std=460 --target-env=vulkan1.3.

Rough pipeline state:

amdrexu commented 1 year ago

Got it. Thank you.

amdrexu commented 1 year ago

Try to fix your problem: https://github.com/GPUOpen-Drivers/llpc/pull/2162

glebov-andrey commented 1 year ago

Thank you, that fixed it - the artifacts are gone.