microsoft / DirectXShaderCompiler

This repo hosts the source for the DirectX Shader Compiler which is based on LLVM/Clang.
Other
3.11k stars 697 forks source link

SPIRV validation error when using min16 types get d #6914

Open castano opened 2 months ago

castano commented 2 months ago

Description Compiling one of our HLSL shaders with the following command line:

dxc.exe spark_bc1_rgb_q0.mp.i2i.comp.hlsl -nologo -T cs_6_0 -spirv -fspv-target-env=vulkan1.0 -DVULKAN -Fo spark_bc1_rgb_q0.mp.i2i.comp.hlsl.dxc.spv

Produces this error:

fatal error: generated SPIR-V is invalid: ID '51' decorated with RelaxedPrecision multiple times is not allowed.
  %51 = OpExtInst %float %1 NMax %49 %50

Which seems to be caused by the following spirv validation code:

https://github.com/KhronosGroup/SPIRV-Tools/blob/4451f6ab13dda98bf255a7cd7b4d120132dc0dfd/source/val/validate_decorations.cpp#L1562

I know this report would be much more useful with source code to reproduce the problem, but I'm not able to publish this shader openly and I haven't been able to produce a minimal repro yet.

Before I start debugging this myself, it would be helpful to know if there's any command line option to have DXC output the spv file even though validation failed. Inspecting the spv file may be helpful to understand what part of the code is causing the issue and possibly build a simplified repro.

Environment

s-perron commented 2 months ago

You can add the -Vd to skip validation. This should emit the Spir-V

s-perron commented 2 months ago

To generate a reproducer, you could also try -fcgl. This will turn off all optimizations. If this passes validation, you can run the optimizations from the command line and use spirv-reduce to get a small reproducer.

I'll give more details on Monday.

castano commented 2 months ago

Hi Steven.

This is very useful, thanks!

I can tell exactly what statement is introducing the error now, but it's a bit of a mystery why this is happening, because it's present in all variants of the same shader and not all of them have this issue.

    norm = max(max(abs(r), abs(g)), abs(b));
         %53 = OpExtInst %float %1 FAbs %41
         %54 = OpExtInst %float %1 FAbs %46
         %55 = OpExtInst %float %1 NMax %53 %54
         %56 = OpExtInst %float %1 FAbs %51
         %57 = OpExtInst %float %1 NMax %55 %56

The duplicate decoration appears conspicuously out of place:

       OpDecorate %210 RelaxedPrecision
       OpDecorate %57 RelaxedPrecision
       OpDecorate %211 RelaxedPrecision

I tried with -fcgl and that seems to eliminate the validation error, but triggers an exception:

SPIRV-Cross threw an exception: RayQuery requires Vulkan GLSL 460.

However, -O0 also triggers the same exception and also has the validation error.

I've also confirmed that the problem still exists in the latest dxc release:

dxcompiler.dll: 1.8 - 1.8.2407.7 (416fab6b5); dxil.dll: 1.8(101.8.2407.12)

Digging a bit more, it seems the problem is caused by that statement being in an unrolled loop where the iteration count depends on an expression that's known at compile time.

    if (fast) iteration_count = 1;
    [unroll] for (int i = 0; i < iteration_count; i++) {
    }

Removing the if (fast) statement or the [unroll] attribute fixes the problem.

I've got a workaround that avoid the issue, but I'd be happy to dig more in order to get to the root of the problem.

s-perron commented 2 months ago

Here are the step I would take to reduce the tests case:

  1. Dump the unoptimized spir-v from DXC. This is described above, and see to me working.
  2. Identify the pass and the input to the pass that generates the invalid code.
spirv-opt t.spv -o o.spv --validate-after-all --print-all 2>&1 | tee log  ; t.spv is the output from step 1.

You can look at the last line of the form ; IR before pass <pass>. The spir-v that follows is the input to the failing pass and failing pass is <pass>.

  1. Move the failing input into a new spir-v assembly file and assemble it:
spirv-as fail.spvasm -o fail.spv --target-env=<your environment>
  1. Write a script that will return 0 if the given spir-v fails if the same error occurs:
echo 'spirv-opt $1 -o o.spv --<pass> && spirv-val o.spv | grep "decorated with RelaxedPrecision multiple times"' > b.sh
  1. Run spirv-reduce:
spirv-reduce fail.spv -o reduced.spv -- ./b.sh

The final reduced.spv, should be a much smaller test case that hopefully you can provide me. I should be able to get a fix from that.

castano commented 2 months ago

Here's a reduced repro: bugreport.zip

Let me know if you need anything else!

s-perron commented 2 months ago

This is a bug in the spirv-opt unroller, so it should be fixed. There is a workaround in most cases (do not mark the loop as unroll), so we will probably not get to it this release. I'll move to the next right away.