Problematic `switch(0u): default` branch idiom in merge return pass

hugoam commented 4 years ago

I have opened an issue about this in glslang, before figuring out it's actually due to a recent change in SPIRV-Tools https://github.com/KhronosGroup/glslang/issues/2148

Let me copy the initial issue here under The only additional information I've uncovered since, is that this is indeed due to a quite recent change in how the merge return pass is implemented: https://github.com/KhronosGroup/SPIRV-Tools/commit/e7afeb060e1637ca964c62517c3e6c225704b7eb which was undertaken under this issue https://github.com/KhronosGroup/SPIRV-Tools/issues/3127

It seems glslang has come up with (or increased the frequency of), sometimes in the previous months, I think, a rather peculiar pattern which looks something like this (in pseudo-code since the output is actually SPIR-V):

    vec4 _801;
    switch (0u)
    {
        default:
        {
            if (/* ... */)
            {
                _801 = /* assign a value */;
                break;
            }
            _801 = /* assign another value */;
            break;
        }
    }

This seems to happen in the HLSL to SPIR-V path at least, when inlining a function that conditionally returns different possible values, although I couldn't reduce my sample far enough to isolate exactly when that happens. Is there a good specific reason for this peculiar idiom ?

Albeit not incorrect, this kind of code is problematic in the case of D3D related pipelines. When fed to the D3D shader compiler, such code, having gone through glslang at some point, turns out to be highly problematic, as the D3D compiler is both incapable to correctly analyse that no code paths ends up with the variable potentially uninitialized — in the case of those tricky switch statements, and at the same time very picky about reading from unitialized variables, producing hard compile errors in those cases.

I'm not opening this issue to unequivocally say: it's something glslang should fix, but if you have any insights about how this was introduced and potential ways to workaround this problem. The responsibility of fixing this could be pushed to the next cross compiler in the chain, but it seems a rather complex task, so I'm pessimistic about the feasability. Fixing the D3D compiler itself doesn't seem easily doable either. Finally, would it be conceivable to make this behavior somehow togglable trough some sort of option ?

alan-baker commented 4 years ago

This code is generated by the merge return pass. HLSL code generation undergoes optimization so that it can be legalized (generally a direct translation of HLSL produces invalid SPIR-V). Merge return is an aid to function inlining that maintains structured control flow rules. Our experience has been that a transform of this sort is generally necessary, though previously it was done with a single iteration loop.

hugoam commented 4 years ago

I understand that, roughly, but it seems merge return was previously done with a dummy loop, and is now done with a dummy switch. The problem, as it turns out, is that when feeding back the result to a D3D compiler, the dummy switch causes unrecoverable compilation failure, as it prevents it to do correct uninitialized variable analysis :/

If at least there was a way to turn off this unusual pattern for some specific targets that cannot deal with this pattern, but this change, the way it was implemented, leaves no room for that I understand the merge return is a tricky beast, but this seems like that change: https://github.com/KhronosGroup/SPIRV-Tools/commit/e7afeb060e1637ca964c62517c3e6c225704b7eb introduced a "solve one bug, create more bugs" kind of situation

alan-baker commented 4 years ago

IIRC the motivation to switch from the loop to the switch was to aid loop unrolling (e.g. function needs inlined into a loop that is requested to be unrolled). I appreciate your situation though. Could you perhaps describe your flow in more detail? We'll discuss this internally.

hugoam commented 4 years ago

This is in the context of feeding shaders to WebGPU, of which the implementation, on a Windows platform, will translate back the SPIR-V to HLSL and feed it to D3D12 after compilation by DXC or FXC In the long-term, this might change, with the future development of the WGSL shading language, but until then this is a huge blocker for development targeting WebGPU Basically a third of my shaders don't work anymore in that pipeline, which will probably cause me to revert to a previous version of glslang + spirv-tools Edit: (hit "close and comment" by mistake, sorry)

alan-baker commented 4 years ago

Thanks for reaching out about this, we'll see what we can do.

dneto0 commented 4 years ago

How about forcing an initialization of the variable?

hugoam commented 4 years ago

If the merge return pass had an option to force that initialization, yes, I believe that could work !

s-perron commented 4 years ago

Thanks for bringing this to our attention. We have discussed it, and we believe this might be a bigger issue that just merge return. WebGPU requires all variable to have initizlization. However, the SSA names that appear in spir-v do not have initization. Without being able to try things out for myself, we cannot see how pervasive this problem might be.

Our suggested solution, would be make sure that all of the tools create inititializers for all variables that is creates. In this case, I believe we already do that in merge return, the problem is that the information is lost as we optimize the spir-v. We might need spirv-cross to make a change. We suspect that spirv-cross is creating the variable _801 to hold the result of an OpPhi instruction. Spirv-opt cannot create an initializer for _801 because it does not exist in the spir-v. It should be up to spirv-cross.

To confirm this, could you please provide us with the spir-v or look at it yourself to verify that this value is just as SSA value. If that is the case, we could look at spir-v cross. Under an option, it could initialize all variables that it creates to something.

hugoam commented 4 years ago

Here is the compiled shader: I can't check the produced HLSL right now, since the computer on which I do this is plugged off, nor decipher this SPIR-V to find if the variable is initialized around OpSwitch %uint_0 %941, but you might have more luck in this than me

; SPIR-V
; Version: 1.0
; Generator: Khronos Glslang Reference Front End; 8
; Bound: 1978
; Schema: 0
               OpCapability Shader
          %1 = OpExtInstImport "GLSL.std.450"
               OpMemoryModel Logical GLSL450
               OpEntryPoint Fragment %main "main" %v_texcoord0 %bgfx_FragData0
               OpExecutionMode %main OriginUpperLeft
               OpSource HLSL 500
               OpName %main "main"
               OpName %s_normalSampler "s_normalSampler"
               OpName %s_normalTexture "s_normalTexture"
               OpName %s_colorSampler "s_colorSampler"
               OpName %s_colorTexture "s_colorTexture"
               OpName %s_lightSampler "s_lightSampler"
               OpName %s_lightTexture "s_lightTexture"
               OpName %s_depthSampler "s_depthSampler"
               OpName %s_depthTexture "s_depthTexture"
               OpName %s_shadowMapSamplerComparison "s_shadowMapSamplerComparison"
               OpName %s_shadowMapTexture "s_shadowMapTexture"
               OpName %UniformBlock "UniformBlock"
               OpMemberName %UniformBlock 0 "u_lightDir"
               OpMemberName %UniformBlock 1 "u_invMvp"
               OpMemberName %UniformBlock 2 "u_lightMtx"
               OpMemberName %UniformBlock 3 "u_shadowDimsInv"
               OpMemberName %UniformBlock 4 "u_rsmAmount"
               OpName %_ ""
               OpName %v_texcoord0 "v_texcoord0"
               OpName %bgfx_FragData0 "bgfx_FragData0"
               OpDecorate %s_normalSampler DescriptorSet 2
               OpDecorate %s_normalSampler Binding 0
               OpDecorate %s_normalTexture DescriptorSet 1
               OpDecorate %s_normalTexture Binding 0
               OpDecorate %s_colorSampler DescriptorSet 2
               OpDecorate %s_colorSampler Binding 1
               OpDecorate %s_colorTexture DescriptorSet 1
               OpDecorate %s_colorTexture Binding 1
               OpDecorate %s_lightSampler DescriptorSet 2
               OpDecorate %s_lightSampler Binding 2
               OpDecorate %s_lightTexture DescriptorSet 1
               OpDecorate %s_lightTexture Binding 2
               OpDecorate %s_depthSampler DescriptorSet 2
               OpDecorate %s_depthSampler Binding 3
               OpDecorate %s_depthTexture DescriptorSet 1
               OpDecorate %s_depthTexture Binding 3
               OpDecorate %s_shadowMapSamplerComparison DescriptorSet 2
               OpDecorate %s_shadowMapSamplerComparison Binding 4
               OpDecorate %s_shadowMapTexture DescriptorSet 1
               OpDecorate %s_shadowMapTexture Binding 4
               OpMemberDecorate %UniformBlock 0 Offset 0
               OpMemberDecorate %UniformBlock 1 RowMajor
               OpMemberDecorate %UniformBlock 1 Offset 16
               OpMemberDecorate %UniformBlock 1 MatrixStride 16
               OpMemberDecorate %UniformBlock 2 RowMajor
               OpMemberDecorate %UniformBlock 2 Offset 80
               OpMemberDecorate %UniformBlock 2 MatrixStride 16
               OpMemberDecorate %UniformBlock 3 Offset 144
               OpMemberDecorate %UniformBlock 4 Offset 160
               OpDecorate %UniformBlock Block
               OpDecorate %_ DescriptorSet 0
               OpDecorate %_ Binding 1
               OpDecorate %v_texcoord0 Location 0
               OpDecorate %bgfx_FragData0 Location 0
       %void = OpTypeVoid
          %3 = OpTypeFunction %void
          %6 = OpTypeSampler
      %float = OpTypeFloat 32
          %8 = OpTypeImage %float 2D 0 0 0 1 Unknown
    %v2float = OpTypeVector %float 2
    %v4float = OpTypeVector %float 4
    %v3float = OpTypeVector %float 3
       %bool = OpTypeBool
     %v2bool = OpTypeVector %bool 2
%mat4v4float = OpTypeMatrix %v4float 4
%_ptr_UniformConstant_6 = OpTypePointer UniformConstant %6
%s_normalSampler = OpVariable %_ptr_UniformConstant_6 UniformConstant
%_ptr_UniformConstant_8 = OpTypePointer UniformConstant %8
%s_normalTexture = OpVariable %_ptr_UniformConstant_8 UniformConstant
%s_colorSampler = OpVariable %_ptr_UniformConstant_6 UniformConstant
%s_colorTexture = OpVariable %_ptr_UniformConstant_8 UniformConstant
%s_lightSampler = OpVariable %_ptr_UniformConstant_6 UniformConstant
%s_lightTexture = OpVariable %_ptr_UniformConstant_8 UniformConstant
%s_depthSampler = OpVariable %_ptr_UniformConstant_6 UniformConstant
%s_depthTexture = OpVariable %_ptr_UniformConstant_8 UniformConstant
%s_shadowMapSamplerComparison = OpVariable %_ptr_UniformConstant_6 UniformConstant
%s_shadowMapTexture = OpVariable %_ptr_UniformConstant_8 UniformConstant
        %int = OpTypeInt 32 1
      %int_1 = OpConstant %int 1
      %int_0 = OpConstant %int 0
        %127 = OpTypeSampledImage %8
        %137 = OpTypeImage %float 2D 1 0 0 1 Unknown
        %138 = OpTypeSampledImage %137
       %uint = OpTypeInt 32 0
    %float_0 = OpConstant %float 0
    %float_1 = OpConstant %float 1
 %float_n1_5 = OpConstant %float -1.5
        %233 = OpConstantComposite %v2float %float_n1_5 %float_n1_5
 %float_n0_5 = OpConstant %float -0.5
        %250 = OpConstantComposite %v2float %float_n1_5 %float_n0_5
  %float_0_5 = OpConstant %float 0.5
        %267 = OpConstantComposite %v2float %float_n1_5 %float_0_5
  %float_1_5 = OpConstant %float 1.5
        %284 = OpConstantComposite %v2float %float_n1_5 %float_1_5
        %300 = OpConstantComposite %v2float %float_n0_5 %float_n1_5
        %316 = OpConstantComposite %v2float %float_n0_5 %float_n0_5
        %332 = OpConstantComposite %v2float %float_n0_5 %float_0_5
        %348 = OpConstantComposite %v2float %float_n0_5 %float_1_5
        %364 = OpConstantComposite %v2float %float_0_5 %float_n1_5
        %380 = OpConstantComposite %v2float %float_0_5 %float_n0_5
        %396 = OpConstantComposite %v2float %float_0_5 %float_0_5
        %412 = OpConstantComposite %v2float %float_0_5 %float_1_5
        %428 = OpConstantComposite %v2float %float_1_5 %float_n1_5
        %444 = OpConstantComposite %v2float %float_1_5 %float_n0_5
        %460 = OpConstantComposite %v2float %float_1_5 %float_0_5
        %476 = OpConstantComposite %v2float %float_1_5 %float_1_5
    %float_2 = OpConstant %float 2
   %float_n1 = OpConstant %float -1
%UniformBlock = OpTypeStruct %v4float %mat4v4float %mat4v4float %v4float %v4float
%_ptr_Uniform_UniformBlock = OpTypePointer Uniform %UniformBlock
          %_ = OpVariable %_ptr_Uniform_UniformBlock Uniform
%_ptr_Uniform_v4float = OpTypePointer Uniform %v4float
     %uint_0 = OpConstant %uint 0
%_ptr_Uniform_mat4v4float = OpTypePointer Uniform %mat4v4float
%float_0_00300000003 = OpConstant %float 0.00300000003
      %int_2 = OpConstant %int 2
%float_0_00100000005 = OpConstant %float 0.00100000005
      %int_3 = OpConstant %int 3
%_ptr_Uniform_float = OpTypePointer Uniform %float
      %int_4 = OpConstant %int 4
%_ptr_Input_v2float = OpTypePointer Input %v2float
%v_texcoord0 = OpVariable %_ptr_Input_v2float Input
%_ptr_Output_v4float = OpTypePointer Output %v4float
%bgfx_FragData0 = OpVariable %_ptr_Output_v4float Output
       %1971 = OpConstantComposite %v3float %float_n1 %float_n1 %float_n1
       %1972 = OpConstantComposite %v2float %float_1 %float_1
       %1973 = OpConstantNull %v4float
       %1974 = OpConstantComposite %v2float %float_0 %float_0
%float_0_0625 = OpConstant %float 0.0625
       %1976 = OpUndef %v4float
       %main = OpFunction %void None %3
          %5 = OpLabel
         %88 = OpLoad %6 %s_normalSampler
         %91 = OpLoad %8 %s_normalTexture
         %95 = OpLoad %6 %s_colorSampler
         %97 = OpLoad %8 %s_colorTexture
        %101 = OpLoad %6 %s_lightSampler
        %103 = OpLoad %8 %s_lightTexture
        %107 = OpLoad %6 %s_depthSampler
        %109 = OpLoad %8 %s_depthTexture
        %114 = OpLoad %6 %s_shadowMapSamplerComparison
        %116 = OpLoad %8 %s_shadowMapTexture
        %667 = OpLoad %v2float %v_texcoord0
        %845 = OpSampledImage %127 %91 %88
        %847 = OpImageSampleImplicitLod %v4float %845 %667
        %742 = OpVectorShuffle %v3float %847 %847 0 1 2
        %744 = OpVectorTimesScalar %v3float %742 %float_2
        %746 = OpFAdd %v3float %744 %1971
        %747 = OpAccessChain %_ptr_Uniform_v4float %_ %int_0
        %748 = OpLoad %v4float %747
        %749 = OpVectorShuffle %v3float %748 %748 0 1 2
        %752 = OpDot %float %746 %749
        %753 = OpExtInst %float %1 FMax %float_0 %752
        %853 = OpSampledImage %127 %109 %107
        %855 = OpImageSampleImplicitLod %v4float %853 %667
        %760 = OpCompositeExtract %float %855 0
        %858 = OpFMul %float %760 %float_2
        %859 = OpFSub %float %858 %float_1
        %764 = OpVectorTimesScalar %v2float %667 %float_2
        %766 = OpFSub %v2float %764 %1972
        %768 = OpCompositeExtract %float %766 0
        %769 = OpCompositeExtract %float %766 1
        %771 = OpAccessChain %_ptr_Uniform_mat4v4float %_ %int_1
        %772 = OpLoad %mat4v4float %771
        %866 = OpCompositeConstruct %v4float %768 %769 %859 %float_1
        %868 = OpVectorTimesMatrix %v4float %866 %772
        %870 = OpVectorShuffle %v3float %868 %868 0 1 2
        %872 = OpCompositeExtract %float %868 3
        %873 = OpCompositeConstruct %v3float %872 %872 %872
        %874 = OpFDiv %v3float %870 %873
        %777 = OpVectorTimesScalar %v3float %746 %float_0_00300000003
        %778 = OpFAdd %v3float %874 %777
        %780 = OpCompositeExtract %float %778 0
        %781 = OpCompositeExtract %float %778 1
        %782 = OpCompositeExtract %float %778 2
        %783 = OpCompositeConstruct %v4float %780 %781 %782 %float_1
        %784 = OpAccessChain %_ptr_Uniform_mat4v4float %_ %int_2
        %785 = OpLoad %mat4v4float %784
        %786 = OpVectorTimesMatrix %v4float %783 %785
        %787 = OpAccessChain %_ptr_Uniform_float %_ %int_3 %uint_0
        %788 = OpLoad %float %787
        %878 = OpCompositeConstruct %v2float %788 %788
        %791 = OpCompositeExtract %float %786 3
        %793 = OpVectorShuffle %v2float %786 %786 0 1
        %794 = OpCompositeConstruct %v2float %791 %791
        %795 = OpFDiv %v2float %793 %794
        %799 = OpVectorShuffle %v2float %795 %1973 0 1
        %800 = OpVectorTimesScalar %v2float %799 %float_0_5
        %802 = OpFAdd %v2float %800 %396
        %804 = OpVectorShuffle %v4float %786 %802 4 5 2 3
               OpSelectionMerge %940 None
               OpSwitch %uint_0 %941
        %941 = OpLabel
        %943 = OpVectorShuffle %v2float %802 %1973 0 1
       %1163 = OpFOrdGreaterThan %v2bool %943 %1972
        %947 = OpAny %bool %1163
       %1171 = OpFOrdLessThan %v2bool %943 %1974
        %951 = OpAny %bool %1171
        %952 = OpLogicalOr %bool %947 %951
               OpSelectionMerge %954 None
               OpBranchConditional %952 %955 %954
        %955 = OpLabel
               OpBranch %940
        %954 = OpLabel
        %958 = OpCompositeExtract %float %786 3
        %959 = OpVectorTimesScalar %v2float %878 %958
        %962 = OpFMul %v2float %233 %959
        %963 = OpCompositeExtract %float %962 0
        %964 = OpCompositeExtract %float %962 1
        %965 = OpCompositeConstruct %v4float %963 %964 %float_0 %float_0
        %966 = OpFAdd %v4float %804 %965
       %1180 = OpCompositeExtract %float %966 2
       %1182 = OpFSub %float %1180 %float_0_00100000005
       %1183 = OpCompositeExtract %float %966 0
       %1184 = OpCompositeExtract %float %966 1
       %1193 = OpSampledImage %138 %116 %114
       %1200 = OpCompositeConstruct %v3float %1183 %1184 %1182
       %1202 = OpImageSampleDrefExplicitLod %float %1193 %1200 %1182 Lod %float_0
        %974 = OpFMul %v2float %250 %959
        %975 = OpCompositeExtract %float %974 0
        %976 = OpCompositeExtract %float %974 1
        %977 = OpCompositeConstruct %v4float %975 %976 %float_0 %float_0
        %978 = OpFAdd %v4float %804 %977
       %1211 = OpCompositeExtract %float %978 2
       %1213 = OpFSub %float %1211 %float_0_00100000005
       %1214 = OpCompositeExtract %float %978 0
       %1215 = OpCompositeExtract %float %978 1
       %1224 = OpSampledImage %138 %116 %114
       %1231 = OpCompositeConstruct %v3float %1214 %1215 %1213
       %1233 = OpImageSampleDrefExplicitLod %float %1224 %1231 %1213 Lod %float_0
        %983 = OpFAdd %float %1202 %1233
        %986 = OpFMul %v2float %267 %959
        %987 = OpCompositeExtract %float %986 0
        %988 = OpCompositeExtract %float %986 1
        %989 = OpCompositeConstruct %v4float %987 %988 %float_0 %float_0
        %990 = OpFAdd %v4float %804 %989
       %1242 = OpCompositeExtract %float %990 2
       %1244 = OpFSub %float %1242 %float_0_00100000005
       %1245 = OpCompositeExtract %float %990 0
       %1246 = OpCompositeExtract %float %990 1
       %1255 = OpSampledImage %138 %116 %114
       %1262 = OpCompositeConstruct %v3float %1245 %1246 %1244
       %1264 = OpImageSampleDrefExplicitLod %float %1255 %1262 %1244 Lod %float_0
        %995 = OpFAdd %float %983 %1264
        %998 = OpFMul %v2float %284 %959
        %999 = OpCompositeExtract %float %998 0
       %1000 = OpCompositeExtract %float %998 1
       %1001 = OpCompositeConstruct %v4float %999 %1000 %float_0 %float_0
       %1002 = OpFAdd %v4float %804 %1001
       %1273 = OpCompositeExtract %float %1002 2
       %1275 = OpFSub %float %1273 %float_0_00100000005
       %1276 = OpCompositeExtract %float %1002 0
       %1277 = OpCompositeExtract %float %1002 1
       %1286 = OpSampledImage %138 %116 %114
       %1293 = OpCompositeConstruct %v3float %1276 %1277 %1275
       %1295 = OpImageSampleDrefExplicitLod %float %1286 %1293 %1275 Lod %float_0
       %1007 = OpFAdd %float %995 %1295
       %1010 = OpFMul %v2float %300 %959
       %1011 = OpCompositeExtract %float %1010 0
       %1012 = OpCompositeExtract %float %1010 1
       %1013 = OpCompositeConstruct %v4float %1011 %1012 %float_0 %float_0
       %1014 = OpFAdd %v4float %804 %1013
       %1304 = OpCompositeExtract %float %1014 2
       %1306 = OpFSub %float %1304 %float_0_00100000005
       %1307 = OpCompositeExtract %float %1014 0
       %1308 = OpCompositeExtract %float %1014 1
       %1317 = OpSampledImage %138 %116 %114
       %1324 = OpCompositeConstruct %v3float %1307 %1308 %1306
       %1326 = OpImageSampleDrefExplicitLod %float %1317 %1324 %1306 Lod %float_0
       %1019 = OpFAdd %float %1007 %1326
       %1022 = OpFMul %v2float %316 %959
       %1023 = OpCompositeExtract %float %1022 0
       %1024 = OpCompositeExtract %float %1022 1
       %1025 = OpCompositeConstruct %v4float %1023 %1024 %float_0 %float_0
       %1026 = OpFAdd %v4float %804 %1025
       %1335 = OpCompositeExtract %float %1026 2
       %1337 = OpFSub %float %1335 %float_0_00100000005
       %1338 = OpCompositeExtract %float %1026 0
       %1339 = OpCompositeExtract %float %1026 1
       %1348 = OpSampledImage %138 %116 %114
       %1355 = OpCompositeConstruct %v3float %1338 %1339 %1337
       %1357 = OpImageSampleDrefExplicitLod %float %1348 %1355 %1337 Lod %float_0
       %1031 = OpFAdd %float %1019 %1357
       %1034 = OpFMul %v2float %332 %959
       %1035 = OpCompositeExtract %float %1034 0
       %1036 = OpCompositeExtract %float %1034 1
       %1037 = OpCompositeConstruct %v4float %1035 %1036 %float_0 %float_0
       %1038 = OpFAdd %v4float %804 %1037
       %1366 = OpCompositeExtract %float %1038 2
       %1368 = OpFSub %float %1366 %float_0_00100000005
       %1369 = OpCompositeExtract %float %1038 0
       %1370 = OpCompositeExtract %float %1038 1
       %1379 = OpSampledImage %138 %116 %114
       %1386 = OpCompositeConstruct %v3float %1369 %1370 %1368
       %1388 = OpImageSampleDrefExplicitLod %float %1379 %1386 %1368 Lod %float_0
       %1043 = OpFAdd %float %1031 %1388
       %1046 = OpFMul %v2float %348 %959
       %1047 = OpCompositeExtract %float %1046 0
       %1048 = OpCompositeExtract %float %1046 1
       %1049 = OpCompositeConstruct %v4float %1047 %1048 %float_0 %float_0
       %1050 = OpFAdd %v4float %804 %1049
       %1397 = OpCompositeExtract %float %1050 2
       %1399 = OpFSub %float %1397 %float_0_00100000005
       %1400 = OpCompositeExtract %float %1050 0
       %1401 = OpCompositeExtract %float %1050 1
       %1410 = OpSampledImage %138 %116 %114
       %1417 = OpCompositeConstruct %v3float %1400 %1401 %1399
       %1419 = OpImageSampleDrefExplicitLod %float %1410 %1417 %1399 Lod %float_0
       %1055 = OpFAdd %float %1043 %1419
       %1058 = OpFMul %v2float %364 %959
       %1059 = OpCompositeExtract %float %1058 0
       %1060 = OpCompositeExtract %float %1058 1
       %1061 = OpCompositeConstruct %v4float %1059 %1060 %float_0 %float_0
       %1062 = OpFAdd %v4float %804 %1061
       %1428 = OpCompositeExtract %float %1062 2
       %1430 = OpFSub %float %1428 %float_0_00100000005
       %1431 = OpCompositeExtract %float %1062 0
       %1432 = OpCompositeExtract %float %1062 1
       %1441 = OpSampledImage %138 %116 %114
       %1448 = OpCompositeConstruct %v3float %1431 %1432 %1430
       %1450 = OpImageSampleDrefExplicitLod %float %1441 %1448 %1430 Lod %float_0
       %1067 = OpFAdd %float %1055 %1450
       %1070 = OpFMul %v2float %380 %959
       %1071 = OpCompositeExtract %float %1070 0
       %1072 = OpCompositeExtract %float %1070 1
       %1073 = OpCompositeConstruct %v4float %1071 %1072 %float_0 %float_0
       %1074 = OpFAdd %v4float %804 %1073
       %1459 = OpCompositeExtract %float %1074 2
       %1461 = OpFSub %float %1459 %float_0_00100000005
       %1462 = OpCompositeExtract %float %1074 0
       %1463 = OpCompositeExtract %float %1074 1
       %1472 = OpSampledImage %138 %116 %114
       %1479 = OpCompositeConstruct %v3float %1462 %1463 %1461
       %1481 = OpImageSampleDrefExplicitLod %float %1472 %1479 %1461 Lod %float_0
       %1079 = OpFAdd %float %1067 %1481
       %1082 = OpFMul %v2float %396 %959
       %1083 = OpCompositeExtract %float %1082 0
       %1084 = OpCompositeExtract %float %1082 1
       %1085 = OpCompositeConstruct %v4float %1083 %1084 %float_0 %float_0
       %1086 = OpFAdd %v4float %804 %1085
       %1490 = OpCompositeExtract %float %1086 2
       %1492 = OpFSub %float %1490 %float_0_00100000005
       %1493 = OpCompositeExtract %float %1086 0
       %1494 = OpCompositeExtract %float %1086 1
       %1503 = OpSampledImage %138 %116 %114
       %1510 = OpCompositeConstruct %v3float %1493 %1494 %1492
       %1512 = OpImageSampleDrefExplicitLod %float %1503 %1510 %1492 Lod %float_0
       %1091 = OpFAdd %float %1079 %1512
       %1094 = OpFMul %v2float %412 %959
       %1095 = OpCompositeExtract %float %1094 0
       %1096 = OpCompositeExtract %float %1094 1
       %1097 = OpCompositeConstruct %v4float %1095 %1096 %float_0 %float_0
       %1098 = OpFAdd %v4float %804 %1097
       %1521 = OpCompositeExtract %float %1098 2
       %1523 = OpFSub %float %1521 %float_0_00100000005
       %1524 = OpCompositeExtract %float %1098 0
       %1525 = OpCompositeExtract %float %1098 1
       %1534 = OpSampledImage %138 %116 %114
       %1541 = OpCompositeConstruct %v3float %1524 %1525 %1523
       %1543 = OpImageSampleDrefExplicitLod %float %1534 %1541 %1523 Lod %float_0
       %1103 = OpFAdd %float %1091 %1543
       %1106 = OpFMul %v2float %428 %959
       %1107 = OpCompositeExtract %float %1106 0
       %1108 = OpCompositeExtract %float %1106 1
       %1109 = OpCompositeConstruct %v4float %1107 %1108 %float_0 %float_0
       %1110 = OpFAdd %v4float %804 %1109
       %1552 = OpCompositeExtract %float %1110 2
       %1554 = OpFSub %float %1552 %float_0_00100000005
       %1555 = OpCompositeExtract %float %1110 0
       %1556 = OpCompositeExtract %float %1110 1
       %1565 = OpSampledImage %138 %116 %114
       %1572 = OpCompositeConstruct %v3float %1555 %1556 %1554
       %1574 = OpImageSampleDrefExplicitLod %float %1565 %1572 %1554 Lod %float_0
       %1115 = OpFAdd %float %1103 %1574
       %1118 = OpFMul %v2float %444 %959
       %1119 = OpCompositeExtract %float %1118 0
       %1120 = OpCompositeExtract %float %1118 1
       %1121 = OpCompositeConstruct %v4float %1119 %1120 %float_0 %float_0
       %1122 = OpFAdd %v4float %804 %1121
       %1583 = OpCompositeExtract %float %1122 2
       %1585 = OpFSub %float %1583 %float_0_00100000005
       %1586 = OpCompositeExtract %float %1122 0
       %1587 = OpCompositeExtract %float %1122 1
       %1596 = OpSampledImage %138 %116 %114
       %1603 = OpCompositeConstruct %v3float %1586 %1587 %1585
       %1605 = OpImageSampleDrefExplicitLod %float %1596 %1603 %1585 Lod %float_0
       %1127 = OpFAdd %float %1115 %1605
       %1130 = OpFMul %v2float %460 %959
       %1131 = OpCompositeExtract %float %1130 0
       %1132 = OpCompositeExtract %float %1130 1
       %1133 = OpCompositeConstruct %v4float %1131 %1132 %float_0 %float_0
       %1134 = OpFAdd %v4float %804 %1133
       %1614 = OpCompositeExtract %float %1134 2
       %1616 = OpFSub %float %1614 %float_0_00100000005
       %1617 = OpCompositeExtract %float %1134 0
       %1618 = OpCompositeExtract %float %1134 1
       %1627 = OpSampledImage %138 %116 %114
       %1634 = OpCompositeConstruct %v3float %1617 %1618 %1616
       %1636 = OpImageSampleDrefExplicitLod %float %1627 %1634 %1616 Lod %float_0
       %1139 = OpFAdd %float %1127 %1636
       %1142 = OpFMul %v2float %476 %959
       %1143 = OpCompositeExtract %float %1142 0
       %1144 = OpCompositeExtract %float %1142 1
       %1145 = OpCompositeConstruct %v4float %1143 %1144 %float_0 %float_0
       %1146 = OpFAdd %v4float %804 %1145
       %1645 = OpCompositeExtract %float %1146 2
       %1647 = OpFSub %float %1645 %float_0_00100000005
       %1648 = OpCompositeExtract %float %1146 0
       %1649 = OpCompositeExtract %float %1146 1
       %1658 = OpSampledImage %138 %116 %114
       %1665 = OpCompositeConstruct %v3float %1648 %1649 %1647
       %1667 = OpImageSampleDrefExplicitLod %float %1658 %1665 %1647 Lod %float_0
       %1151 = OpFAdd %float %1139 %1667
       %1153 = OpFMul %float %1151 %float_0_0625
               OpBranch %940
        %940 = OpLabel
       %1977 = OpPhi %float %float_1 %955 %1153 %954
        %812 = OpFMul %float %753 %1977
       %1673 = OpSampledImage %127 %97 %95
       %1675 = OpImageSampleImplicitLod %v4float %1673 %667
        %816 = OpVectorShuffle %v3float %1675 %1675 0 1 2
       %1681 = OpSampledImage %127 %103 %101
       %1683 = OpImageSampleImplicitLod %v4float %1681 %667
        %820 = OpVectorShuffle %v3float %1683 %1683 0 1 2
        %823 = OpVectorTimesScalar %v3float %816 %812
        %826 = OpFMul %v3float %820 %816
        %827 = OpAccessChain %_ptr_Uniform_float %_ %int_4 %uint_0
        %828 = OpLoad %float %827
        %829 = OpCompositeConstruct %v3float %828 %828 %828
       %1688 = OpExtInst %v3float %1 FMix %823 %826 %829
        %832 = OpVectorShuffle %v4float %1976 %1688 4 5 6 3
       %1969 = OpCompositeInsert %v4float %float_1 %832 3
               OpStore %bgfx_FragData0 %1969
               OpReturn
               OpFunctionEnd

greg-lunarg commented 4 years ago

Can we fix D3D12? Is this the open source compiler?

On Mar 23, 2020, at 8:03 AM, Steven Perron notifications@github.com wrote:

Thanks for bringing this to our attention. We have discussed it, and we believe this might be a bigger issue that just merge return. WebGPU requires all variable to have initizlization. However, the SSA names that appear in spir-v do not have initization. Without being able to try things out for myself, we cannot see how pervasive this problem might be.

Our suggested solution, would be make sure that all of the tools create inititializers for all variables that is creates. In this case, I believe we already do that in merge return, the problem is that the information is lost as we optimize the spir-v. We might need spirv-cross to make a change. We suspect that spirv-cross is creating the variable _801 to hold the result of an OpPhi instruction. Spirv-opt cannot create an initializer for _801 because it does not exist in the spir-v. It should be up to spirv-cross.

To confirm this, could you please provide us with the spir-v or look at it yourself to verify that this value is just as SSA value. If that is the case, we could look at spir-v cross. Under an option, it could initialize all variables that it creates to something.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe.

alan-baker commented 4 years ago

I expect the issue is:

%1977 = OpPhi %float %float_1 %955 %1153 %954

There are no Function scope variables in the module, so this indeed points to desiring a change in spirv-cross.

s-perron commented 4 years ago

I ran it through spirv-cross and I can confirm what Alan said. The uninitialized variable is caused by spirv-cross creating a new variable for %1977. We should try to get spirv-cross to generate an initializer for that variable.

s-perron commented 4 years ago

I've opened https://github.com/KhronosGroup/SPIRV-Cross/issues/1298 to ask the spirv-cross community what they can do.

greg-lunarg commented 4 years ago

Is this a true uninitialized variable from spirv-cross or D3D12 not understanding the switch?

s-perron commented 4 years ago

It could be that D3D does not understand the switch. However, I also wonder what the WebGPU spec is for the hlsl compiler. WebGPU for spir-v says that all variables must have an initializer. So the spir-v equivalent of

int a;
if (cond) a = 1;
else a = 0;

would be invalid, even if it is not used before it is given a value. This is for safety. The compiler is able to enforce the rule that there are no uninitialized variables without being forced to do an expensive analysis. Because of the philosophy of WebGPU, I believe that it is fair to ask spirv-cross to honour this with new variables that it creates.

hugoam commented 4 years ago

The issue itself arises because D3D does understand this:

int a;
if (cond) a = 1;
else a = 0;

But not this:

int a;
switch (0u)
{
    default:
    {
        if (cond)
        {
            a = 1;
            break;
        }
        a = 0;
        break;
    }
}

However, since those variables do not exist in the original shader, but are created through the ~optimization process, it would be useful if there was a way to ask the optimization tools to not generate uninitialized variables~. (Edit: through the subsequent cross-compiling step of that OpPhi, actually, sorry)

Also, note that for now, this issue arises independently of what the WebGPU spec itself enforces. If anything, even if this specific aspect is already subject to validation at this point (I do not know whether it is the case), WebGPU validation might very well correctly analyse that the variable is never left uninitialized. D3D is the faulty link here, but we're looking for a way to accomodate it, since this setup used to work in the beginning of january, before the merge return refactor.

Edit: Sorry, I fumbled up right there, if WebGPU mandates that all variables must have initializers, this is clearly not enforced yet by the validation step of the implementation I'm using here.

greg-lunarg commented 4 years ago

I concur with asking spirv-cross for a mode to initialize its local variables. Thanks!

hugoam commented 4 years ago

Yep, sorry, I only now documented myself about OpPhi, and this sounds good. I guess that also means that there is not an issue intrinsically in SPIRV-Tools regarding this either :)

Thank you all for the very thorough help about this issue !

s-perron commented 4 years ago

Update to the latest spirv-cross, and use the options --force-zero-initialized-variables.

hugoam commented 4 years ago

I would, unfortunately, need to reopen this issue. It turns out, the D3D compiler really cannot deal properly with this switch idiom. The uninitialized variables analysis, as it turns out, was just the tip of the iceberg. Here is an example of a shader compiled with SPIRV-cross, which then completely fails to compile properly with FXC: http://shader-playground.timjones.io/47c92f12206b67fd44e46b0e3fa228e4 Once more, this is an issue of FXC, intrinsically, but I'm again not quite sure how this can realistically get solved there in any reasonable time frame ? The breakage in the compiler pipeline is still introduced by this SPIRV-tools change, sadly.

KhronosGroup / SPIRV-Tools

Problematic `switch(0u): default` branch idiom in merge return pass #3244