GLSL->SPIR-V: Changing the order of input variables generates invalid entry point with empty output variable name and no location.

hikiko commented 6 years ago

How to reproduce:

First use this couple of pixel shaders (error-free SPIR-V case):

[fragment shader]
#version 450 
layout(location = 0) out vec4 out_color;
const vec4 color = vec4(0.2, 0.8, 0.0, 1.0); 
vec4 ambient(float a, vec4 color);
void main()
{
        out_color = ambient(0.1, color);
}

[fragment shader]
#version 450
vec4 ambient(float a, vec4 color)
{
        return vec4(color.r + a, color.g + a, color.b + a, color.a);
}

Your program should have no issues. Now, change the order of the lines:

layout(location = 0) out vec4 out_color;
const vec4 color = vec4(0.2, 0.8, 0.0, 1.0);

to become:

const vec4 color = vec4(0.2, 0.8, 0.0, 1.0); 
layout(location = 0) out vec4 out_color;

in the first pixel shader and the program will crash.

The reason, is the difference in the generated SPIR-V code: SPIR-V in the 1st case

[fragment shader spirv]
; Automatically generated from the GLSL by shader_test_spirv.py. DO NOT EDIT
; SPIR-V
; Version: 1.0
; Generator: Khronos Glslang Reference Front End; 5
; Bound: 48
; Schema: 0
               OpCapability Shader
          %1 = OpExtInstImport "GLSL.std.450"
               OpMemoryModel Logical GLSL450
               OpEntryPoint Fragment %main "main" %out_color
               OpExecutionMode %main OriginLowerLeft
               OpSource GLSL 450
               OpName %main "main"
               OpName %ambient_f1_vf4_ "ambient(f1;vf4;"
               OpName %a "a"
               OpName %color "color"
               OpName %out_color "out_color"
               OpName %param "param"
               OpName %param_0 "param"
               OpDecorate %out_color Location 0
       %void = OpTypeVoid
          %3 = OpTypeFunction %void
      %float = OpTypeFloat 32
%_ptr_Function_float = OpTypePointer Function %float
    %v4float = OpTypeVector %float 4
%_ptr_Function_v4float = OpTypePointer Function %v4float
         %10 = OpTypeFunction %v4float %_ptr_Function_float %_ptr_Function_v4float
%_ptr_Output_v4float = OpTypePointer Output %v4float
  %out_color = OpVariable %_ptr_Output_v4float Output
  %float_0_1 = OpConstant %float 0.1
  %float_0_2 = OpConstant %float 0.2
  %float_0_8 = OpConstant %float 0.8
    %float_0 = OpConstant %float 0
    %float_1 = OpConstant %float 1
         %22 = OpConstantComposite %v4float %float_0_2 %float_0_8 %float_0 %float_1
       %uint = OpTypeInt 32 0
     %uint_0 = OpConstant %uint 0
     %uint_1 = OpConstant %uint 1
     %uint_2 = OpConstant %uint 2
     %uint_3 = OpConstant %uint 3
       %main = OpFunction %void None %3
          %5 = OpLabel
      %param = OpVariable %_ptr_Function_float Function
    %param_0 = OpVariable %_ptr_Function_v4float Function
               OpStore %param %float_0_1
               OpStore %param_0 %22
         %25 = OpFunctionCall %v4float %ambient_f1_vf4_ %param %param_0
               OpStore %out_color %25
               OpReturn
               OpFunctionEnd

%ambient_f1_vf4_ = OpFunction %v4float None %10
          %a = OpFunctionParameter %_ptr_Function_float
      %color = OpFunctionParameter %_ptr_Function_v4float
         %14 = OpLabel
         %28 = OpAccessChain %_ptr_Function_float %color %uint_0
         %29 = OpLoad %float %28
         %30 = OpLoad %float %a
         %31 = OpFAdd %float %29 %30
         %33 = OpAccessChain %_ptr_Function_float %color %uint_1
         %34 = OpLoad %float %33
         %35 = OpLoad %float %a
         %36 = OpFAdd %float %34 %35
         %38 = OpAccessChain %_ptr_Function_float %color %uint_2
         %39 = OpLoad %float %38
         %40 = OpLoad %float %a
         %41 = OpFAdd %float %39 %40
         %43 = OpAccessChain %_ptr_Function_float %color %uint_3
         %44 = OpLoad %float %43
         %45 = OpCompositeConstruct %v4float %31 %36 %41 %44
               OpReturnValue %45
               OpFunctionEnd

SPIR-V in the 2nd case

; Automatically generated from the GLSL by shader_test_spirv.py. DO NOT EDIT
; SPIR-V
; Version: 1.0
; Generator: Khronos Glslang Reference Front End; 5
; Bound: 46
; Schema: 0
               OpCapability Shader
          %1 = OpExtInstImport "GLSL.std.450"
               OpMemoryModel Logical GLSL450
               OpEntryPoint Fragment %main "main"
               OpExecutionMode %main OriginLowerLeft
               OpSource GLSL 450
               OpName %main "main"
               OpName %ambient_f1_vf4_ "ambient(f1;vf4;"
               OpName %a "a"
               OpName %color "color"
               OpName %param "param"
               OpName %param_0 "param"
       %void = OpTypeVoid
          %3 = OpTypeFunction %void
      %float = OpTypeFloat 32
%_ptr_Function_float = OpTypePointer Function %float
    %v4float = OpTypeVector %float 4
%_ptr_Function_v4float = OpTypePointer Function %v4float
         %10 = OpTypeFunction %v4float %_ptr_Function_float %_ptr_Function_v4float
  %float_0_1 = OpConstant %float 0.1
  %float_0_2 = OpConstant %float 0.2
  %float_0_8 = OpConstant %float 0.8
    %float_0 = OpConstant %float 0
    %float_1 = OpConstant %float 1
         %20 = OpConstantComposite %v4float %float_0_2 %float_0_8 %float_0 %float_1
       %uint = OpTypeInt 32 0
     %uint_0 = OpConstant %uint 0
     %uint_1 = OpConstant %uint 1
     %uint_2 = OpConstant %uint 2
     %uint_3 = OpConstant %uint 3
       %main = OpFunction %void None %3
          %5 = OpLabel
      %param = OpVariable %_ptr_Function_float Function
    %param_0 = OpVariable %_ptr_Function_v4float Function
               OpStore %param %float_0_1
               OpStore %param_0 %20
         %23 = OpFunctionCall %v4float %ambient_f1_vf4_ %param %param_0
               OpStore %a %23
               OpReturn
               OpFunctionEnd

%ambient_f1_vf4_ = OpFunction %v4float None %10
          %a = OpFunctionParameter %_ptr_Function_float
      %color = OpFunctionParameter %_ptr_Function_v4float
         %14 = OpLabel
         %26 = OpAccessChain %_ptr_Function_float %color %uint_0
         %27 = OpLoad %float %26
         %28 = OpLoad %float %a
         %29 = OpFAdd %float %27 %28
         %31 = OpAccessChain %_ptr_Function_float %color %uint_1
         %32 = OpLoad %float %31
         %33 = OpLoad %float %a
         %34 = OpFAdd %float %32 %33
         %36 = OpAccessChain %_ptr_Function_float %color %uint_2
         %37 = OpLoad %float %36
         %38 = OpLoad %float %a
         %39 = OpFAdd %float %37 %38
         %41 = OpAccessChain %_ptr_Function_float %color %uint_3
         %42 = OpLoad %float %41
         %43 = OpCompositeConstruct %v4float %29 %34 %39 %42
               OpReturnValue %43
               OpFunctionEnd

Differences I spotted:

First case	Second case
'''OpEntryPoint Fragment %main "main" %out_color'''	'''OpEntryPoint Fragment %main "main"'''
OpName %out_color "out_color"	missing
OpDecorate %out_color Location 0	missing
The output
%25 = OpFunctionCall %v4float %ambient_f1_vf4_ %param %param_0 OpStore %out_color %25 OpReturn OpFunctionEnd	%23 = OpFunctionCall %v4float %ambient_f1_vf4_ %param %param_0 OpStore %a %23 OpReturn OpFunctionEnd

The program with the 2nd shader crashes on mesa and nvidia at linking, the mesa error is:

SPIR-V parsing FAILED: In file spirv/vtn_variables.c:1971 Source and destination types of SpvOpStore do not match: float vs. vec4 1344 bytes into the SPIR-V binary shader_runner: main/glspirv.c:228: _mesa_spirv_to_nir: Assertion `entry_point' failed.

I've noticed similar errors on mesa when there's some sort of input in the first shader, for example when a uniform is bound, or when we use varyings, so maybe these cases generate problematic entry points as well.

To try this code and see the crash quickly on linux try this example: https://github.com/hikiko/glquad-spirv

First run make and ./glquad You must see a quad. Then open the file: data/test.f.glsl and change the order of the 2 lines mentioned above. Run make clean to clean the previous spirv. Then run make and ./glquad again and you will get the crash/abort/segfault.

bpeel commented 6 years ago

I did some digging into this and I think it might be caused by reusing symbol IDs across compilation units. It looks like multiple TSymbolTables are created (perhaps one per compilation unit) each with their own counter for the uniqueId that gets used to create symbols. These IDs are then mapped to ResultIDs via TGlslangToSpvTraverser::symbolValues. If the same ID is used twice then it will end up generating code that refers to the wrong ResultID.

I made a hacky proof-of-concept patch here to change TSymbolTable::uniqueId to be a static variable and it fixes hikiko’s example.

I am running into the same issue with this Vulkan example converted from a Piglit test here. The above patch fixes both tests.

johnkslang commented 6 years ago

Reversing the two lines of declarations should not make or break the SPIR-V. I'll look into it and see why such an odd thing would be happening.

However, multiple compilation units for the same stage to generate SPIR-V is certainly under-tested/supported. I will also check the level of support for doing this. I know not all the semantics are properly error checked, but it does have the basics in place to link together the two compilation units.

bpeel commented 6 years ago

Presumably changing the order of the two variables makes them have different symbol IDs so they will collide with different things when the symbol IDs are reused in the other compilation unit. It’s plausible this would either break or fix the generated code.

johnkslang commented 6 years ago

Yes, @bpeel I see your point. It does seem entirely possible that unique ids are being reused across compilation units, as that would not affect the "validator" aspect of glslang, only code gen.

infapi00 commented 6 years ago

Rechecking this fix, the failing case included at the issue description works fine now with master. Doing git bisect, it got solved with the following commit:

        commit 41436ad2042af1ade8a415dd4f23bb3aefd26aa0
        Author: John Kessenich <cepheus@frii.com>
        Date:   Fri Jul 13 10:40:40 2018 -0600

            Link/SPV: Correct symbol IDs on merging ASTs to a single coherent space

Although I didn't check for more complex cases similar to the ones reported, the truth is that the specific case showed is fixed. So I think that it would be safe/the better to just close the issue.

hikiko commented 6 years ago

Thank you, I just closed the issue.

KhronosGroup / glslang

GLSL->SPIR-V: Changing the order of input variables generates invalid entry point with empty output variable name and no location. #1285