Closed hikiko closed 6 years ago
I did some digging into this and I think it might be caused by reusing symbol IDs across compilation units. It looks like multiple TSymbolTables are created (perhaps one per compilation unit), each with their own counter for the uniqueId that gets used to create symbols. These IDs are then mapped to ResultIDs via TGlslangToSpvTraverser::symbolValues. If the same ID is used twice then it will end up generating code that refers to the wrong ResultID.
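The suspected collision can be illustrated with a minimal sketch. This is not glslang code: `UnitSymbolTable`, `ToyTraverser`, and `collides` are hypothetical stand-ins, and the only assumption carried over from the comment above is that each table has its own ID counter and that code generation keys result IDs by symbol ID.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for a per-compilation-unit symbol table:
// every instance hands out IDs from its own counter, starting at 0.
struct UnitSymbolTable {
    int nextId = 0;
    int newSymbol() { return nextId++; }
};

// Hypothetical stand-in for the symbol-ID -> result-ID map used during code generation.
struct ToyTraverser {
    std::map<int, int> symbolValues;
    int nextResultId = 100;

    // Returns the result ID for a symbol ID, allocating one the first time it is seen.
    int resultIdFor(int symbolId) {
        auto it = symbolValues.find(symbolId);
        if (it != symbolValues.end())
            return it->second;
        int resultId = nextResultId++;
        symbolValues[symbolId] = resultId;
        return resultId;
    }
};

// Returns true when two different symbols from different units share one result ID.
bool collides() {
    UnitSymbolTable unitA, unitB;
    int outColor = unitA.newSymbol(); // gets ID 0 in unit A
    int a = unitB.newSymbol();        // also gets ID 0, in unit B
    ToyTraverser traverser;
    return traverser.resultIdFor(outColor) == traverser.resultIdFor(a);
}
```

In this toy model `collides()` returns true: the second lookup finds the entry created for the first symbol, which is the kind of mix-up that would make generated code refer to the wrong ResultID.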
I made a hacky proof-of-concept patch here to change TSymbolTable::uniqueId to be a static variable and it fixes hikiko’s example.
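The idea behind such a patch can be sketched as follows. This is a simplified illustration, not the actual patch: `SharedCounterTable` and `idsAreDistinct` are hypothetical names, and the real TSymbolTable is considerably more involved.

```cpp
#include <cassert>

// Hypothetical stand-in for a symbol table whose ID counter is shared by all instances.
class SharedCounterTable {
public:
    // Hands out the next ID from the single process-wide counter.
    int newSymbol() { return ++uniqueId; }

private:
    static int uniqueId; // one counter for every table, instead of one per table
};

int SharedCounterTable::uniqueId = 0;

// Returns true when symbols created in two different tables get different IDs.
bool idsAreDistinct() {
    SharedCounterTable unitA, unitB;
    int first = unitA.newSymbol();
    int second = unitB.newSymbol();
    return first != second;
}
```

With a shared counter, IDs from different tables can no longer coincide. As written, the sketch's static counter is unsynchronized mutable state, so it would not be safe to use from several threads at once.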
I am running into the same issue with this Vulkan example converted from a Piglit test here. The above patch fixes both tests.
Reversing the two lines of declarations should not make or break the SPIR-V. I'll look into it and see why such an odd thing would be happening.
However, multiple compilation units for the same stage to generate SPIR-V is certainly under-tested/supported. I will also check the level of support for doing this. I know not all the semantics are properly error checked, but it does have the basics in place to link together the two compilation units.
Presumably changing the order of the two variables makes them have different symbol IDs so they will collide with different things when the symbol IDs are reused in the other compilation unit. It’s plausible this would either break or fix the generated code.
Yes, @bpeel I see your point. It does seem entirely possible that unique ids are being reused across compilation units, as that would not affect the "validator" aspect of glslang, only code gen.
Rechecking this, the failing case included in the issue description now works fine with master. According to git bisect, it was fixed by the following commit:
commit 41436ad2042af1ade8a415dd4f23bb3aefd26aa0
Author: John Kessenich <cepheus@frii.com>
Date:   Fri Jul 13 10:40:40 2018 -0600

    Link/SPV: Correct symbol IDs on merging ASTs to a single coherent space
Although I didn't check more complex cases similar to the ones reported, the specific case shown is fixed, so I think it would be safe to just close the issue.
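For readers wondering what "a single coherent space" could look like, here is one possible way to get there, sketched under assumptions: shift the IDs of the unit being merged past the largest ID already in use. This is not taken from the commit; `mergeWithShift` and `allUniqueAfterMerge` are hypothetical helpers, and the sketch ignores symbols that should be shared between units.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: shift every ID in `incoming` past the largest ID in `merged`,
// then append the shifted IDs so both units live in one ID space.
void mergeWithShift(std::vector<int>& merged, std::vector<int> incoming) {
    int maxId = merged.empty() ? -1 : *std::max_element(merged.begin(), merged.end());
    for (int& id : incoming)
        id += maxId + 1;
    merged.insert(merged.end(), incoming.begin(), incoming.end());
}

// Returns true when no ID appears twice after merging two units that both start at 0.
bool allUniqueAfterMerge() {
    std::vector<int> merged = {0, 1, 2}; // IDs from the first unit
    mergeWithShift(merged, {0, 1});      // second unit reuses 0 and 1
    std::sort(merged.begin(), merged.end());
    return std::adjacent_find(merged.begin(), merged.end()) == merged.end();
}
```

Unlike a shared static counter, this approach leaves each unit's table untouched and resolves the overlap only at merge time.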
Thank you, I just closed the issue.
How to reproduce:
First use this couple of pixel shaders (error-free SPIR-V case):
Your program should have no issues. Now, change the order of the lines:
to become:
in the first pixel shader and the program will crash.
The reason is the difference in the generated SPIR-V code:
SPIR-V in the 1st case
SPIR-V in the 2nd case
Differences I spotted:
OpStore %out_color %25
OpReturn
OpFunctionEnd
OpStore %a %23
OpReturn
OpFunctionEnd
The program with the 2nd shader crashes at link time on Mesa and NVIDIA; the Mesa error is:
I've noticed similar errors on mesa when there's some sort of input in the first shader, for example when a uniform is bound, or when we use varyings, so maybe these cases generate problematic entry points as well.
To try this code and see the crash quickly on Linux, try this example: https://github.com/hikiko/glquad-spirv
First run make and ./glquad; you should see a quad. Then open the file data/test.f.glsl and swap the 2 lines mentioned above. Run make clean to remove the previously generated SPIR-V, then run make and ./glquad again and you will get the crash/abort/segfault.