Open jaebaek opened 5 years ago
It's impossible to tell if there is a performance difference just by staring at the code. The driver's shader compiler will process the code, and may collapse those insertions anyway. The sequence of inserts in the unoptimized code above is actually typical for code coming from an LLVM-based compiler.
We should only do this transformation if we have evidence it actually does affect performance, or if there is another compelling reason to do so.
It turns out that, to work around a driver issue, I wrote an LLVM pass in Clspv to do this kind of rewriting (see https://github.com/google/clspv/blob/master/lib/RewriteInsertsPass.cpp#L265). However, I did not do it for compactness or performance.
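To make "this kind of rewriting" concrete, here is a minimal sketch in Python over a toy instruction model. It is not the Clspv pass; the names `Insert`, `Construct`, and `collapse_insert_chain` are hypothetical, and the sketch assumes the intermediate results of the chain have no other uses. A chain of inserts is replaced by a single construct only when every element of the composite is written somewhere in the chain.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Insert:
    """Toy stand-in for OpCompositeInsert: copy `base`, set element `index` to `value`."""
    result: str
    value: str
    base: str
    index: int


@dataclass(frozen=True)
class Construct:
    """Toy stand-in for OpCompositeConstruct: build a composite from `values`."""
    result: str
    values: Tuple[str, ...]


def collapse_insert_chain(chain: List[Insert], size: int) -> Optional[Construct]:
    """Return one Construct equivalent to the last result of `chain`, or None.

    Assumption (hypothetical model): intermediate results are used only by
    the next insert in the chain.
    """
    if not chain:
        return None
    # Each insert must operate on the result of the previous one.
    for prev, cur in zip(chain, chain[1:]):
        if cur.base != prev.result:
            return None
    latest: Dict[int, str] = {}
    for ins in chain:
        latest[ins.index] = ins.value  # a later insert overwrites an earlier one
    # If any element is never written, it still comes from the original base,
    # so the chain cannot be replaced by a construct of the inserted values.
    if set(latest) != set(range(size)):
        return None
    return Construct(chain[-1].result, tuple(latest[i] for i in range(size)))


# The three column writes from the GLSL output quoted in this thread.
chain = [
    Insert("_45", "ATTR2", "_43", 0),
    Insert("_46", "ATTR3.xyz", "_45", 1),
    Insert("_47", "_23", "_46", 2),
]
print(collapse_insert_chain(chain, 3))
```

Because all three columns are written, the undefined starting value `_43` drops out of the result entirely. Whether a driver's compiler benefits from this form is a separate question that the sketch does not answer.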
Vec4 Test_VS(VtxInput vtx, out MatrixH3 mtrx:MTRX):SV_Position
{
   mtrx[0]=vtx.nrm();
   mtrx[1]=vtx.tan();
   mtrx[2]=vtx.bin(0, 0);
   return 0;
}
Converting this to GLSL results in extremely unoptimized code:
mat3 _43;
void main()
{
   mat3 _45 = _43;
   _45[0] = ATTR2;
   mat3 _46 = _45;
   _46[1] = ATTR3.xyz;
   mediump vec3 _23 = cross(vec3(0.0), vec3(0.0)) * ATTR3.w;
   mat3 _47 = _46;
   _47[2] = vec3(_23.x, _23.y, _23.z);
   gl_Position = vec4(0.0);
   IO0 = _47;
}
Problem:
In HLSL I set up only one matrix,
but in the GLSL output every modification creates a new matrix object.
It is also initialized at the start from an invalid value: mat3 _45 = _43;
Thanks for following up, @GregSlazinski. To clarify, as @dneto0 mentioned above, performance improvements to spirv-opt are made with the goal of producing SPIR-V with better performance on the drivers that run SPIR-V. No guarantees can be made about how other tools (I assume SPIRV-Cross in this case) will translate that SPIR-V into higher-level languages. When compiling down to a low-level language and disassembling back up to a high-level language, it's very common to produce more verbose code, but that isn't evidence in itself that the compiled code would have had poor performance.
This issue was originally reported as a DXC issue.
When we compile the following HLSL with dxc -Tvs_6_0 -Emain -spirv, it generates the following SPIR-V (which is the result of spirv-opt):
The SPIR-V spec describes OpCompositeInsert as: "Make a copy of a composite object, while modifying one part of it." In my opinion, if we change the following part
to
it may have better performance. What do you think?
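The copy semantics quoted from the spec can be modeled at the value level. The sketch below is an illustration only: `composite_insert` and `composite_construct` are hypothetical Python helpers that mimic the described behavior, not SPIR-V tooling APIs, and `UNDEF` is an assumed placeholder for an undefined starting value.

```python
from typing import Any, Tuple

UNDEF = None  # assumed placeholder for an undefined element


def composite_insert(value: Any, composite: Tuple[Any, ...], index: int) -> Tuple[Any, ...]:
    """Return a copy of `composite` with element `index` replaced by `value`."""
    copy = list(composite)
    copy[index] = value
    return tuple(copy)


def composite_construct(*parts: Any) -> Tuple[Any, ...]:
    """Build a composite directly from its parts."""
    return tuple(parts)


start = (UNDEF, UNDEF, UNDEF)

# Three inserts: each step conceptually yields a new composite value.
step0 = composite_insert("nrm", start, 0)
step1 = composite_insert("tan", step0, 1)
step2 = composite_insert("bin", step1, 2)

# One construct produces the same final value with no intermediates.
direct = composite_construct("nrm", "tan", "bin")

print(step2 == direct)
```

The two forms yield the same value once every element has been written. That shows semantic equivalence only, not that one form runs faster on a given driver.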
not_optimized.txt is the one generated by DXC without optimization, i.e., dxc -Tvs_6_0 -Emain -spirv -fcgl.