Open dmpots opened 1 year ago
This is a bug we experienced since 2019. Our workaround is to have a second conditional in the loop with a second WaveReadLaneFirst around the break.
Here is something that should work and unblock you in the meantime.
while (!IsHelperLane()) // support for any shader type
{
if (a == WaveReadLaneFirst(a))
{
// it is necessary to read from the first lane
// many times repeatedly like this to prevent
// the compiler from combining the 2 conditionals
// which leads to the same problematic result
int u = WaveReadLaneFirst(a);
... your code ...
}
if (a == WaveReadLaneFirst(a))
{
break;
}
}
On the net the problem has been mentionned multiple times.
Here under the "The classic subgroup breakage – maximal convergence gone awry" section: https://themaister.net/blog/2022/04/24/my-personal-hell-of-translating-dxil-to-spir-v
Or here where we see the need for a second branch: http://shader-playground.timjones.io/2fc43f77f626c4883e1b1ed23fd8b2e1
The problem has also been reported here many times, but somehow I can find the issues with a simple search right now.
I would also like to see some movement on this because this bug has been a problem for 4 years already.
Try adding a barrier for workaround?
This is a bug we experienced since 2019. Our workaround is to have a second conditional in the loop with a second WaveReadLaneFirst around the break.
@simondeschenes
I believe this case should be fixed with the introduction of the dx.break
in #2795 (and improved with later PRs). If I try putting the break in the loop for your example, it does keep the break in the body of the loop and guards the exit with the additional dx.break
comparison.
https://shader-playground.timjones.io/54c25778c86a04519b8cb2f89e2f59bb
If you have cases where things are incorrectly moved out of a loop it would be great to file a bug. I think the old bugs are considered closed by the previously mentioned PR.
The bug I am reporting here is that the same dx.break
mechanism does not work for VS. The fix seems to currently be restricted to PS and CS.
Ok, I see dx.break.cond code in the shader playground which should be correct. My failing cases also seem to all be in the vertex shaders now. It is nice to know there is a working fix for the compute and pixel shaders passes. I just assumed that if our vertex shader was still broken it was still broken everywhere.
The workaround I posted should still allow you to have something that works in the Vertex Shader until the vertex shader is also fixed.
I don't have the time/rights/patience to go through PR and creating new tests to fix the bug. However, I think I found it and someone else could fix it and make a pull request.
In tools/clang/lib/CodeGen/CGHLSLMS.cpp
, in function EmitHLSLCondBreak
, around line 4660:
if (!m_pHLModule->GetShaderModel()->IsPS() && !m_pHLModule->GetShaderModel()->IsCS() &&
!m_pHLModule->GetShaderModel()->IsLib()) {
return CGF.Builder.CreateBr(DestBB);
}
I don't know why it is limited to LIB, PS and CS. However, this is what seems to prevent the fix from applying to vertex, geometry, hull, domain, ray, mesh and amplification shaders.
If I write a waterfall loop to implement my own version of NonUniformResourceIndex
It works fine if I compile the shader as a PS (
dxc /Tps_6_0 .\break.hlsl /DOUTPUT=SV_Target
). The buffer load is kept in the body of the loop using thedx.break
mechanism.However, if I compile the shader as a VS (
dxc /Tvs_6_0 .\break.hlsl /DOUTPUT=Z
) the buffer load is moved outside the loop body.I don't see any reason why the VS codegen should be different here.
The
dx.break
mechanism was added in #2795 to keep wave-sensitive values in a loop.