Extended Description
This bug was previously entered here: https://bugs.freedesktop.org/show_bug.cgi?id=105442
But the more I think about it, the more it looks like an LLVM bug, not a Mesa bug.
The shader (see the attachments for the TGSI, LLVM and generated asm of both the buggy and the non-buggy version) contains a loop that iterates over lights and stops when a per-light constant is set to 1: the .w component of constant [32+7+8*n], where n is the index of the light.
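To make the indexing concrete, here is a minimal Python sketch of the loop's termination test as described above. It only illustrates the addressing, not the actual shader: the names `BASE`, `STRIDE` and `find_terminating_light` are made up for this example, and the constant buffer is modelled as a plain list of (x, y, z, w) tuples.

```python
BASE = 32 + 7   # offset of the per-light flag constant, as given in the report
STRIDE = 8      # constants per light, hence the +8 on every iteration


def find_terminating_light(consts):
    """Return the index n of the first light whose flag constant has .w == 1.

    Returns None if no light in the buffer carries the flag; the real shader
    has no such bound and would simply keep looping.
    """
    n = 0
    while BASE + STRIDE * n < len(consts):
        _, _, _, w = consts[BASE + STRIDE * n]
        if w == 1.0:
            return n
        n += 1
    return None


# Hypothetical buffer with room for 4 lights; the flag is set on light 2.
consts = [(0.0, 0.0, 0.0, 0.0)] * 64
consts[BASE + STRIDE * 2] = (0.0, 0.0, 0.0, 1.0)
print(find_terminating_light(consts))
```

If an iteration is skipped or the index is advanced at the wrong point, the flagged constant can be missed, which would be consistent with the loop never terminating.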
With Mesa git and LLVM 5.0.1 the shader hangs, but it does not hang with Mesa 17.2. I first thought the problem was in Mesa itself, but it is more likely that the difference in the LLVM asm triggers the buggy LLVM behaviour.
Comparing the two versions, the generated code of the faulty shader has the following at the beginning of the loop:
    s_branch BB0_2                      ; BF820000
    v_add_f32_e32 v20, 0x41000000, v20  ; 022828FF 41000000
The non-buggy shader, in contrast, adds the 8 (0x41000000 is 8.0 as a float; we add it because we iterate over the lights) at the end of the loop.
According to the s_branch encoding (BF820000, i.e. a 16-bit offset of 0), it should do PC = PC + 4.
It is unclear to me where this jump lands. My guess is that either it lands on the v_add, so the branch is a no-op (which is bad, because we then miss the first light), or, since the v_add is 8 bytes long, it lands in the middle of that instruction and executes 41000000 as if it were an instruction rather than a constant.
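As a sanity check on the encoding, here is a small Python sketch that decodes the s_branch word and computes its target. It assumes the SOPP layout from AMD's GCN ISA manuals (9-bit prefix 0x17F, 7-bit opcode, signed 16-bit immediate counted in dwords, s_branch being opcode 2) and the documented semantics PC = PC + 4 + SIMM16*4. The helper names `decode_sopp` and `branch_target` are made up for this example.

```python
import struct

SOPP_PREFIX = 0x17F   # bits [31:23] of every SOPP instruction (assumed from the ISA manual)
OP_S_BRANCH = 2       # SOPP opcode of s_branch (assumed from the ISA manual)


def decode_sopp(word):
    """Split a 32-bit SOPP instruction word into (opcode, signed simm16)."""
    if (word >> 23) != SOPP_PREFIX:
        raise ValueError("not a SOPP-encoded instruction")
    op = (word >> 16) & 0x7F
    simm16 = word & 0xFFFF
    if simm16 & 0x8000:          # sign-extend the 16-bit immediate
        simm16 -= 0x10000
    return op, simm16


def branch_target(pc, word):
    """Byte address reached by an s_branch located at byte address pc."""
    op, simm16 = decode_sopp(word)
    if op != OP_S_BRANCH:
        raise ValueError("not an s_branch")
    return pc + 4 + simm16 * 4


op, simm16 = decode_sopp(0xBF820000)
print(op, simm16, branch_target(0, 0xBF820000))

# The literal that follows the v_add opcode, reinterpreted as a 32-bit float.
literal = struct.unpack("<f", struct.pack("<I", 0x41000000))[0]
print(literal)
```

Under these assumptions the target is the address right after the 4-byte s_branch, which is the start of the v_add rather than its 41000000 literal. That matches the first guess above, the one where the branch is effectively a no-op.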