llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[AMDGPU] Hang with gallium nine lighting shader likely caused by bad compilation #36052

Closed llvmbot closed 6 years ago

llvmbot commented 6 years ago
Bugzilla Link: 36704
Resolution: INVALID
Resolved on: Apr 02, 2018 09:54
Version: 5.0
OS: Linux
Attachments: tgsi, llvm and asm
Reporter: LLVM Bugzilla Contributor

Extended Description

This bug was previously entered here: https://bugs.freedesktop.org/show_bug.cgi?id=105442

But the more I think about it, the more it looks like it's an llvm bug, not a mesa bug.

The shader (see the attachment for the TGSI, LLVM and generated asm of both the buggy and the non-buggy shader) contains a loop that iterates over the lights. The loop stops when a specific per-light constant is set to 1: the .w component of constant [32 + 7 + 8 * light index].

With Mesa from git and LLVM 5.0.1 the shader hangs; with Mesa 17.2 it does not. I first thought the problem was specific to Mesa, but it seems more likely that the difference in the LLVM assembly triggers buggy LLVM behaviour.

Comparing the two, the generated code of the faulty shader has this at the beginning of the loop:

s_branch BB0_2 ; BF820000
v_add_f32_e32 v20, 0x41000000, v20 ; 022828FF 41000000

Whereas the non-buggy shader adds the 8 (because we iterate over the lights) at the end of the loop.
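For reference, the literal 0x41000000 in the v_add is the bit pattern of a 32-bit float. A minimal sketch that checks it matches the "8" mentioned above (the helper name `bits_to_float` is only illustrative, not taken from Mesa or LLVM):

```python
import struct

def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit integer as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# Literal operand taken from the v_add_f32_e32 line in the report.
print(bits_to_float(0x41000000))
```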

According to the s_branch encoding (BF820000, i.e. a 16-bit immediate of 0), it should do PC = PC + 4.
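The arithmetic can be sketched as follows. This is only an illustration, assuming the SOPP layout and the S_BRANCH formula PC = PC + SIMM16 * 4 + 4 from the public GCN ISA documents, with PC taken as the address of the branch itself. The function names and the example address 0x100 are made up for the sketch.

```python
def decode_sopp(word: int):
    """Split a 32-bit SOPP instruction word into (opcode, simm16)."""
    # Assumed SOPP layout: bits [31:23] = 0b101111111,
    # bits [22:16] = opcode, bits [15:0] = signed 16-bit immediate.
    if (word >> 23) != 0b101111111:
        raise ValueError("not a SOPP encoding")
    opcode = (word >> 16) & 0x7F
    simm16 = word & 0xFFFF
    if simm16 & 0x8000:          # sign-extend the immediate
        simm16 -= 0x10000
    return opcode, simm16

def branch_target(pc: int, word: int) -> int:
    """Target of an s_branch located at address pc (assumption: pc is the branch's own address)."""
    opcode, simm16 = decode_sopp(word)
    if opcode != 2:              # 2 = S_BRANCH in the assumed opcode table
        raise ValueError("not an s_branch")
    return pc + simm16 * 4 + 4

# Encoding from the report; 0x100 is an arbitrary example address.
print(decode_sopp(0xBF820000))
print(hex(branch_target(0x100, 0xBF820000)))
```

Under these assumptions the immediate is 0, so the target is the address immediately after the 4-byte s_branch, which is where the v_add starts.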

It is unclear to me where this jump lands. My guess is that either it lands on the v_add, so the branch is a no-op (which is bad because we miss the first light), or, since the v_add instruction is 8 bytes long, it lands in the middle of that instruction and executes 41000000 as if it were an instruction rather than a constant.

llvmbot commented 6 years ago

The bug was a Mesa bug, not an LLVM bug. More details are in the Mesa bug report.