microsoft / DirectXShaderCompiler

This repo hosts the source for the DirectX Shader Compiler which is based on LLVM/Clang.
Other
3.11k stars 698 forks source link

[SPIR-V] Specifying max number of loops for unroll attribute is ignored #6852

Open Fletterio opened 3 months ago

Fletterio commented 3 months ago

Description When unrolling a loop, if I pass a value to it to specify the max number of iterations the loop should execute (as specified here), the wrong SPIR-V gets generated

Steps to Reproduce See these godbolts: https://godbolt.org/z/8eb4crMvd https://godbolt.org/z/shTjrbToW

Actual Behavior In the first godbolt I specify a loop with a number of iterations given by an integer literal FOO_VALUE. I also add an attribute for it to unroll and specify a max number of MAX_ITER iterations. You can check by playing around that no matter what value you set for MAX_ITER, the loop WILL be executed FOO_VALUE times (you can see this in line 18 of the generated SPIR-V). The first godbolt helps showcase that the behaviour is wrong.

In the second godbolt I instead get the loop's end condition at runtime, so now the compiler doesn't run the loop at compile time and slap the result of the loop in the generated SPIR-V. You can see in line 26 of the generated SPIR-V the the only Loop Control Parameters generated for OpLoopMerge is Unroll.

I'd expect the parameters to also have MaxIterations followed by MAX_ITER

s-perron commented 3 months ago

For the first case where you have

    [unroll(5)]
    for(uint i = 0; i < 10; i++)
        foo++;

I suggest that DXC is generating is reasonable. The MaxIterations is suppose to tell us the maximum number iterations the loop will actually execute, and in this case it is not true. The loop conditons say it will execute 10 times, but the max is 5. The compiler can see that, so it is free (and reasonable) to ignore the hint.

I think this test case has undefined behaviour because the max number of iteration is wrong. If a compiler performed an optimization that changed the code assuming the max iterations is correct, the compiler would not have done anything wrong, but it would change the results.

For the second example, we could add the MaxIterations loop control if targeting vulkan1.2 or later. It requires spir-v 1.4. However, that is not required. Some optimizations might be missed in the driver, but it is still functionally correct.

The maintainers will not fix this in DXC. We will try to do it in clang. However, if someone else wants implement it, we would accept it.