microsoft / DirectXShaderCompiler

This repo hosts the source for the DirectX Shader Compiler which is based on LLVM/Clang.
Other
3.04k stars 677 forks source link

Various `RWByteAddressBuffer` interlocked intrinsics do not properly select signed overload #5680

Open jeremyong opened 1 year ago

jeremyong commented 1 year ago

Description

First, the documentation for the various interlocked operations on RWByteAddressBuffer types erroneously describe a single overload only accepting UINT value and original parameters (in reality, signed INT parameters work also). The documentation later does say:

This operation can only be performed on int and uint typed resources and shared memory variables.

However, I believe there is an issue at the moment that results in the wrong unsigned opcode generated, where a signed opcode was expected instead when literals are used.

Steps to Reproduce

Here's an example snippet on CE, reproduced below, as demonstrated specifically for InterlockedMin although other ops exhibit the same issue:

[numthreads(1, 1, 1)]
void main()
{
    int imin = -3;
    uint umin = 3;

    int ioriginal;
    int uoriginal;

    // AtomicBinOp: 
    // i32 - opcode
    // %dx.types.Handle - resource handle
    // i32 - binary op code (imin is 4, umin is 6)
    // i32 - coordinate c0
    // i32 - coordinate c1
    // i32 - coordinate c2
    // i32 - new value

    // Offsets are just to make the op in the IL easier to find
    RWByteAddressBuffer b0 = ResourceDescriptorHeap[0];
    // Uses opcode 6 (umin) despite signed int arguments
    b0.InterlockedMin(12, 3, ioriginal);
    // Uses opcode 4 (imin) because somehow, the sign affected the interpretation of the literal
    b0.InterlockedMin(16, -3, ioriginal);
    // Uses opcode 6 (umin) as expected
    b0.InterlockedMin(20, umin, uoriginal);
    // Uses opcode 4 (imin) as expected
    b0.InterlockedMin(24, imin, ioriginal);
}

Note the problematic line that supplies 3 as the value and ioriginal as the third argument results in an unsigned min.

I also noticed that mixed types are permissible for the second and third arguments, which would preferably result in a compiler error instead IMO.

Environment

llvm-beanz commented 1 year ago

This looks awful, but is the expected behavior in HLSL. In C all literals are signed unless explicitly specified to be unsigned. In HLSL, literals resolve to unsigned unless there is a sign specifier. This unfortunate since neither HLSL or C have a literal suffix to specify signed literals, so the only think you can do to work around it is cast the literal to int.

Fixing the way literals are resolved would have much wider impact on the language than just this use case, so we can't do it without a language version change. I would like to change our literals to match C rules, and we will get there someday.

llvm-beanz commented 1 year ago

I should also note, that parameter resolution is left-to-right on functions. During overload resolution we select two possible overloads (singed and unsigned), both overloads require casting one argument so they should end up being equally scored. First one should get a score bump (left-to-right).

Adopting C++ overload resolution could also help in cases like this.

llvm-beanz commented 1 year ago

I'm re-opening this after some internal discussion. Our ASTs are kinda awful lies in these cases and it causes unexpected (and unwanted) casts to occur. We should fix the AST. That may not resolve all the issues here, but it might help.

llvm-beanz commented 1 year ago

Tagged as hlsl-next because this also might overlap with some of the planned future changes to overload resolution and literal types.

jeremyong commented 1 year ago

Thanks I did confirm that this works as expected:

    b0.InterlockedMin(12, (int)3, ioriginal);

Will need to remember that for now moving forward.

jeremyong commented 1 year ago

@llvm-beanz Instead of changing the default type for literals, would another reasonable smaller fix be to enforce that the types of value and original in the interlocked intrinsics match? That way, we'd see a compiler error instead of resorting to the awkward cast-scoring ADL behavior.