Sergio0694 / ComputeSharp

A .NET library to run C# code in parallel on the GPU through DX12, D2D1, and dynamically generated HLSL compute and pixel shaders, with the goal of making GPU computing easy to use for all .NET developers! 🚀

FXC/D3DCompile() bug causes float literals to be emitted as the wrong value (CS.D2D1) #780

Closed: rickbrew closed this issue 1 month ago

rickbrew commented 4 months ago

This appears to be a bug in how fxc / D3DCompile is parsing float literals, but the workaround seems easy enough. Link to ridiculously long Discord conversation: https://discord.com/channels/590611987420020747/996417435374714920/1216522083626913922

tl;dr: Float literals should always be emitted as asfloat(uint_representation_of_float_value) instead of as actual float literals. This works around a bug in the shader compiler. (Alternatively, they can be emitted as a double literal cast to float, e.g. (float)1234.56L. The shader compiler is smart enough to emit the value directly without actually using doubles. Edit: but only when optimizations are enabled!)
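
As a rough illustration of what that workaround looks like (a minimal sketch of my own, not the generator's actual code), the bit pattern for the asfloat(...) form can be obtained with BitConverter:

using System;

// Sketch: turn a float literal into the exact "asfloat(bits)" form described above,
// so the shader compiler never has to parse the decimal literal itself.
float value = 131072.65f;

// Exact bit pattern of the nearest float to 131072.65 (1207959594, i.e. 0x4800002A).
uint bits = BitConverter.SingleToUInt32Bits(value);

// HLSL fragment that reproduces the value exactly: "asfloat(1207959594u)".
string hlslLiteral = $"asfloat({bits}u)";

Console.WriteLine(hlslLiteral);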

Debugging this consumed my entire day, and the bug is probably causing all sorts of small errors in many of the shaders people have written for CS.D2D1. My guess is that regular ComputeSharp (non-D2D1) is not affected, since it doesn't use fxc / D3DCompile().

Consider these two shaders, which are identical except that the x and y components of the return value are swapped:

[D2DInputCount(0)]
[D2DGeneratedPixelShaderDescriptor]
internal readonly partial struct BadShader1
    : ID2D1PixelShader
{
    public float4 Execute()
    {
        return new float4(131072.65f, (float)131072.65, 0.0f, 1.0f);
    }
}

[D2DInputCount(0)]
[D2DGeneratedPixelShaderDescriptor]
internal readonly partial struct BadShader2
    : ID2D1PixelShader
{
    public float4 Execute()
    {
        return new float4((float)131072.65, 131072.65f, 0.0f, 1.0f);
    }
}

The HLSL that is generated is fine:

        /// <inheritdoc/>
        [global::System.CodeDom.Compiler.GeneratedCode("ComputeSharp.D2D1.D2DPixelShaderDescriptorGenerator", "3.0.0.0")]
        [global::System.Diagnostics.DebuggerNonUserCode]
        [global::System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverage]
        static string global::ComputeSharp.D2D1.Descriptors.ID2D1PixelShaderDescriptor<BadShader1>.HlslSource =>
            """
            #define D2D_INPUT_COUNT 0

            #include "d2d1effecthelpers.hlsli"

            D2D_PS_ENTRY(Execute)
            {
                return float4(131072.66, (float)131072.65L, 0.0, 1.0);
            }
            """;
...
        /// <inheritdoc/>
        [global::System.CodeDom.Compiler.GeneratedCode("ComputeSharp.D2D1.D2DPixelShaderDescriptorGenerator", "3.0.0.0")]
        [global::System.Diagnostics.DebuggerNonUserCode]
        [global::System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverage]
        static string global::ComputeSharp.D2D1.Descriptors.ID2D1PixelShaderDescriptor<BadShader2>.HlslSource =>
            """
            #define D2D_INPUT_COUNT 0

            #include "d2d1effecthelpers.hlsli"

            D2D_PS_ENTRY(Execute)
            {
                return float4((float)131072.65L, 131072.66, 0.0, 1.0);
            }
            """;

When running these shaders and reading the results back on the CPU, the value produced by the plain float literal is wrong (the X value from the first shader, or the Y value from the second shader):

[screenshot: readback of the two shaders' output values]

The float literal round-trips as 131072.703125 instead of 131072.656250. The double-cast-to-float value is fine (the shader compiler emits it directly, without actually using doubles).

Not shown here is that Hlsl.AsFloat(1207959594U) also works fine (1207959594U being 131072.65 bit-cast to a uint).
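
As a quick sanity check (my own sketch, not part of the issue), the bit patterns confirm where 1207959594U comes from and show that the bad value is only a few ULPs away:

using System;

float expected = 131072.65f;      // rounds to 131072.65625f, the correct value
float observed = 131072.703125f;  // the value the compiled shader actually returns

uint expectedBits = BitConverter.SingleToUInt32Bits(expected); // 1207959594 (0x4800002A)
uint observedBits = BitConverter.SingleToUInt32Bits(observed); // 1207959597 (0x4800002D)

Console.WriteLine($"expected: {expected:R} (0x{expectedBits:X8})");
Console.WriteLine($"observed: {observed:R} (0x{observedBits:X8})");
Console.WriteLine($"off by {observedBits - expectedBits} ULPs");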

I was able to determine that the bytecode is actually different, and that the value emitted by D3DCompile() is just wrong: https://discord.com/channels/590611987420020747/996417435374714920/1216542121297973369

[screenshot: comparison of the compiled bytecode]

So:

1. The float literal is bad. The shader compiler emits the wrong value into the bytecode (131072.703125).
2. The double literal cast to float is fine. The shader compiler emits the correct value (131072.656250) and does not actually use double precision instructions. I don't know if that's a 100% guarantee though; it's just what happened with this particular code.
3. The bit-cast from uint to float is fine. The shader compiler emits the correct value (131072.656250). (Sorry, this one isn't in the screenshots; there's just too much to juggle here and I don't want to go recreate all of them.)
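
For anyone affected before a fix lands, here is a sketch of the manual workaround (the shader name is mine and purely illustrative; Hlsl.AsFloat is the API the notes above already confirm works):

using ComputeSharp;
using ComputeSharp.D2D1;

[D2DInputCount(0)]
[D2DGeneratedPixelShaderDescriptor]
internal readonly partial struct WorkaroundShader : ID2D1PixelShader
{
    public float4 Execute()
    {
        // 1207959594U is 131072.65f bit-cast to a uint; bit-casting it back with
        // Hlsl.AsFloat sidesteps the compiler's float literal parsing entirely.
        return new float4(Hlsl.AsFloat(1207959594U), 0.0f, 0.0f, 1.0f);
    }
}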