gfx-rs / wgpu

A cross-platform, safe, pure-Rust graphics API.
https://wgpu.rs
Apache License 2.0
12.2k stars 897 forks source link

Non uniform indices are not always properly handled by naga #6270

Open hasenbanck opened 1 week ago

hasenbanck commented 1 week ago

Description We had an off error that only occured on AMD GPUs (Vega, RDNA1 and RDNA2), where under Vulkan and DX12 textures where rendered with the wrong texel information (see attached screenshot). We use the "SAMPLED_TEXTURE_AND_STORAGE_BUFFER_ARRAY_NON_UNIFORM_INDEXING" feature to select textures based on non-uniform vertex data inside the fragment shader. We found out, that the AMD devices sometimes used the wrong texture for some texels. We debugged the problem under Vulkan and saw, that sometimes in the SPIRV output the indexes used for indexing the binding array witht the texture where not properly markes as non-uniform indexes.

We traces the problem back to naga, which will not emmit the necessary marker for non-unified indexes when using fragment shader with no wrapper struct for the function parameter. We reproduces the problem using the texture_arrays example.

Repro steps Use the current trunk branch (eb18854b46a4d6c973a1981be6739ec19282f2f3) and remove the wrapper struct for the fragment input:

@group(0) @binding(0)
var texture_array_top: binding_array<texture_2d<f32>>;
@group(0) @binding(1)
var texture_array_bottom: binding_array<texture_2d<f32>>;
@group(0) @binding(2)
var sampler_array: binding_array<sampler>;

@fragment
fn non_uniform_main(
    @location(0) tex_coord: vec2<f32>,
    @location(1) index: i32,
) -> @location(0) vec4<f32> {
    var outval: vec3<f32>;
    if tex_coord.y <= 0.5 {
        outval = textureSampleLevel(
            texture_array_top[index],
            sampler_array[index],
            tex_coord,
            0.0
        ).rgb;
    } else {
        outval = textureSampleLevel(
            texture_array_bottom[index],
            sampler_array[index],
            tex_coord,
            0.0
        ).rgb;
    }

    return vec4<f32>(outval.x, outval.y, outval.z, 1.0);
}

This will result in SPIRV code which does not have the expected non-uniform index marker in it. It looks like this in HLSL when de-compiled with SPIRV cross:

void frag_main()
{
    float3 outval = 0.0f.xxx;
    if (tex_coord.y <= 0.5f)
    {
        outval = texture_array_top[index].SampleLevel(sampler_array[index], tex_coord, 0.0f).xyz;
    }
    else
    {
        outval = texture_array_bottom[index].SampleLevel(sampler_array[index], tex_coord, 0.0f).xyz;
    }
    _29 = float4(outval.x, outval.y, outval.z, 1.0f);
}

Normally the indexes should be marked as non-uniform indexes like (using the original texture_arrays code):

void frag_main()
{
    float3 outval = 0.0f.xxx;
    FragmentInput _24 = { tex_coord, index };
    if (_24.tex_coord.y <= 0.5f)
    {
        outval = texture_array_top[NonUniformResourceIndex(_24.index)].SampleLevel(sampler_array[NonUniformResourceIndex(_24.index)], _24.tex_coord, 0.0f).xyz;
    }
    else
    {
        outval = texture_array_bottom[NonUniformResourceIndex(_24.index)].SampleLevel(sampler_array[NonUniformResourceIndex(_24.index)], _24.tex_coord, 0.0f).xyz;
    }
    _31 = float4(outval.x, outval.y, outval.z, 1.0f);
}

It is also possible to skip the wgpu part and directly convert both WGSL shader versions with the help of naga-cli:

naga non_uniform_indexing.wgsl non_uniform_indexing.hlsl

Expected vs observed behavior Non-uniform indexes should properly marked as such in the resulting shader modules.

Extra materials XInRsc6

Platform WGPU stable (22.1) and WGPU trunk. Nvidia and Intel devices will accept the SPIRV modules and render them without any problem, but AMD devices on either Windows and Linux will render with artifacts (see screenshot).

magcius commented 6 days ago

The bug is that naga's analyzer considers flat interpolants (which integers are by default) to be uniform, which is just plain wrong. There is nothing that says that something flat across the triangle will be subgroup uniform. https://github.com/gfx-rs/wgpu/blob/eb18854b46a4d6c973a1981be6739ec19282f2f3/naga/src/valid/analyzer.rs#L604-L608

cwfitzgerald commented 6 days ago

That's a big oops!