KhronosGroup / SPIRV-Cross

SPIRV-Cross is a practical tool and library for performing reflection on SPIR-V and disassembling SPIR-V back to high level languages.
Apache License 2.0
2k stars 555 forks source link

Texture reads are duplicated in MSL output when used multiple times #787

Closed zeux closed 5 years ago

zeux commented 5 years ago

I'm not sure whether this affects performance - if the Metal compiler does LICM perfectly then it shouldn't - but it definitely affects readability and I don't feel safe trusting that LICM for specifically texture reads works correctly :)

On a simple shader like this:

Texture2D<int4> tex;

struct Foo
{
    float4 a;
    float4 b;
};

cbuffer buf
{
    Foo results[16];
};

float4 main(float4 pos: SV_Position): SV_Target0
{
    int4 coords = tex.Load(pos.xy);
    Foo foo = results[coords.x % 16];

    return foo.a + foo.b;
}

Compiling the shader as follows:

glslangvalidator -S frag -D -V -e main test.hlsl
spirv-cross --msl frag.spv 

I get the following output:

#include <metal_stdlib>
#include <simd/simd.h>

using namespace metal;

struct Foo
{
    float4 a;
    float4 b;
};

struct buf
{
    Foo results[16];
};

struct main0_out
{
    float4 _entryPointOutput [[color(0)]];
};

fragment main0_out main0(constant buf& _43 [[buffer(0)]], texture2d<int> tex [[texture(0)]], float4 gl_FragCoord [[position]])
{
    main0_out out = {};
    out._entryPointOutput = _43.results[tex.read(uint2(int2(gl_FragCoord.xy).x), int2(gl_FragCoord.xy).y).x % 16].a + _43.results[tex.read(uint2(int2(gl_FragCoord.xy).x), int2(gl_FragCoord.xy).y).x % 16]
.b;
    return out;
}

Instead of this I'm expecting that tex.read results will be stored into a local variable.

HansKristian-Work commented 5 years ago

Ok, reproduced. I believe this is because a use of OpAccessChain does not propagate an expression use to its dependent expressions.

HansKristian-Work commented 5 years ago

On deeper analysis it's more complicated than that. This affects OpLoad as well. I'm working on adding a more robust expression tracking flow for cases like Load/AccessChain where subexpressions are embedded into another expression and we need to track expression reads independently.

HansKristian-Work commented 5 years ago

Should be fixed now. If you still see similar cases for more complex shaders, please create a new issue.