KhronosGroup / MoltenVK

MoltenVK is a Vulkan Portability implementation. It layers a subset of the high-performance, industry-standard Vulkan graphics and compute API over Apple's Metal graphics framework, enabling Vulkan applications to run on macOS, iOS and tvOS.
Apache License 2.0
4.64k stars 402 forks source link

SPIR-V to MSL conversion error in MoltenVK but not when using spirv-cross command line #2180

Closed RefuX closed 1 month ago

RefuX commented 3 months ago

Disclaimer: Sorry for any incorrect assumptions, this is my first time trying to debug SPIR-V issues.

I'm having a go at getting multi_draw_indirect in Vulkan-Samples working, and I'm running into a issue where I'm getting this error: SPIR-V to MSL conversion error: Argument buffer resource base type could not be determined. When padding argument buffer elements, all descriptor set resources must be supplied with a base type by the app.

I turned on the logging in MoltenVK (MVK_CONFIG_DEBUG=1, MVK_CONFIG_LOG_LEVEL=4) and see I'm getting the error on the multi_draw_indirect.frag file:

[mvk-info] Compiling Metal shader with FastMath enabled.
[mvk-error] SPIR-V to MSL conversion error: Argument buffer resource base type could not be determined. When padding argument buffer elements, all descriptor set resources must be supplied with a base type by the app.
[mvk-info] Converting SPIR-V:
; SPIR-V
; Version: 1.0
; Generator: Khronos Glslang Reference Front End; 11
; Bound: 51
; Schema: 0
               OpCapability Shader
          %1 = OpExtInstImport "GLSL.std.450"
               OpMemoryModel Logical GLSL450
               OpEntryPoint Fragment %4 "main" %9 %18 %28
               OpExecutionMode %4 OriginUpperLeft
               OpSource GLSL 460
               OpName %4 "main"
               OpName %9 "o_color"
               OpName %16 "textures"
               OpName %18 "in_texture_index"
               OpName %28 "in_uv"
               OpDecorate %9 Location 0
               OpDecorate %16 DescriptorSet 0
               OpDecorate %16 Binding 1
               OpDecorate %18 Flat
               OpDecorate %18 Location 2
               OpDecorate %28 Location 1
          %2 = OpTypeVoid
          %3 = OpTypeFunction %2
          %6 = OpTypeFloat 32
          %7 = OpTypeVector %6 4
          %8 = OpTypePointer Output %7
          %9 = OpVariable %8 Output
         %10 = OpTypeImage %6 2D 0 0 0 1 Unknown
         %11 = OpTypeSampledImage %10
         %12 = OpTypeInt 32 0
         %13 = OpConstant %12 225
         %14 = OpTypeArray %11 %13
         %15 = OpTypePointer UniformConstant %14
         %16 = OpVariable %15 UniformConstant
         %17 = OpTypePointer Input %12
         %18 = OpVariable %17 Input
         %23 = OpTypePointer UniformConstant %11
         %26 = OpTypeVector %6 2
         %27 = OpTypePointer Input %26
         %28 = OpVariable %27 Input
         %36 = OpConstant %6 1.5
         %37 = OpTypeVector %6 3
         %41 = OpConstant %12 0
         %42 = OpTypePointer Output %6
         %45 = OpConstant %12 1
         %48 = OpConstant %12 2
          %4 = OpFunction %2 None %3
          %5 = OpLabel
         %19 = OpLoad %12 %18
         %20 = OpConvertUToF %6 %19
         %21 = OpExtInst %6 %1 Round %20
         %22 = OpConvertFToU %12 %21
         %24 = OpAccessChain %23 %16 %22
         %25 = OpLoad %11 %24
         %29 = OpLoad %26 %28
         %30 = OpImageSampleImplicitLod %7 %25 %29
         %31 = OpCompositeExtract %6 %30 0
         %32 = OpCompositeExtract %6 %30 1
         %33 = OpCompositeExtract %6 %30 2
         %34 = OpCompositeExtract %6 %30 3
         %35 = OpCompositeConstruct %7 %31 %32 %33 %34
               OpStore %9 %35
         %38 = OpLoad %7 %9
         %39 = OpVectorShuffle %37 %38 %38 0 1 2
         %40 = OpVectorTimesScalar %37 %39 %36
         %43 = OpAccessChain %42 %9 %41
         %44 = OpCompositeExtract %6 %40 0
               OpStore %43 %44
         %46 = OpAccessChain %42 %9 %45
         %47 = OpCompositeExtract %6 %40 1
               OpStore %46 %47
         %49 = OpAccessChain %42 %9 %48
         %50 = OpCompositeExtract %6 %40 2
               OpStore %49 %50
               OpReturn
               OpFunctionEnd

End SPIR-V

SPIR-V to MSL conversion error: Argument buffer resource base type could not be determined. When padding argument buffer elements, all descriptor set resources must be supplied with a base type by the app.

Partially converted MSL:

End MSL

Estimated original GLSL:
#version 460

layout(set = 0, binding = 1) uniform sampler2D textures[225];

layout(location = 0) out vec4 o_color;
layout(location = 2) flat in uint in_texture_index;
layout(location = 1) in vec2 in_uv;

void main()
{
    o_color = vec4(texture(textures[uint(round(float(in_texture_index)))], in_uv));
    vec4 _38 = o_color;
    vec3 _40 = _38.xyz * 1.5;
    o_color.x = _40.x;
    o_color.y = _40.y;
    o_color.z = _40.z;
}

End GLSL

So knowing very little about SPIR-V to MSL, but from reading some bugs on MoltenVK, it seems like often spriv-cross can be the issue. So I downloaded and built the spirv-cross project.

I then tried: glslangValidator -H -V -o multi_draw_indirect.spv multi_draw_indirect.frag ./spirv-cross --msl --msl-version 20100 --msl-argument-buffers multi_draw_indirect.spv --output multi_draw_indirect.msl

Which generates the msl file:

#include <metal_stdlib>
#include <simd/simd.h>

using namespace metal;

struct spvDescriptorSetBuffer0
{
    array<texture2d<float>, 225> textures [[id(0)]];
    array<sampler, 225> texturesSmplr [[id(225)]];
};

struct main0_out
{
    float4 o_color [[color(0)]];
};

struct main0_in
{
    float2 in_uv [[user(locn1)]];
    uint in_texture_index [[user(locn2)]];
};

fragment main0_out main0(main0_in in [[stage_in]], constant spvDescriptorSetBuffer0& spvDescriptorSet0 [[buffer(0)]])
{
    main0_out out = {};
    uint _22 = uint(round(float(in.in_texture_index)));
    out.o_color = float4(spvDescriptorSet0.textures[_22].sample(spvDescriptorSet0.texturesSmplr[_22], in.in_uv));
    float4 _38 = out.o_color;
    float3 _40 = _38.xyz * 1.5;
    out.o_color.x = _40.x;
    out.o_color.y = _40.y;
    out.o_color.z = _40.z;
    return out;
}

I'm using: MoltenVK version 1.2.7, supporting Vulkan version 1.2.275.

Thanks for any pointers.

billhollings commented 3 months ago

Thanks for reporting this, and thanks for trying it both ways, with MoltenVK in the picture, and with SPIRV-Cross alone.

Looks like either MoltenVK is not setting something correctly, or there is a bug in SPIRV-Cross in how the MoltenVK settings are handled.

stripe2933 commented 2 months ago

Hello, still suffering this issue in Vulkan 1.3.280 for using Metal argument buffer. Is there any progress about it? Thanks.

AlexanderDevaikinEnscape commented 2 months ago

+1 from me. I was stuck with MoltenVK 1.2.4 and Vulkan 1.3.261 for another reason, now when I can upgrade a few shaders fail to compile. Both 1.3.275 and 1.3.280 have this issue.

rcaridade145 commented 2 months ago

I notice that on MKVPipeline

shaderConfig.options.mslOptions.pad_argument_buffer_resources = useMetalArgBuff;

https://github.com/KhronosGroup/MoltenVK/blob/6c68ba1e0cef5ca9ecc177ade38cadf601108f55/MoltenVK/MoltenVK/GPUObjects/MVKPipeline.mm#L1732

On spirv-cross if that option is not set that warning is not shown

SRSaunders commented 2 months ago

1732 shaderConfig.options.mslOptions.pad_argument_buffer_resources = useMetalArgBuff;

I can confirm that commenting out this line allows the Vulkan descriptorindexing example to run with metal argument buffers enabled (see issue #2216).

However, there can be visual artifacts present such as missing cube faces, so while this change is an excellent clue, it is likely not the whole story. Screenshot 2024-04-30 at 9 12 08 AM

rcaridade145 commented 2 months ago

The debug could be a bit better on the spirv-cross side https://github.com/KhronosGroup/SPIRV-Cross/blob/main/spirv_msl.cpp#L73 It should print the value of binding.basetype

enum BaseType { Unknown, Void, Boolean, SByte, UByte, Short, UShort, Int, UInt, Int64, UInt64, AtomicCounter, Half, Float, Double, Struct, Image, SampledImage, Sampler, AccelerationStructure, RayQuery, ControlPointArray, Interpolant, Char };

The ones that are not caught are :

           Struct
           AccelerationStructure
    RayQuery,
    // Keep internal types at the end.
    ControlPointArray,
    Interpolant,
    Char
billhollings commented 2 months ago

PR #2231 should fix this issue.

Please retest with the latest MoltenVK, and close this issue if it fixes the problem

SRSaunders commented 2 months ago

This solves the problem for Sascha Willems' Vulkan descriptorindexing example. Thanks for fixing.

AlexanderDevaikinEnscape commented 2 months ago

I can confirm it solves problem of the Vulkan descriptorindexing example, but does not solve my shader compilation issue. Both old and new version produce identical output resulting in the same shader compilation error. I'll see if I can collect more details and/or share the shader here later today.

AlexanderDevaikinEnscape commented 2 months ago

I have tracked it down to the problematic image in one compute shader. It's ImgSrc who is guilty - if I use sampler2D instead of image2D it works fine. Here is the minimalistic repro shader:

layout(local_size_x = 32, local_size_y = 8, local_size_z = 1) in;

layout(r32f, binding = auto) readonly uniform image2D ImgSrc;
layout(r32f, binding = auto) writeonly uniform image2D ImgDst[6];

void main( void )
{
    float data = imageLoad(ImgSrc, clamp(ivec2(16), ivec2(0), ivec2(128))).x;
}

And debug dump:

[mvk-info] Compiling Metal shader with FastMath enabled.
[mvk-error] SPIR-V to MSL conversion error: Argument buffer resource base type could not be determined. When padding argument buffer elements, all descriptor set resources must be supplied with a base type by the app.
[mvk-info] Converting SPIR-V:
; SPIR-V
; Version: 1.4
; Generator: Google Shaderc over Glslang; 11
; Bound: 31
; Schema: 0
               OpCapability Shader
          %1 = OpExtInstImport "GLSL.std.450"
               OpMemoryModel Logical GLSL450
               OpEntryPoint GLCompute %4 "main" %11 %30
               OpExecutionMode %4 LocalSize 32 8 1
               OpSource GLSL 460
               OpSourceExtension "GL_EXT_debug_printf"
               OpSourceExtension "GL_GOOGLE_cpp_style_line_directive"
               OpSourceExtension "GL_GOOGLE_include_directive"
               OpName %4 "main"
               OpName %8 "data"
               OpName %11 "ImgSrc"
               OpName %30 "ImgDst"
               OpDecorate %11 DescriptorSet 0
               OpDecorate %11 Binding 0
               OpDecorate %11 NonWritable
               OpDecorate %26 BuiltIn WorkgroupSize
               OpDecorate %30 DescriptorSet 0
               OpDecorate %30 Binding 1
               OpDecorate %30 NonReadable
          %2 = OpTypeVoid
          %3 = OpTypeFunction %2
          %6 = OpTypeFloat 32
          %7 = OpTypePointer Function %6
          %9 = OpTypeImage %6 2D 0 0 0 2 R32f
         %10 = OpTypePointer UniformConstant %9
         %11 = OpVariable %10 UniformConstant
         %13 = OpTypeInt 32 1
         %14 = OpTypeVector %13 2
         %15 = OpConstant %13 16
         %16 = OpConstantComposite %14 %15 %15
         %17 = OpTypeVector %6 4
         %19 = OpTypeInt 32 0
         %20 = OpConstant %19 0
         %22 = OpTypeVector %19 3
         %23 = OpConstant %19 32
         %24 = OpConstant %19 8
         %25 = OpConstant %19 1
         %26 = OpConstantComposite %22 %23 %24 %25
         %27 = OpConstant %19 6
         %28 = OpTypeArray %9 %27
         %29 = OpTypePointer UniformConstant %28
         %30 = OpVariable %29 UniformConstant
          %4 = OpFunction %2 None %3
          %5 = OpLabel
          %8 = OpVariable %7 Function
         %12 = OpLoad %9 %11
         %18 = OpImageRead %17 %12 %16
         %21 = OpCompositeExtract %6 %18 0
               OpStore %8 %21
               OpReturn
               OpFunctionEnd

End SPIR-V

SPIR-V to MSL conversion error: Argument buffer resource base type could not be determined. When padding argument buffer elements, all descriptor set resources must be supplied with a base type by the app.

Partially converted MSL:

End MSL

Estimated original GLSL:
#version 460
layout(local_size_x = 32, local_size_y = 8, local_size_z = 1) in;

layout(set = 0, binding = 0, r32f) uniform readonly image2D ImgSrc;
layout(set = 0, binding = 1, r32f) uniform writeonly image2D ImgDst[6];

void main()
{
    float data = imageLoad(ImgSrc, ivec2(16)).x;
}

End GLSL

From the fix to spirv-cross I don't see how image2D vs. sampler2D should play any role though.

billhollings commented 2 months ago

glslang as used by the MVKShaderConverter tool, does not support auto, so I set the GLSL bindings to 0 & 1.

If I run your latest GLSL compute shader through the MVKShaderConverter tool, but with the tool hacked to enable padding, and to provide the two descriptor bindings (type, desc set, binding, count): (Image, 0, 0, 1) + (Image, 0, 1, 6), everything works.

Which leads me to wonder if you have a type and binding mismatch between your Vulkan descriptors and your shaders.

In the small compute shader example, how are your descriptor set layout bindings defined in Vulkan?

AlexanderDevaikinEnscape commented 1 month ago

Descriptor set layout looks valid to me:

Binding  0: 'ImgSrc' Type=StorageImage Count=1 Stages:{ Compute }
Binding  1: 'ImgDst' Type=StorageImage Count=6 Stages:{ Compute }

Tried to have only one binding and it solves the problem. No matter if ImgSrc or ImgDst is present - when it's the only binding in shader everything is fine. Descriptor set layout looks expected as follows for those two cases:

Binding  0: 'ImgSrc' Type=StorageImage Count=1 Stages:{ Compute }

and

 Binding  0: 'ImgDst' Type=StorageImage Count=6 Stages:{ Compute }
AlexanderDevaikinEnscape commented 1 month ago

I'll try to recreate it with descriptorindexing sample maybe.

AlexanderDevaikinEnscape commented 1 month ago

So I was able to create a repro case based on descriptorindexing example. Everything worked well until I've switched to image2D from sampler2D.

The code with the changes I've made is here: https://github.com/SaschaWillems/Vulkan/compare/master...AlexanderDevaikinEnscape:Vulkan:ArgumentBufferPaddingReproCase

GabeRundlett commented 1 month ago

I have the same (or similar) issue. Our Vulkan abstraction (Daxa) uses a bindless approach. Even if the image array is not used, the crash happens inside spirv-cross. I see bill said there was some fix in the latest MoltenVK. I believe I'm using the latest MoltenVK, as I updated it and the .dylib aliases the new 1.3.283.dylib. Here's my simple repro shader:

#extension GL_KHR_memory_scope_semantics : enable
#extension GL_EXT_buffer_reference : enable
#extension GL_EXT_buffer_reference2 : enable
#extension GL_EXT_scalar_block_layout : require

// This is the offending line
layout(rgba8, binding = 1, set = 1) uniform image2D my_image2D_table[];

struct MyVertex {
    vec3 position;
    vec3 color;
};
layout(buffer_reference, scalar, buffer_reference_align = 4) buffer BufferPtrMyVertex { readonly MyVertex value; };
layout(push_constant, scalar) uniform Push { BufferPtrMyVertex vertices; };

#if DAXA_SHADER_STAGE == 1 // VERTEX

layout(location = 0) out vec3 v_col;
void main() {
    MyVertex vert = (vertices + gl_VertexIndex).value;
    gl_Position = vec4(vert.position, 1);
    v_col = vert.color;
}

#elif DAXA_SHADER_STAGE == 4 // FRAGMENT

layout(location = 0) in vec3 v_col;
layout(location = 0) out vec4 color;
void main() {
    color = vec4(v_col, 1);
}

#endif

and the generated spirv binaries: spv.zip

if I comment out the problematic line, the shader runs just as expected (matches Windows/Linux):

Screenshot 2024-05-21 at 2 37 24 PM

When compiling a pipeline with the problematic SPIR-V, I don't get any validation messages, just a thrown exception. The call-stack led me to this line in spirv-cross, which is how I came to find this issue.

billhollings commented 1 month ago

So I was able to create a repro case based on descriptorindexing example. Everything worked well until I've switched to image2D from sampler2D.

The code with the changes I've made is here: SaschaWillems/Vulkan@master...AlexanderDevaikinEnscape:Vulkan:ArgumentBufferPaddingReproCase

Thanks. That was a very helpful modified sample.

PR #2243 should fix this. Please retest with latest MoltenVK, and close this issue if it is fixed.

AlexanderDevaikinEnscape commented 1 month ago

That works! You are awesome! But I cannot close this issue - someone with access rights should do it.