KhronosGroup / MoltenVK

MoltenVK is a Vulkan Portability implementation. It layers a subset of the high-performance, industry-standard Vulkan graphics and compute API over Apple's Metal graphics framework, enabling Vulkan applications to run on macOS, iOS and tvOS.
Apache License 2.0

Need to correctly handle vertex attribute alignment (vec3, potentially more) #2182

Open aitor-lunarg opened 3 months ago

aitor-lunarg commented 3 months ago

Affected CTS tests:

dEQP-VK.pipeline.monolithic.vertex_input.multiple_attributes.binding_one_to_many.attributes.vec3.mat2.mat3

Vulkan states (https://registry.khronos.org/vulkan/specs/1.3-extensions/html/vkspec.html#fxvertex-input-extraction): "If format is a packed format, attribAddress must be a multiple of the size in bytes of the whole attribute data type as described in Packed Formats. Otherwise, attribAddress must be a multiple of the size in bytes of the component type indicated by format (see Formats)."
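As a rough illustration of the rule quoted above, here is a minimal sketch of the check. The function name and parameters are hypothetical and do not come from Vulkan or MoltenVK; the caller is assumed to supply the component size and whole-attribute size for the format.

```cpp
#include <cstdint>

// Hypothetical helper, not part of Vulkan or MoltenVK.
// Packed formats must be aligned to the size of the whole attribute;
// all other formats only to the size of a single component.
bool attribAddressIsValid(std::uint64_t attribAddress,
                          std::uint32_t componentSize,
                          std::uint32_t wholeAttributeSize,
                          bool isPackedFormat) {
    const std::uint32_t required = isPackedFormat ? wholeAttributeSize : componentSize;
    return attribAddress % required == 0;
}
```

For VK_FORMAT_R16G16B16_SINT (2-byte components, not a packed format), an address of 6 satisfies this rule even though it is not a multiple of 4.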

However, Metal has tighter requirements (Metal Shading Language Specification, section 2.2).

The CTS test caught issues with vec3 attributes. For example, a tightly packed vec3 would need to be read as packed_float3 rather than float3 in the generated MSL, because float3 requires 16-byte alignment and the buffer does not guarantee it. However, packed_float3 is not allowed as a vertex attribute type, so an alternative solution needs to be found. I am unsure whether SPIRV-Cross changes are needed.

Ideally, we should also check for other affected cases that CTS testing does not catch.

Potentially related #1609

aitor-lunarg commented 2 months ago

More detailed information on the issue:

The test allocates a buffer to store 2 attributes that it will use later when drawing. The first attribute's type is short3 (in Vulkan terms, VK_FORMAT_R16G16B16_SINT). The second attribute's type is not really relevant to reproducing the issue as long as it is also a vec3 type. Under these conditions the test always fails.

Memory layout in the vertex buffer:

| a0.x | a0.y | a0.z | a1.x | a1.y | a1.z |

However, as mentioned in the first comment, Metal expects the following layout for vec3 formats, and this layout makes the test pass:

| a0.x | a0.y | a0.z | pad | a1.x | a1.y | a1.z |

(Note there is no padding after a1; only the first attribute is padded.)
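The two layouts can be sketched as plain structs to make the byte offsets concrete. This is only an illustration: the struct names are hypothetical, and it assumes both attributes use 2-byte components (the comment only fixes the first attribute as short3).

```cpp
#include <cstddef>
#include <cstdint>

// Layout written by the test: a1 starts right after a0.
struct TightVertex {
    std::int16_t a0[3];  // bytes 0..5
    std::int16_t a1[3];  // bytes 6..11
};

// Layout that makes the test pass: one 2-byte pad after a0 only.
struct PaddedVertex {
    std::int16_t a0[3];  // bytes 0..5
    std::int16_t pad;    // bytes 6..7
    std::int16_t a1[3];  // bytes 8..13
};

static_assert(offsetof(TightVertex, a1) == 6, "a1 is only 2-byte aligned");
static_assert(offsetof(PaddedVertex, a1) == 8, "a1 starts on a 4-byte boundary");
```

In the tight layout, a1 begins at byte 6, which is valid for Vulkan's per-component rule but is not a multiple of 4.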

Changes other than padding the buffer that also make the test pass are:

This raises the question of why those 2 changes make the test work (I haven't found an answer yet).

The way to fix this in MoltenVK would be to add the padding ourselves whenever the user does not. This would require tracking buffers to detect when they are written to, then reallocating them and filling in the padding before use. The extra work would cost performance.
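To give a sense of the repacking step alone, here is a minimal CPU-side sketch. It is not MoltenVK code: the function name is hypothetical, it assumes every vertex holds exactly two short3 attributes, and it leaves out the buffer tracking and reallocation described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: copies tightly packed vertices (a0.xyz, a1.xyz)
// into a layout with one zeroed 2-byte pad after a0 in each vertex.
std::vector<std::int16_t> padFirstAttribute(const std::vector<std::int16_t>& tight) {
    constexpr std::size_t kTightPerVertex = 6;   // 3 + 3 components
    constexpr std::size_t kPaddedPerVertex = 7;  // 3 + pad + 3 components
    const std::size_t vertexCount = tight.size() / kTightPerVertex;

    std::vector<std::int16_t> padded(vertexCount * kPaddedPerVertex, 0);
    for (std::size_t v = 0; v < vertexCount; ++v) {
        const std::size_t src = v * kTightPerVertex;
        const std::size_t dst = v * kPaddedPerVertex;
        for (std::size_t i = 0; i < 3; ++i) {
            padded[dst + i] = tight[src + i];          // a0
            padded[dst + 4 + i] = tight[src + 3 + i];  // a1, after the pad slot
        }
    }
    return padded;
}
```

Even in this simplified form, every write to the source buffer would need a full copy, which is where the performance cost comes from.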

Additionally, we could create a similar flag, or modify the existing flag VkPhysicalDevicePortabilitySubsetPropertiesKHR::minVertexInputBindingStrideAlignment, so that it applies to the attributes instead of the stride. This would let users know that they need to align the start of each attribute to 4 bytes, meaning both the starting offset and the starting offset plus the stride.
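On the application side, the requirement described above could be checked roughly like this. The function is a hypothetical sketch, and the default of 4 bytes is taken from this comment, not from any existing Vulkan property.

```cpp
#include <cstdint>

// Hypothetical check: the attribute must start on an aligned boundary
// in the first vertex (offset) and in the next one (offset + stride).
bool attributeStartIsAligned(std::uint32_t offset,
                             std::uint32_t stride,
                             std::uint32_t alignment = 4) {
    return offset % alignment == 0 && (offset + stride) % alignment == 0;
}
```

For example, an attribute at offset 6 with a stride of 12 would be rejected, while offset 8 with a stride of 16 would be accepted.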

I personally would like MoltenVK to take the first approach in the long run, but it may be worth pursuing the second option in the short term and coming back to this issue once there's more time to invest here.

Thoughts, @billhollings?