KhronosGroup / Vulkan-Portability

Apache License 2.0
40 stars 4 forks source link

Add VkPhysicalDevicePortabilitySubsetFeaturesKHR:: shaderFragmentDepth32FloatMultisample capability #44

Closed billhollings closed 1 year ago

billhollings commented 2 years ago

Some CTS tests fail when the shader sets a Depth32 depth value directly on multisample textures.

This might be fixable with effort within the implementations.

In the meantime, these use cases can be disabled in CTS and reported as Not Supported, by adding a VkPhysicalDevicePortabilitySubsetFeaturesKHR:: shaderFragmentDepth32FloatMultisample capability.

This affects 3 CTS tests:

dEQP-VK.glsl.builtin_var.fragdepth.line_list_d32_sfloat_multisample_4
dEQP-VK.glsl.builtin_var.fragdepth.triangle_list_d32_sfloat_multisample_2
dEQP-VK.glsl.builtin_var.fragdepth.triangle_list_d32_sfloat_multisample_4
  Fail
danginsburg commented 2 years ago

Can you please expand on what these tests do? Is it writing gl_FragDepth in the fragment shader where the depth buffer is MSAA D32? Why doesn't this work in Metal?

billhollings commented 2 years ago

Can you please expand on what these tests do? Is it writing gl_FragDepth in the fragment shader where the depth buffer is MSAA D32? Why doesn't this work in Metal?

Will indeed do that soon.

Right now, I'm doing a quick and dirty triage of the CTS failures to catalog additional flags that need to be added to VkPhysicalDevicePortabilitySubsetFeaturesKHR to offset the current CTS failure state of MoltenVK.

As I mention in each of these issues, it may turn out that some of these issues can be fixed within MoltenVK and SPRIV-Cross. And the intention is once the catalog is complete, we'll dive into a deeper triage and try to determine exactly that. As you hint at (and I expect), this issue is likely in that category.

cdavis5e commented 1 year ago

@billhollings So what exactly about this leads you to believe that this can be fixed on our end with effort? I am currently investigating this.

cdavis5e commented 1 year ago

I think I've figured out what's wrong.

The tests work in two passes. The first pass actually draws the primitive and uses FragDepth to set the fragment depth. The fragment shader discards any fragment whose x coordinate is 4. (This is important.) It also sets pixels in a mask image with a pixel for each sample--the idea is that only the samples where the mask has a 1 should have nonzero depth. The extra pixels are added to the width of the image. One pixel is set for each invocation that reaches that point in the shader function.

The second pass runs over the unresolved multisampled depth image and copies its values out to an image of the same dimensions and layout as the mask. The mask and the result are then copied out to buffers, and the depth buffer is compared with its expected values subject to the mask.

The problem is that even though the fragment was discarded when x == 4, the first fragment shader is still writing a 1 to the mask if there were a fragment to begin with there. This is a violation of the SPIR-V and Vulkan specs, which say that all memory stores following an OpKill or an OpDemoteToHelperInvocation are discarded, regardless of whether or not the shader continues execution. (See the SPIR-V spec here and here, and Vulkan §9.3.1 and §9.26.)

The solution, then, is to have SPIRV-Cross make sure no writes happen after an OpKill or an OpDemoteToHelperInvocation--or for that matter, an OpTerminateInvocation. I really don't like that we have to do this. But Apple's broken implementation of demotion semantics for discard_fragment() has forced our hand. Perhaps we should also file a feedback report for this.

billhollings commented 1 year ago

Wow! Impressive sleuthing! 🕵🏼‍♂️

the SPIR-V and Vulkan specs, which say that all memory stores following an OpKill or an OpDemoteToHelperInvocation are discarded, regardless of whether or not the shader continues execution.

Isn't this also the behavior expected with Metal? According to the MSL doc, writes before discard_fragment() are kept, but writes after are discarded, the same as Vulkan.

cdavis5e commented 1 year ago

Sorry for the delay; I didn't see this comment until now.

the SPIR-V and Vulkan specs, which say that all memory stores following an OpKill or an OpDemoteToHelperInvocation are discarded, regardless of whether or not the shader continues execution.

Isn't this also the behavior expected with Metal? According to the MSL doc, writes before discard_fragment() are kept, but writes after are discarded, the same as Vulkan.

I know from my experience with implementing Windows APIs for Wine that what the docs say and how the software actually behaves are two different things.

I do believe this is a bug in Metal. For this reason, I've filed FB11736212. In the meantime, I have a workaround based on my described solution, which I'll be upstreaming shortly.

cdavis5e commented 1 year ago

This has been fixed in SPIRV-Cross.