KhronosGroup / Vulkan-Portability

Apache License 2.0
40 stars 4 forks source link

Add VkPhysicalDevicePortabilitySubsetFeaturesKHR::shaderImageGatherExtendedConstOffsets capability #31

Closed billhollings closed 3 months ago

billhollings commented 3 years ago

The existing VkPhysicalDeviceFeatures::shaderImageGatherExtended capability controls two seemingly unrelated capabilities related to the use of offsets in the image gather operation:

Metal natively supports the first capability but not the second.

Add VkPhysicalDevicePortabilitySubsetFeaturesKHR::shaderImageGatherExtendedConstOffsets capability to indicate support for the second. Under current Metal implementations, it would be disabled.

This affects 136 CTS tests: dEQP-VK.glsl.texture_gather.offsets.*

kvark commented 3 years ago

Alternatively, we could issue 4 different fetches for this format of a call, and be conformant without an extra flag.

cdavis5e commented 3 years ago

It's easy enough to scale the texel coordinates by the size of the image. The problem is that we have to deal with the sampler. How do we know what to return if some of the four texels are outside the image--especially if we have a dynamic (as opposed to a static immutable) sampler?

kvark commented 3 years ago

@cdavis5e I think there is a misunderstanding. We don't need to scale anything. Just instead of calling Gather with ConstOffsets once, we can call Sample with ConstOffset 4 times.

billhollings commented 3 years ago

Yeah. I thought about using 4 samples...but figured the devil would appear in the details (as in @cdavis5e examples), and it all sounds like shader heroics for a capability that seems to have limited application.

With that in mind, adding flags in situations like this seems appropriate, IMHO. The flag can always be enabled if someone successfully engages those heroics on a platform-by-platform basis.

billhollings commented 3 years ago

We don't need to scale anything

We'd need the scaling, because sampling occurs in normalized coordinates, but figuring out which four pixels surrounds that point requires unnormalized coordinates.

kvark commented 3 years ago

Ok, yes, I see. Out of curiosity, do users request to use the "Offset" path on Gather when running on Metal?

billhollings commented 3 years ago

Out of curiosity, do users request to use the "Offset" path on Gather when running on Metal?

Are you hinting as to whether we should just disable VkPhysicalDeviceFeatures::shaderImageGatherExtended and be done with both extended capabilities?

billhollings commented 3 years ago

a capability that seems to have limited application

do users request to use the "Offset" path

Indeed! Somewhere, I would love to find compiled a list of why graphic features and capabilities exist, and in what scenarios they are used in the real world.

For example, in this case, who comes up with the need to gather pixels in a single sample command in the first place, and then further determines a need to offset the pixels deriving those components in four independent directions using an array of constant values? 🤔🤷🏻‍♂️

cdavis5e commented 3 years ago

For example, in this case, who comes up with the need to gather pixels in a single sample command in the first place,

I have needed to do this. It's what you need if you want to do linear filtering on a paletted texture--you want the four texels, but you don't want to blend them yet until after you do the palette lookup. So you use a Gather to get the four texels, which in this case will be indices into the palette; then you look up the palette entries with the four texels; then you can apply the linear filtering formulas to the four actual colors to get the blended color.

and then further determines a need to offset the pixels deriving those components in four independent directions using an array of constant values?

I have not needed to do this.

kvark commented 3 years ago

Thank you both for clarification! In the scenario @cdavis5e describes, wouldn't you use ConstOffset modifier instead of Offset? The const offset is much better supported and is probably faster.

cdavis5e commented 3 years ago

@cdavis5e I think there is a misunderstanding. We don't need to scale anything. Just instead of calling Gather with ConstOffsets once, we can call Sample with ConstOffset 4 times.

You said "fetch" here. So naturally, I assumed that, since texelFetch() (Load() in MSL) uses unnormalized coordinates, we would have to scale. Using Sample() has its own problems, particularly if the sampler uses linear filtering. One ostensible reason you'd want to use a Gather() in the first place is to perform the linear filtering yourself, like in my example above.

Thank you both for clarification! In the scenario @cdavis5e describes, wouldn't you use ConstOffset modifier instead of Offset? The const offset is much better supported and is probably faster.

MSL makes no such distinction between constant and non-constant offsets.

Another problem with using Load() that I forgot to mention is that of mip-level selection. How do we know which mip to load from? It depends on the mip filter and the du/dx and dv/dy values. You can get du/dx and dv/dy easily enough, but how do you know what mip filter should be used?

billhollings commented 3 years ago

How do we know which mip to load from?

My reading is that the gather operation operates only on the base mip level.

cdavis5e commented 3 years ago

How do we know which mip to load from?

My reading is that the gather operation operates only on the base mip level.

Ah, you are correct, sir.

billhollings commented 2 years ago

Recommend we add this flag.

billhollings commented 6 months ago

Suggestion from @alyssarosenzweig on VK_KHR_portability_subset_metal dev PR, to modify shader to use 4 gather operations:

This is straightforward to lower, as most native drivers do? https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/compiler/nir/nir_lower_tex.c?ref_type=heads#L1253

billhollings commented 3 months ago

SPIRV-Cross PR #2325 adds support for image gather with ConstOffsets.