Closed billhollings closed 3 months ago
Alternatively, we could issue 4 different fetches for this format of a call, and be conformant without an extra flag.
It's easy enough to scale the texel coordinates by the size of the image. The problem is that we have to deal with the sampler
. How do we know what to return if some of the four texels are outside the image--especially if we have a dynamic (as opposed to a static immutable) sampler?
@cdavis5e I think there is a misunderstanding. We don't need to scale anything. Just instead of calling Gather
with ConstOffsets once, we can call Sample
with ConstOffset
4 times.
Yeah. I thought about using 4 samples...but figured the devil would appear in the details (as in @cdavis5e examples), and it all sounds like shader heroics for a capability that seems to have limited application.
With that in mind, adding flags in situations like this seems appropriate, IMHO. The flag can always be enabled if someone successfully engages those heroics on a platform-by-platform basis.
We don't need to scale anything
We'd need the scaling, because sampling occurs in normalized coordinates, but figuring out which four pixels surrounds that point requires unnormalized coordinates.
Ok, yes, I see. Out of curiosity, do users request to use the "Offset" path on Gather when running on Metal?
Out of curiosity, do users request to use the "Offset" path on Gather when running on Metal?
Are you hinting as to whether we should just disable VkPhysicalDeviceFeatures::shaderImageGatherExtended
and be done with both extended capabilities?
a capability that seems to have limited application
do users request to use the "Offset" path
Indeed! Somewhere, I would love to find compiled a list of why graphic features and capabilities exist, and in what scenarios they are used in the real world.
For example, in this case, who comes up with the need to gather pixels in a single sample command in the first place, and then further determines a need to offset the pixels deriving those components in four independent directions using an array of constant values? 🤔🤷🏻♂️
For example, in this case, who comes up with the need to gather pixels in a single sample command in the first place,
I have needed to do this. It's what you need if you want to do linear filtering on a paletted texture--you want the four texels, but you don't want to blend them yet until after you do the palette lookup. So you use a Gather
to get the four texels, which in this case will be indices into the palette; then you look up the palette entries with the four texels; then you can apply the linear filtering formulas to the four actual colors to get the blended color.
and then further determines a need to offset the pixels deriving those components in four independent directions using an array of constant values?
I have not needed to do this.
Thank you both for clarification!
In the scenario @cdavis5e describes, wouldn't you use ConstOffset
modifier instead of Offset
? The const offset is much better supported and is probably faster.
@cdavis5e I think there is a misunderstanding. We don't need to scale anything. Just instead of calling
Gather
with ConstOffsets once, we can callSample
withConstOffset
4 times.
You said "fetch" here. So naturally, I assumed that, since texelFetch()
(Load()
in MSL) uses unnormalized coordinates, we would have to scale. Using Sample()
has its own problems, particularly if the sampler uses linear filtering. One ostensible reason you'd want to use a Gather()
in the first place is to perform the linear filtering yourself, like in my example above.
Thank you both for clarification! In the scenario @cdavis5e describes, wouldn't you use
ConstOffset
modifier instead ofOffset
? The const offset is much better supported and is probably faster.
MSL makes no such distinction between constant and non-constant offsets.
Another problem with using Load()
that I forgot to mention is that of mip-level selection. How do we know which mip to load from? It depends on the mip filter and the du/dx and dv/dy values. You can get du/dx and dv/dy easily enough, but how do you know what mip filter should be used?
How do we know which mip to load from?
My reading is that the gather operation operates only on the base mip level.
How do we know which mip to load from?
My reading is that the gather operation operates only on the base mip level.
Ah, you are correct, sir.
Recommend we add this flag.
Suggestion from @alyssarosenzweig on VK_KHR_portability_subset_metal
dev PR, to modify shader to use 4 gather operations:
This is straightforward to lower, as most native drivers do? https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/compiler/nir/nir_lower_tex.c?ref_type=heads#L1253
SPIRV-Cross PR #2325 adds support for image gather with ConstOffsets.
The existing
VkPhysicalDeviceFeatures::shaderImageGatherExtended
capability controls two seemingly unrelated capabilities related to the use of offsets in the image gather operation:Offset
- single calculatedint2
offset applied to the sampler coordinates for all gather componentsConstOffsets
-int2[4]
array of constant values, one value applied to each gather component individuallyMetal natively supports the first capability but not the second.
Add
VkPhysicalDevicePortabilitySubsetFeaturesKHR::shaderImageGatherExtendedConstOffsets
capability to indicate support for the second. Under current Metal implementations, it would be disabled.This affects 136 CTS tests:
dEQP-VK.glsl.texture_gather.offsets.*