KhronosGroup / Vulkan-Docs

The Vulkan API Specification and related tools

`VkSamplerYcbcrConversion` should document that no transfer function is applied #2356

Open colinmarc opened 2 months ago

colinmarc commented 2 months ago

Hi, I've encountered a confusing area of the spec that I believe could use some clarification.

My understanding is that when a shader samples a YUV semi-planar or multi-planar texture through a VkSamplerYcbcrConversion, the resulting values are generally the nonlinear R', G', and B'. That's because of how YCbCr works: it's an extra encoding step on top of RGB values that were already made nonlinear by some transfer function[^1][^2]. So, for example, if an sRGB picture is encoded in YCbCr using the BT.709 conversion, the values in the shader will be nonlinear sRGB values.
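To make the point above concrete, here is a minimal CPU-side sketch of the BT.709 matrix step for 8-bit narrow-range input. The type and function names are hypothetical, and this is only an approximation of what a conversion configured with VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709 and narrow range computes; it is not taken from the spec or from any implementation. The thing to notice is that the output is still R', G', B': no transfer function is undone anywhere.

```c
#include <stdint.h>

/* Hypothetical type for illustration: nonlinear (gamma-encoded) R'G'B'. */
typedef struct { float r, g, b; } RgbPrime;

/* Sketch of the BT.709 matrix step for 8-bit narrow-range Y'CbCr.
 * Assumption: Y' occupies [16, 235] and Cb/Cr occupy [16, 240].
 * The result is NOT linear light; it is whatever nonlinear encoding
 * the source R'G'B' used. Values are not clamped. */
static RgbPrime ycbcr709_narrow_to_rgb_prime(uint8_t y, uint8_t cb, uint8_t cr)
{
    const float kr = 0.2126f;            /* BT.709 luma weight for R' */
    const float kb = 0.0722f;            /* BT.709 luma weight for B' */
    const float kg = 1.0f - kr - kb;     /* luma weight for G'        */

    /* Range expansion: Y' to [0, 1], Cb/Cr to [-0.5, 0.5]. */
    float yn = ((float)y  - 16.0f)  / 219.0f;
    float pb = ((float)cb - 128.0f) / 224.0f;
    float pr = ((float)cr - 128.0f) / 224.0f;

    RgbPrime out;
    out.r = yn + 2.0f * (1.0f - kr) * pr;
    out.b = yn + 2.0f * (1.0f - kb) * pb;
    /* Solve Y' = kr*R' + kg*G' + kb*B' for G'. */
    out.g = (yn - kr * out.r - kb * out.b) / kg;
    return out;
}
```

Calling `ycbcr709_narrow_to_rgb_prime(235, 128, 128)` gives reference white in R'G'B'. Any linearization would have to be a separate step applied to the result.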

It's not a big problem for the user to do that step themselves in the shader, but it does fly a bit in the face of intuition, given that sampling a texture with an sRGB format always results in linear values - Vulkan does the linearization for the user. Furthermore, even a savvy user might be confused by the fact that the ycbcrModel they passed in, such as VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709, represents only the matrix transform, not the transfer function applied to the underlying color values. This is extra confusing since Rec. 709 (the document) specifies both the transfer function and the matrix transform.

The easiest API for users would add a field to VkSamplerYcbcrConversionCreateInfo to specify the transfer function - defaulting to linear, obviously, for backwards compatibility - so that shaders can operate only on linear values, just as they do with sRGB textures. Failing that, perhaps this could be called out a bit more explicitly in the documentation. For example, the following note could be added to the documentation for ycbcrModel:

Note: this refers to the YCbCr conversion into R', G', and B', not to the transfer function applied to produce those values, despite the transfer function and the YCbCr conversion often having the same name.

Thanks for reading!

[^1]: The reason for this is historical, as discussed in ITU report BT.2246-8: it was easier to dump the nonlinear values out to CRT screens if they were already in the right space.

[^2]: With the exception of the BT.2020-CL conversion, which operates on "constant luminance".

fluppeteer commented 1 month ago

Sorry for the delayed response.

We could certainly add a note to make it more obvious that the transformation applies only to the colour matrix; I'll try to ensure that happens, and apologies for any confusion caused in the meantime.

The lack of transfer function support was because, at least at the time, nobody had dedicated hardware to implement any transfer function other than sRGB (I'm not aware of this having changed). Since any other transfer function would be a software implementation anyway, since developers may wish to take shortcuts (such as ignoring the linear region), and since the step may be unnecessary in the simple case of copying converted data, it was left to the user. Strictly speaking, using the ITU transfer functions on BT.601/709/2020 gives you linear illumination in scene-referred space, but since there is an implicit OOTF, you may or may not want to be filtering in that space anyway (you may want to apply the OOTF and work in display-referred space - or not, depending on what you're doing).

With the increasing support for Vulkan video and the proliferation of HDR, we might start to see more support in the future, so I would not rule out the API getting an update that might include YcCbcCrc (as you mention for "constant luminance"), HLG, and ICtCp/PQ support - and potentially also HDR still formats (now that HEIF is popular). I wouldn't hold your breath, but when and if that comes and people have hardware support, we'd certainly want to make the transfer function configurable; I'm a bit wary of updating the API to include any transfer function without adding full flexibility, and I suspect there won't be wider support until hardware is widespread.

If it helps, our colleagues at Qualcomm point out the existence of VK_QCOM_ycbcr_degamma, which applies the sRGB transfer function to make the image... less nonlinear. They note that this is also not necessarily doing the transfer function in "the right place" relative to the filtering, but it can save some shader work if the approximation is good enough.

Thanks for your interest in this area and for the feedback!