KhronosGroup / KTX-Software

KTX (Khronos Texture) Library and Tools

Improve handling of greyscale PNG inputs #375

Closed lexaknyazev closed 3 years ago

lexaknyazev commented 3 years ago

Currently, toktx produces RRR and RRR + GGG slices for Greyscale and Greyscale with alpha PNG sources respectively. Such a mapping should be opt-in for the following two reasons.

Consistency

PNG defines greyscale as (3.1.21):

image representation in which each pixel is defined by a single sample of colour information, representing overall luminance (on a scale from black to white), and optionally an alpha sample (in which case it is called greyscale with alpha).

In GPU texture terms, luminance is usually interpreted as (OpenGL ES 3.2, 8.4.2.4):

If the format is LUMINANCE, then each group of one element is converted to a group of R, G, and B (three) elements by copying the original single element into each of the three new elements. If the format is LUMINANCE_ALPHA, then each group of two elements is converted to a group of R, G, B, and A (four) elements by copying the first original element into each of the first three new elements and copying the second original element to the A (fourth) new element.

The fact that a particular PNG uses Greyscale or Greyscale with alpha as its internal pixel format cannot be a definitive marker of a single- or dual-channel texture in the GPU sense. Therefore, the current behavior of toktx is too speculative.
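The expansion rule quoted from the OpenGL ES specification can be sketched in a few lines. This is only an illustration of the quoted text, not toktx code; the function name `expand_luminance` and the flat-list input layout are assumptions made for the example.

```python
def expand_luminance(samples, has_alpha=False):
    """Expand L (or interleaved LA) samples to RGB (or RGBA) tuples.

    Hypothetical helper: `samples` is a flat list of integers, one per
    sample, interleaved as L,A,L,A,... when `has_alpha` is True.
    """
    step = 2 if has_alpha else 1
    if len(samples) % step:
        raise ValueError("sample count does not match the pixel layout")

    pixels = []
    for i in range(0, len(samples), step):
        lum = samples[i]
        if has_alpha:
            # LUMINANCE_ALPHA: copy L into R, G, B and keep A as the fourth element.
            pixels.append((lum, lum, lum, samples[i + 1]))
        else:
            # LUMINANCE: copy L into R, G, B.
            pixels.append((lum, lum, lum))
    return pixels
```

Under this reading, a greyscale PNG sample never ends up as "red only"; it is replicated into all three colour channels.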

There are two possible solutions for correctly handling greyscale PNG inputs:

While the second option is arguably more semantically correct, it would make KTX assets much less portable, so the first one should be the default behavior for practical reasons (there's no encoding difference anyway).

Transfer function support

PNG defaults to sRGB (or gamma 2.2) transfer. Therefore, KTX-Software sets DFD's transfer function to KHR_DF_TRANSFER_SRGB. Platform support for Red and Red-Green textures with non-linear transfer is far from convincing:

Without explicit swizzling metadata, this issue amplifies the first one.
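To illustrate what is at stake with the transfer function: when a platform offers only a linear Red or Red-Green format, the sRGB-encoded sample is returned undecoded and something else has to apply the decoding. A minimal sketch of the standard sRGB decoding for one normalized sample follows; the function name `srgb_to_linear` is hypothetical and nothing here is taken from KTX-Software.

```python
def srgb_to_linear(c):
    """Decode one sRGB-encoded sample in [0.0, 1.0] to linear light."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("sample must be normalized to [0.0, 1.0]")
    if c <= 0.04045:
        # Linear segment near black.
        return c / 12.92
    # Power-law segment for the rest of the range.
    return ((c + 0.055) / 1.055) ** 2.4
```

A mid-grey encoded value of 0.5 decodes to roughly 0.214, so skipping this step visibly changes the result.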

lexaknyazev commented 3 years ago

Here are six test inputs for validation.

MarkCallow commented 3 years ago

A couple of points while I give this more thought:

  1. LUMINANCE format textures are deprecated in OpenGL and non-existent in Vulkan. Apps wishing to have luminance have to either apply a swizzle or do the swizzling in a shader.
  2. toktx could add swizzle metadata in the case of PNG luminance and luminance+alpha inputs but that doesn't help the universal texture case much because swizzling is not universally supported.
  3. But toktx is creating a luminance texture by copying the single element into the 3 new elements. Given no. 1, this seems wrong for uncompressed textures, but what is the choice for universal textures or block-compressed formats except the R & RG variants? I'm not sure why labeling the slices as RGB is better.
  4. The RRR and GGG channel ids are intended to provide helpful information for selection of a transcode target format. When matching targets are not available, the application will have to work around it by, for example, creating 2 textures, one with R and one with G.

lexaknyazev commented 3 years ago

LUMINANCE format textures are deprecated in OpenGL and non-existent in Vulkan. Apps wishing to have luminance have to either apply a swizzle or do the swizzling in a shader.

Exactly. PNG's greyscale format semantically maps to the deprecated luminance, not red.

lexaknyazev commented 3 years ago

I'm not sure why labeling the slices as RGB is better.

Let's go through two practical examples.

A source greyscale PNG image is used as e.g. the base color texture. It's possible to open it in any image viewer and see that it is indeed greyscale. For rendering purposes, the image decoder unpacks that single channel as RRR1, the GPU texture has R8G8B8(A8)_SRGB format, and shaders use .rgb accessors as usual. At any point in time, that PNG image could be replaced with a colorful version with zero implementation changes.

What happens when the same grayscale PNG source gets converted to KTX / ETC1S (assuming that swizzling metadata is forbidden)?

MarkCallow commented 3 years ago

I need a concrete proposal to act on. As noted, there are 2 ways to handle "luminance" inputs:

  1. Make a 1-component texture (2 in the case of luminance-alpha) and add swizzle metadata.
  2. Make a 3-component texture (4 for L-A), i.e. swizzle the data before encoding the texture.
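The two options listed above can be compared with a small sketch. Everything in it is hypothetical: the dictionary layout, the function names, and the swizzle strings `rrr1` and `rrrg` are assumptions chosen for illustration, not toktx behavior. The `sample` helper models what a shader would read, assuming the consumer honors the swizzle.

```python
def option_1(samples, has_alpha):
    """Keep 1 (L) or 2 (LA) components and record a swizzle next to the data."""
    n = 2 if has_alpha else 1
    texels = [tuple(samples[i:i + n]) for i in range(0, len(samples), n)]
    return {"texels": texels, "swizzle": "rrrg" if has_alpha else "rrr1"}


def option_2(samples, has_alpha):
    """Expand to 3 (RGB) or 4 (RGBA) components before encoding; no swizzle."""
    n = 2 if has_alpha else 1
    texels = []
    for i in range(0, len(samples), n):
        lum = samples[i]
        texels.append((lum, lum, lum, samples[i + 1]) if has_alpha else (lum, lum, lum))
    return {"texels": texels, "swizzle": None}


def sample(texture, index, one=255):
    """Return the RGBA value read from one texel, with the swizzle applied."""
    texel = texture["texels"][index]
    colour = list(texel[:3])
    colour += [0] * (3 - len(colour))          # missing colour components read as 0
    alpha = texel[3] if len(texel) == 4 else one  # missing alpha reads as 1
    source = {"r": colour[0], "g": colour[1], "b": colour[2], "a": alpha, "0": 0, "1": one}
    swizzle = texture["swizzle"] or "rgba"
    return tuple(source[c] for c in swizzle)
```

Both options yield the same sampled values when the swizzle is applied; they differ in what a consumer sees when swizzle metadata is ignored or unsupported, which is the portability concern raised in this thread.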

Questions are:

My opinion is that the default should be 1 and should be the same for all formats. I have no good suggestions for naming at this point, --3comp_luminance? --expand_luminance?

lexaknyazev commented 3 years ago

Which should be the default?

I'd prefer the second option because:

Should the default be dependent on the output format, e.g., 1 for uncompressed formats and 2 for all others?

Taking sRGB transfer into account, the complexity of this decision tree would get out of control.

What should be the name of the option?

<dt>--la_to_rg</dt>
<dd>Treat grayscale and grayscale-alpha inputs as 1-component red and 2-component red-green textures respectively.</dd>

Grayscale data may appear in all supported input formats: JPEG, PNG, and Netpbm.

MarkCallow commented 3 years ago

How to handle PGM and PAM files with TUPLETYPE "GRAYSCALE" or "GRAYSCALE_ALPHA"? The PAM specification does not describe it as luminance. The tool for converting from pgm to ppm (gray to color) supports all kinds of mappings from the grey values to colors so it does not provide a definitive answer.

Given the lack of a clear statement, the most logical thing is to handle them consistently with .png greyscale files. Thoughts?

lexaknyazev commented 3 years ago

Yes, pgmtoppm does not have a default mapping, so let's align PGM inputs with PNG.

MarkCallow commented 3 years ago

Fixed in PR #387 which is included in Release Candidate 1 (v4.0.0-rc1).