KHR_texture_basisu: should more channel types be supported?

ghost commented 3 years ago

I've been using the toktx tool to try to make compatible textures for use with the KHR_texture_basisu extension. I noticed sometimes toktx will produce a single channel RRR ktx2 texture. According to the spec,

RGB textures MUST use a single DFD channel with value KHR_DF_CHANNEL_ETC1S_RGB. RGBA textures MUST use two DFD channels with values KHR_DF_CHANNEL_ETC1S_RGB and KHR_DF_CHANNEL_ETC1S_AAA.

It seems like KHR_DF_CHANNEL_ETC1S_RRR could be a reasonable choice for a texture. Why isn't this allowed? Is there an argument passed to toktx that could resolve this?

donmccurdy commented 3 years ago

I noticed sometimes toktx will produce a single channel RRR ktx2 texture

Which type of texture are you getting RRR for? It could make sense for an occlusion texture (without metal/rough channels). I'm not sure if that provides an advantage in quality and compression with Basis.

Is there an argument passed to toktx that could resolve this?

/cc @MarkCallow, who might know about this.

lexaknyazev commented 3 years ago

Why isn't this allowed?

Great question!

ETC1S_RRR, ETC1S_GGG, UASTC_RRR and UASTC_RRRG models are supported in KTX as a way to preserve some extra information about the source input. Coupled with KTXswizzle metadata, such information could be used at runtime for more advanced processing.

Let's take a single-channel (red only) texture for example. Sampling from an uncompressed format such as GL_R8 would yield r001 values. Compressing such data as-is with ETC1S would add extra noise to the green and blue channels while distorting the red channel, due to the codec's implementation details. To get good results from the ETC1S codec, data should be swizzled to rrr1 before compression. A KTXswizzle metadata value of r001 could be added to the compressed file to fully preserve the original intent.
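To make the swizzling step concrete, here is a minimal Python sketch, assuming 8-bit pixels stored as (r, g, b, a) tuples. The helper names (`swizzle_pixel`, `expand_r8`, `prepare_for_etc1s`) are hypothetical and not part of toktx or any KTX library; the sketch only shows what "r001" and "rrr1" mean for the data.

```python
def swizzle_pixel(pixel, pattern):
    """Apply a four-character swizzle such as "rrr1" to an (r, g, b, a) tuple.

    Assumption: 8-bit channels, so "0" maps to 0 and "1" maps to 255.
    """
    source = {
        "r": pixel[0], "g": pixel[1], "b": pixel[2], "a": pixel[3],
        "0": 0, "1": 255,
    }
    return tuple(source[c] for c in pattern)


def expand_r8(values):
    """Mimic sampling a GL_R8 texture: each red value v reads as (v, 0, 0, 255)."""
    return [(v, 0, 0, 255) for v in values]


def prepare_for_etc1s(values):
    """Replicate red into green and blue (rrr1) before handing data to the encoder."""
    return [swizzle_pixel(p, "rrr1") for p in expand_r8(values)]


if __name__ == "__main__":
    encoded_input = prepare_for_etc1s([10, 200])
    print(encoded_input)
    # Applying the stored swizzle "r001" recovers the original sampling result.
    print([swizzle_pixel(p, "r001") for p in encoded_input])
```

A KTXswizzle value of "r001" stored next to the compressed data would then let a reader undo the replication, as the second print illustrates.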

At runtime, an application could have several non-trivial options for such compressed input:

transcode to any compressed RGB format as-is (assuming that green and blue channels are never read);
transcode to any compressed RGB format and set sampler swizzling state (if supported);
transcode to any compressed RGB format and patch shaders (if they rely on blue and green channels being zeros);
transcode to EAC_R11 or BC4 (assuming r001 swizzling metadata).
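The options above could be turned into a selection routine along these lines. This is only a sketch under stated assumptions: the function name `choose_target`, the format strings, and the priority order are illustrative and do not come from any transcoder API. The order follows the caveats discussed in this comment (ETC1 is a lossless target for ETC1S, and BC4 is preferable to BC1 for single-channel data).

```python
def choose_target(supported_formats, sampler_swizzle_supported):
    """Pick a transcode target for ETC1S_RRR data tagged with KTXswizzle "r001".

    Returns a (format, strategy) tuple, or None if nothing suitable is supported.
    All names here are hypothetical and only illustrate the decision.
    """
    supported = set(supported_formats)

    # How to hide the replicated green/blue channels on an RGB target.
    if sampler_swizzle_supported:
        rgb_strategy = "set sampler swizzle to r001"
    else:
        rgb_strategy = "read only the red channel in shaders"

    if "ETC1" in supported:
        # ETC1S to ETC1 is lossless, so prefer it over EAC_R11.
        return ("ETC1", rgb_strategy)
    if "BC4" in supported:
        # For single-channel data, BC4 is a better path than BC1.
        return ("BC4", "single-channel format, no swizzle needed")
    if "EAC_R11" in supported:
        return ("EAC_R11", "single-channel format, no swizzle needed")
    if "BC1" in supported:
        return ("BC1", rgb_strategy)
    return None


if __name__ == "__main__":
    print(choose_target(["ETC1", "EAC_R11"], True))
    print(choose_target(["BC1", "BC4"], False))
```

A real renderer would also weigh memory footprint and shader constraints; the fixed priority list here is a simplification.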

Yet another caveat here is that while ETC1S to ETC1 transcoding is lossless, any other conversion path is not. So it would be better (and also faster on the CPU side) to use ETC1 with non-default swizzling than to transcode to EAC_R11. On the other hand, ETC1S to BC4 is a better path than ETC1S to BC1 for single-channel data.

Since glTF and its material extensions always define fixed channel mapping and not all platforms support channel swizzling, these models were omitted from the extension for simplicity.

To fix the issue, we could think about relaxing the restriction a bit, namely:

Allow the RRR model with r001 swizzling for cases like occlusion or transmission textures. This could improve GPU memory usage in some cases. Note that glTF renderers are already able to detect such optimizations based on the material texture slot that uses the texture.
Allow the RRR model with default swizzling as an alias to RGB for any grayscale textures. In that case, toktx wouldn't need any changes. Otherwise, we'd need to add a new command-line option to toktx to enforce the RGB model for grayscale inputs.

WDYT?

ghost commented 3 years ago

Which type of texture are you getting RRR for?

From doing some testing, I think calling toktx --t2 --bcmp on a grey 1x1 image will trigger this. The content could probably drop the image and modulate the material factors.

MarkCallow commented 3 years ago

It seems like KHR_DF_CHANNEL_ETC1S_RRR could be a reasonable choice for a texture.

Do you mean that from a 3- or 4-component input image you want to extract one of the components and make a 1-component texture? KHR_DF_CHANNEL_ETC1S_RRR is just an artifact of making a one-component texture.

Why isn't this allowed?

I've been intending to add a general-purpose swizzle argument to toktx to allow mapping any components of the input images to any components of the output, but other things have had higher priority than this feature, which is somewhat complex to implement.

In the meantime you can run the image through something like ImageMagick first to extract the components of interest into a new input image.

KHR_DF_CHANNEL_ETC1S_RRR is intended to indicate that the pre-compression image had a single component. The intended use is that an application either transcodes to a single-component format (EAC_R11 or BC4) or, if that is not possible, reads only the R component of a 3-component transcode target texture, either via swizzle or in-shader component selection.

ghost commented 3 years ago

I think I may have misinterpreted the spec. It states the requirements for RGB and RGBA formats, but doesn't mention R or RG. Is it implicit that all other incoming basis compressed pixel formats for 1 or 2 channels are permitted?

MarkCallow commented 3 years ago

It states the requirements for RGB and RGBA formats, but doesn't mention R or RG. Is it implicit that all other incoming basis compressed pixel formats for 1 or 2 channels are permitted?

Where is this in the spec, @stanlo? The sections on UASTC and ETC1S explicitly state what to do when encoding 1, 2, 3 & 4 component textures.

ghost commented 3 years ago

Could you link me to the part of the spec which does that? You're referring to the KHR_texture_basisu spec, not the KTX2 spec, right?

MarkCallow commented 3 years ago

@stanlo, I was referring to the KTX2 spec. Sorry. It is in Section 3.10.2, "Providing additional information". If KHR_texture_basisu doesn't cover R & RG already, it should.

MarkCallow commented 3 years ago

transcode to EAC_R11 or BC4 (assuming r001 swizzling metadata).

I don't think swizzling metadata is needed here. The KTX2 spec. makes it clear that KHR_DF_CHANNEL_ETC1S_RRR indicates a one component texture not a 3 component texture that happens to have the same value in all 3 channels.

ghost commented 3 years ago

If KHR_texture_basisu doesn't cover R & RG already, it should.

Yes, the KHR_texture_basisu extension that is part of the glTF2 spec explicitly covers RGB and RGBA in what appears to be the same manner as the KTX2 specification, but unlike the KTX2 specification, it does not cover R and RG. I originally believed this to mean that any 1- or 2-channel KTX2 files, which are allowed by the KTX2 specification, are invalid glTF for some reason.
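The gap can be made concrete with a small check that encodes only the extension wording quoted at the top of this thread (RGB uses one DFD channel, RGBA uses two). This is a hypothetical Python sketch, not a real validator: the function name is made up, channel IDs are represented as strings, and only the ETC1S rule quoted here is covered.

```python
# Channel layouts permitted by the quoted KHR_texture_basisu text for ETC1S.
# Assumption: channels are identified by their KHR_DF_CHANNEL_* names as strings.
ALLOWED_BY_QUOTED_TEXT = {
    ("KHR_DF_CHANNEL_ETC1S_RGB",),
    ("KHR_DF_CHANNEL_ETC1S_RGB", "KHR_DF_CHANNEL_ETC1S_AAA"),
}


def allowed_by_quoted_text(dfd_channels):
    """Return True if the DFD channel list matches the RGB or RGBA rule as quoted."""
    return tuple(dfd_channels) in ALLOWED_BY_QUOTED_TEXT


if __name__ == "__main__":
    print(allowed_by_quoted_text(["KHR_DF_CHANNEL_ETC1S_RGB"]))
    # A single-channel file as produced by toktx for grayscale input:
    print(allowed_by_quoted_text(["KHR_DF_CHANNEL_ETC1S_RRR"]))
```

Under a literal reading, the RRR file fails this check even though the KTX2 specification permits it, which is the mismatch described above.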

lexaknyazev commented 3 years ago

The update for R and RG textures is on the way.

Nevertheless, it does not affect textures that are semantically RGB (e.g. base color or emission) even if the original source image happens to be grayscale. KTX-Software may need some tuning to ensure that.

KhronosGroup / glTF

KHR_texture_basisu: should more channel types be supported? #1934