KhronosGroup / glTF

glTF – Runtime 3D Asset Delivery

glTF 2.0 Texture Formats #835

Closed by lexaknyazev 6 years ago

lexaknyazev commented 7 years ago

glTF 2.0, being a runtime API-agnostic format, must have a robust and extensible texture handling framework.

Right now, glTF has a very simple image object for referencing images in web-browser-compatible codecs (such as JPG or PNG) and a texture object for specifying the desired GPU representation. glTF 1.0 doesn't provide any standard way to use native GPU formats (either compressed or uncompressed). It also lacks support for more advanced texture usage patterns, such as cubemaps or MIP levels.

There was an attempt (#739) to refactor the image and texture objects in a way that would enable the aforementioned use cases; however, that proposal came well before the glTF 2.0 API-neutrality strategy change.

Remember, we don't want to introduce any breaking changes after the glTF 2.0 release, so such core functionality must be as future-proof as possible. While we don't need to enable every possible format and feature now, we must be sure that future changes can be made in a non-destructive way.


State of cross-API formats support

GPU formats

Here's an overview of GPU texture formats that could be supported across different APIs (OpenGL ES 2.0/3.0, OpenGL 4.5, Vulkan, Metal, D3D11/12). Keep in mind that in some cases API support doesn't guarantee hardware support. OpenGL ES 2.0 supports very few of them, so it's included only for reference.

Uncompressed (byte-aligned) formats

The following table contains a matched list of byte-aligned GPU formats. Formats exclusive to a single API are not listed.

[Image: table of matched byte-aligned GPU formats]

Packed formats

The following table contains a matched list of packed GPU formats.

[Image: table of matched packed GPU formats]

Compressed formats

The following table contains a matched list of compressed GPU formats. Actual support varies by both API and hardware. Note that the ETC1 format isn't listed there; however, ETC2-enabled systems support it.

[Image: table of matched compressed GPU formats]

Web formats

These formats are universally supported in web browsers, but supporting them in mobile or embedded environments could be inefficient. They require client-side decompression, which costs client RAM and CPU cycles, and the GPU receives uncompressed data. A client can recompress to a GPU-friendly format, but that means even more processing.

BMP

Many versions of the format exist. It could have anything from 1 bit-per-pixel black-and-white to 32-bit RGBA. Browser support for rare combinations varies (e.g., look at the comments in the Chromium source). Why was it allowed in glTF 1.0?

GIF

Could have 24-bit RGB colors; alpha is limited to 1 bit. LZW compression.

PNG

Could have L, LA, RGB, or RGBA data with up to 16 bits per channel. Deflate compression.

JPEG

Could have 24-bit RGB or 8-bit Luminance. No alpha.
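To put the client-side cost of web formats in rough numbers, here is a minimal sketch comparing the memory one texture level occupies once decoded to RGBA8 with the same level stored in a 4x4 block-compressed format. The function names are hypothetical, and the 16 bytes per block figure is the one that applies to DXT5; other block formats differ.

```python
import math

def rgba8_bytes(width: int, height: int) -> int:
    """Size in bytes of one decoded RGBA8 level (4 bytes per pixel)."""
    return width * height * 4

def block_compressed_bytes(width: int, height: int, bytes_per_block: int = 16) -> int:
    """Size in bytes of one level in a 4x4 block format (16 bytes per block for DXT5)."""
    blocks_x = math.ceil(width / 4)
    blocks_y = math.ceil(height / 4)
    return blocks_x * blocks_y * bytes_per_block

print(rgba8_bytes(256, 256))             # decoded PNG/JPEG uploaded as RGBA8
print(block_compressed_bytes(256, 256))  # same level kept as DXT5 on the GPU
```

For a 256x256 level this works out to a 4:1 difference in GPU memory, before counting the CPU time spent decoding.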

State of glTF texture support

Issues

KTX proposal

To address some of the issues above, KTX format support was proposed in #739. The main changes include (keep in mind that they are breaking from 1.0, so such a decision could only be made with a major version upgrade):

Such a layout will allow adding more container formats (like Crunch or Basis) in post-2.0 minor updates.

Example (adapted from #739)

This is just an example; comments are welcome. undefined is used for illustrative purposes.

{
    "textures": [
        {
            "image": 0,
            "sampler": 0
        }
    ],
    "images": [
        {
            "formats": [
                {
                    "format": 33779, // GL_COMPRESSED_RGBA_S3TC_DXT5_EXT
                    "mimeType": "image/ktx",
                    "uri": undefined,
                    "bufferView": 0
                },
                {
                    "format": 32856, // GL_RGBA8
                    "mimeType": "image/png",  // valid only when "KHR_image_web" enabled
                    "uri": undefined,
                    "bufferView": undefined,
                    "extensions": {
                        "KHR_image_web": {
                            "flipY": true,
                            "width": 256,
                            "height": 256,  // specify 0 for a 1D texture
                            "depth": 1, // optional; depth of mip level 0 of a 3D texture; must be one for 2D and cube textures
                            "layers": 1, // optional; used for 2D texture arrays and cube map arrays
                            "faces": 1, // optional; 1 or 6 (cube map)
                            "levels": 2, // optional; 0 means run-time should call generateMipmap()
                            "uri": undefined,
                            "bufferViews": [  // list images in the order: depth-layers-faces-levels
                                1,
                                2
                            ]
                        }
                    }
                }
            ]
        }
    ]
}
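One way a loader might consume such a formats array is sketched below. This is only an illustration under stated assumptions: the proposal does not say how a client chooses among entries, so the sketch assumes that array order expresses exporter preference and that the client knows the set of GL format enums it can upload. The pick_format helper and the "fallback.png" URI are hypothetical.

```python
from typing import Optional

GL_COMPRESSED_RGBA_S3TC_DXT5_EXT = 33779
GL_RGBA8 = 32856

def pick_format(image: dict, supported_formats: set) -> Optional[dict]:
    """Return the first image["formats"] entry whose "format" the client supports.

    Assumption: entries are listed in order of exporter preference.
    """
    for entry in image.get("formats", []):
        if entry.get("format") in supported_formats:
            return entry
    return None

image = {
    "formats": [
        {"format": GL_COMPRESSED_RGBA_S3TC_DXT5_EXT, "mimeType": "image/ktx", "bufferView": 0},
        {"format": GL_RGBA8, "mimeType": "image/png", "uri": "fallback.png"},  # hypothetical URI
    ]
}

# A client without S3TC support falls through to the PNG entry.
chosen = pick_format(image, {GL_RGBA8})
print(chosen["mimeType"] if chosen else "no usable format")
```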

Final remarks

For mass asset distribution, it's vital to use compressed image data. At some point, we should expect libraries like Basis or Crunch to be universally integrated in exporting workflows.

KTX's use of more than one parameter (glFormat, glInternalFormat, glBaseInternalFormat, glType) to describe the image format is suboptimal. It also locks valid formats to those supported by GL.

The KTX data layout isn't streaming-friendly: it's hard to fetch a low-res image first.

A more modern container (KTX2?) could be proposed later.

References

CC @pjcozzi @javagl @sbtron @bghgary @AurL @cedricpinson @mlimper @lasalvavida

lexaknyazev commented 7 years ago

Actual integration of GPU formats into the core spec will require diligent checking of all combinations of sampler/filtering modes for each format for each API. This is out of 2.0 scope.

Nevertheless, I think we should evaluate and make the needed syntax/binding changes to be able to add those formats in the next minor update.

pjcozzi commented 7 years ago

Generally looks OK, comments:


Being pragmatic, I am hesitant about introducing KHR_image_web; either all implementations will be forced to implement it or it will lead to fragmentation as exporters prefer to export well-known image formats, and then only some glTF renderers can load the model. Instead, consider just removing support for BMP and GIF. PNG and JPEG decoders are widely available AFAIK and give glTF some compression options out of the box.

Note that the current glTF situation is much better than most 3D formats; for example, COLLADA doesn't define at all what image files are valid.


Supporting KTX has lots of implicit requirements (like different compression formats) so the spec will have to define that KTX is supported with a precise set of limitations.


I'm not sure about the height (3D texture), depth (3D texture mip level 0), and layers (2D texture arrays) properties as they can't be implemented in WebGL 1.

Is the goal to design the schema now so that it will be compatible when these are added later?

lexaknyazev commented 7 years ago

PNG and JPEG decoders are widely available

Generally, I'm OK with them in core. Maybe this could be a conformance/implementation note for mobile deployment. See these comments from #739 on implications: https://github.com/KhronosGroup/glTF/issues/739#issuecomment-252444886 https://github.com/KhronosGroup/glTF/issues/739#issuecomment-252499701

Supporting KTX has lots of implicit requirements

That's why my perspective on 2.0 is to allow only glTF 1.0 targets/formats/types. Maybe additionally allow something for PBR, if needed (like cubemaps and mips).

Is the goal to design the schema now so that it will be compatible when these are added later?

Exactly! The biggest struggle would be the zoo of GPU formats across different APIs (e.g., look at 5551 vs 1555 layouts). Some formats could be easily (and losslessly) converted, while some introduce big overhead. There are many examples in the ANGLE source code.
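As an illustration of the easy, lossless kind of conversion, here is a minimal sketch for the 5551 vs 1555 case. It assumes that "5551" means a 16-bit pixel with R, G, B in the high 15 bits and alpha in bit 0, and that "1555" means alpha in bit 15 followed by R, G, B. Real APIs define several variants of both, so the bit positions here are an assumption, not a statement about any specific API.

```python
def rgba5551_to_argb1555(pixel: int) -> int:
    """Move the alpha bit from bit 0 to bit 15; the 15 color bits shift down by one."""
    alpha = pixel & 0x1
    color = (pixel >> 1) & 0x7FFF
    return (alpha << 15) | color

def argb1555_to_rgba5551(pixel: int) -> int:
    """Inverse of the conversion above: alpha goes back to bit 0."""
    alpha = (pixel >> 15) & 0x1
    color = pixel & 0x7FFF
    return (color << 1) | alpha

# Opaque pure red: R=31, G=0, B=0, A=1 in the assumed 5551 layout.
print(hex(rgba5551_to_argb1555(0xF801)))
```

Because the conversion is a one-bit rotation, it round-trips exactly. Formats that need channel widening or re-quantization are where the overhead mentioned above comes in.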

pjcozzi commented 7 years ago

PNG and JPEG decoders are widely available

Generally, I'm OK with them in core. Maybe this could be a conformance/implementation note for mobile deployment.

Conformance note is OK with me.

Supporting KTX has lots of implicit requirements

That's why my perspective on 2.0 is to allow only glTF 1.0 targets/formats/types. Maybe additionally allow something for PBR, if needed (like cubemaps and mips).

Sounds perfect. Can the PBR folks chime in? @sbtron @bghgary @cedricpinson @mlimper?

lexaknyazev commented 7 years ago

How should we handle URIs/bufferViews for web codecs? E.g., make both properties arrays and define that with "mimeType": "image/ktx" they must contain no more than one element.

pjcozzi commented 7 years ago

I'm not following. Can you provide an example?

lexaknyazev commented 7 years ago

In the example above, image.formats[].uri and image.formats[].bufferView are value-properties (not arrays) for KTX images. In the PNG/JPEG extension, these fields are arrays to support KTX-features (i.e., treat several 2D images as one "texture": cubemaps, mips, etc).

lexaknyazev commented 7 years ago

Compare uri and uris below:

{
    "images": [
        {
            "formats": [
                {
                    "format": 33779, // GL_COMPRESSED_RGBA_S3TC_DXT5_EXT
                    "mimeType": "image/ktx",
                    "uri": "img.ktx"
                },
                {
                    "format": 32856, // GL_RGBA8
                    "mimeType": "image/png",
                    "levels": 2,
                    "uris": [
                        "mip0.png",
                        "mip1.png"
                    ]
                }
            ]
        }
    ]
}

pjcozzi commented 7 years ago

Ah, I see. Yes, I think always using arrays (with length === 1 for KTX) is reasonable.
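The "always arrays" rule could be checked along these lines. This is a minimal sketch, not part of any spec: the check_uris helper is hypothetical, and the expected count for PNG/JPEG entries (levels times faces, with levels of 0 treated as a single stored level because mipmaps are then generated at run time) is an assumption based on the properties in the example above.

```python
def check_uris(entry: dict) -> list:
    """Return a list of problems found in one image.formats[] entry."""
    problems = []
    uris = entry.get("uris", [])
    if entry.get("mimeType") == "image/ktx":
        # A KTX container already holds every level and face in one file.
        if len(uris) != 1:
            problems.append("KTX entries must reference exactly one file")
    else:
        # levels == 0 means run-time mipmap generation, so only level 0 is stored.
        levels = max(1, entry.get("levels", 1))
        faces = entry.get("faces", 1)
        expected = levels * faces
        if len(uris) != expected:
            problems.append(f"expected {expected} files, got {len(uris)}")
    return problems

ktx_entry = {"format": 33779, "mimeType": "image/ktx", "uris": ["img.ktx"]}
png_entry = {"format": 32856, "mimeType": "image/png", "levels": 2,
             "uris": ["mip0.png", "mip1.png"]}
print(check_uris(ktx_entry), check_uris(png_entry))
```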

lexaknyazev commented 7 years ago

Looks like we should also specify the "premultipliedness" of alpha, or require the corresponding WebGL flag to be always on for some maps. See this article from https://github.com/KhronosGroup/glTF/issues/822#issuecomment-274856972.

javagl commented 7 years ago

Many points here are beyond what I can comprehend (I just started reading about KTX and texture compression in general).

But the uri[]/bufferView[] arrays confuse me a bit. Particularly, I wonder about the data model that is implied by such an image.format object.

So each of these objects will have an array of data chunks (e.g. ArrayBuffer objects), right?

For the particular case of MipMaps, wouldn't it be necessary to store the width/height for each of them, or are there some assumptions or standards for how the resolution of the lower levels is related to the highest level?

lexaknyazev commented 7 years ago

But the uri[]/bufferView[] arrays confuse me a bit. Particularly, I wonder about the data model that is implied by such an image.format object.

In some cases, a texture consists of several "images", e.g., MIP levels, cubemaps, array or 3D textures (ES 3.0). Since PNG and JPEG containers allow only one image per file, we need a way to transmit several "images". With KTX, those arrays must contain only one element.

javagl commented 7 years ago

Thanks, I understood this so far (assuming that each image.format object will have multiple data blocks).

But won't it be necessary to store more information for each one?

This mainly refers to the different resolutions for different mipmap levels.

(Assuming that nobody wants to create one mipmap level from a PNG, and another (of the same MipMap) from a JPG...)

lexaknyazev commented 7 years ago

But won't it be necessary to store more information for each one?

In the case of a KTX file, only the URI/bufferView is required because all other properties are provided in the KTX binary header. No need to duplicate them.

As for a JPEG/PNG set of images, these properties have exact, well-defined meanings:

For glTF 2.0, I would consider only faces and levels (if we need such features for PBR), since such functionality is supported with WebGL 1.0.

javagl commented 7 years ago

So there may be an image.format object like this:

{
    "format": 32856,
    "width": 256,
    "height": 256,
    "levels": 2,
    "bufferViews": [ 1, 2 ]
}

What is the resolution of the image data referred to by bufferView 2? Is it always 128x128? (I'm not familiar with many concepts here, so apologies if this is a stupid question)

lexaknyazev commented 7 years ago

Is it always 128x128?

Yes, for 256x256 level 0. See section 3.7.7 of the OpenGL ES 2.0 spec.
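The rule behind that answer is that each mip level halves the previous one, rounding down and never going below 1. A minimal sketch of the resulting chain (the mip_chain helper name is hypothetical):

```python
def mip_chain(width: int, height: int) -> list:
    """List (width, height) for every mip level, from level 0 down to 1x1."""
    sizes = []
    level = 0
    while True:
        # Level i has dimensions max(1, floor(base / 2**i)).
        w = max(1, width >> level)
        h = max(1, height >> level)
        sizes.append((w, h))
        if w == 1 and h == 1:
            break
        level += 1
    return sizes

chain = mip_chain(256, 256)
print(chain[1], len(chain))
```

So per-level dimensions do not need to be stored in the JSON: given level 0, every other level's size follows from its index.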

As for your example: "mimeType" is required; width / height are not needed for 2D textures, since these dimensions are available in binary headers of all containers.

javagl commented 7 years ago

these dimensions are available in binary headers of all containers.

I thought that in a case like the one above, the bufferView would contain the actual image data, as a sequence of bytes representing the GL_RGBA8 values. But I probably have to read this issue and related documents a few more times. Until then, I'll wait with any attempts of implementing an infrastructure for "reading images". (In glTF 1.0, I had some simple map from imageId to byte[] data. Now, I'm not sure what the final structures will look like)

lexaknyazev commented 7 years ago

glTF 1.0 hasn't got any standard way to use raw texture data (not counting KHR_binary_gltf extension).

The main benefit of using the KTX container is the existing infrastructure for converting to and from it:

javagl commented 7 years ago

I'm still trying to understand the implications of this issue. Particularly regarding the resulting data structures, and how the data is supposed to be read, stored and passed to the graphics API.

Are the following statements true? :

(Particularly, I'm not sure if case 2 and 3 are supposed to be supported)

lexaknyazev commented 7 years ago

uri and bufferView are mutually exclusive

Yes.

when the mimeType is given, ...

mimeType is required.

There's no such thing as "raw" data, because it would significantly complicate loading anything beyond one-level-one-face-2d-texture.

javagl commented 7 years ago

There's no such thing as "raw" data, because it would significantly complicate loading anything beyond one-level-one-face-2d-texture.

OK, that wasn't clear to me. Again, I'm not so deeply involved here, but thought that it could be possible to roughly have something like this (pseudocode) :

{
    "format" : GL_SOME_CUBE_MAP_TYPE,
    "faces" : 6,
    "bufferViews" : [
        0,  // the bufferView containing the raw RGBA data for the +x side
        1,  // the bufferView containing the raw RGBA data for the -x side
        ...
        5,  // the bufferView containing the raw RGBA data for the -z side
    ]
}

For me, "encodedData+mimeType" was basically equivalent to "raw data" (even though, of course, it's a trade-off of file size vs. decoding effort). I didn't see why something like this should not be supported. But now it's clear: The bufferView references will always contain encoded data (like JPG data). Sorry, I didn't want to open a can of worms here.

lexaknyazev commented 7 years ago

KTX has a simple fixed-size header at the beginning. The rest is just a fixed-order concatenation of all "bufferViews" from your example.
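To make the "simple fixed-size header" concrete, here is a minimal sketch that reads the 64-byte header of a KTX (version 1) file with Python's struct module. The field names follow the KTX 1 file format; the parse_ktx1_header helper is hypothetical, and the sketch stops at the header and does not walk the key/value data or the image payload that follow it.

```python
import struct

# 12-byte file identifier: "«KTX 11»\r\n\x1A\n"
KTX1_IDENTIFIER = bytes([0xAB, 0x4B, 0x54, 0x58, 0x20, 0x31, 0x31, 0xBB,
                         0x0D, 0x0A, 0x1A, 0x0A])

# Thirteen unsigned 32-bit fields follow the identifier.
HEADER_FIELDS = (
    "endianness", "glType", "glTypeSize", "glFormat", "glInternalFormat",
    "glBaseInternalFormat", "pixelWidth", "pixelHeight", "pixelDepth",
    "numberOfArrayElements", "numberOfFaces", "numberOfMipmapLevels",
    "bytesOfKeyValueData",
)

def parse_ktx1_header(data: bytes) -> dict:
    """Parse the fixed 64-byte KTX 1 header into a dict of integers."""
    if len(data) < 64 or data[:12] != KTX1_IDENTIFIER:
        raise ValueError("not a KTX 1 file")
    # The endianness field reads as 0x04030201 when decoded in the file's byte order.
    little = struct.unpack_from("<I", data, 12)[0] == 0x04030201
    byte_order = "<" if little else ">"
    values = struct.unpack_from(byte_order + "13I", data, 12)
    return dict(zip(HEADER_FIELDS, values))
```

With the header parsed, pixelWidth, pixelHeight, numberOfFaces and numberOfMipmapLevels supply the information that a PNG/JPEG set would otherwise have to carry in JSON properties.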

theanohana commented 7 years ago

How can I turn a PNG into a glTF mesh with a KTX texture?

robertlong commented 7 years ago

Would there be any required image format (KTX)? Or would a GLTF 2.0 compatible renderer have to support KTX, png, and jpeg?

sbtron commented 7 years ago

glTF 2.0 will stick to png and jpeg. Using KTX is an exploration to ensure we can add it in a future update in a compatible manner.

robertlong commented 7 years ago

It sounds like KTX will not make its way into GLTF 2.0.

Is there any proposal for adding support as an extension?

KTX would bring huge optimizations to UnityGLTFLoader. Unity currently handles runtime png/jpg loading very poorly and having the option to load a compressed texture when available would help a lot.

pjcozzi commented 7 years ago

AFAIK there's no current work on a KTX extension, but you are welcome to get the ball rolling on one!

sakrist commented 6 years ago

Hi there!

I've put together my ideas for an extension to support compressed textures in glTF. Here you can check what it looks like.

The main use case is an online extension, i.e. "client request glTF file with specific compression type from hosted application on remote server."

TimvanScherpenzeel commented 6 years ago

The KTX extension would make most sense in my opinion (as opposed to separate extensions for DDS / PVR). In an effort to have a single tool I've created https://github.com/timvanScherpenzeel/texture-compressor which is heavily based on the compressed texture generation tooling in https://github.com/AnalyticalGraphicsInc/gltf-pipeline. (My apologies if this sounds like a promotion for my tool; it is merely meant as a way to show that it is possible.)

ASTC, ETC, PVRTC and S3TC are all wrapped in a KTX container and able to be decoded correctly using KTXLoader in Three.js. Apart from some smaller issues (like missing mipmapping support in https://github.com/ARM-software/astc-encoder) this appears to work fine.

robertlong commented 6 years ago

I'd also like to see cross-platform support for compressed textures. There have been talks about adding support for a Universal Compressed Texture Format; however, it is not clear what licenses are needed to encode/transcode/decode these textures and when we can expect the extension to be made available to the public.

Until we have the universal compressed texture extension it would be nice to be able to use existing compressed texture formats. A KTX extension and the ability to specify multiple image formats would fill this space in the interim.

As @sakrist mentioned we also have this "client request glTF file with specific compression type from hosted application on remote server" use case in Mozilla Hubs. Currently png/jpeg image decoding is causing a lot of hitching in our app. WebGL doesn't have a great way to offload the cost of decoding these images to another thread. @takahirox has been doing amazing work on the ThreeJS GLTFLoader within the limitations browsers have right now. Adding support for cross-platform compressed textures would help reduce that cost even more.

I like @lexaknyazev's original proposal for an image formats array with support for png/jpg/ktx files. Would anyone else be in favor of making this an extension to hold us over until a universal format is agreed upon and implemented? If so I will create a proposal and submit a PR.

dewilkinson commented 6 years ago

Hi all,

I've been following the above comments and proposals with interest - yes, we are currently working on a KHR_texture_transmission extension for the purpose of transporting compressed textures - via KTX or similar style container format - that would enable import and export of block-compressed texture assets within glTF2.0 scene data.

The full transmission extension is expected to also feature support for a universal transcodable format, along with proposed standardized RDO modes, LZ and rANS lossless encode stages for variable rate compression of texture data to approach JPEG-level compression ratios.

@richgel999 and I will be presenting an update during the upcoming 3DFormats call on Wed 23rd May. We would be happy to gather feedback and consensus as to whether we should pursue an interim extension purely for transmitting existing block formats via KTX without the universal format and extended compression and transmission modes.

Let's continue discussion on this topic in this forum, the 3DFormats group will also review and gather consensus on the appropriate direction to go from here regarding an interim compressed texture extension.

Kind Regards,

Dave Wilkinson Texture Transmission TSG

lexaknyazev commented 6 years ago

Closing this issue, since the path forward has been set. KTX2 spec (WIP): https://github.com/KhronosGroup/KTX-Specification/ Texture transmission tools: https://github.com/KhronosGroup/glTF-Texture-Transmission-Tools/

silvainSayduck commented 5 years ago

Hi,

I'm not sure if this is the right place to bring this up, but are video textures supported by glTF 2.0 or one of its extensions? Can anyone point me to the right thread or documentation if possible?

Thanks!

lufiaraujo commented 5 years ago

I'm working on a project where having video textures playing in the glTF model would be great. Is it supported?

atteneder commented 5 years ago

I'm working on a project where having video textures playing in the glTF model would be great. Is it supported?

No, at least there's nothing about it in the 2.0 specification.

But Basis Universal was released recently and the people behind this texture format are experimenting with video already: https://twitter.com/richgel999/status/1135010615586578433 http://binomial.biz/TextureVideoTest2

So maybe Khronos should consider video textures as well when adding support for basisu to glTF?

richgel999 commented 5 years ago

Hi - We'll be merging texture video support in Basis Universal probably on Monday. This is an ongoing effort. We're still focusing on reducing encode times and improving color quantization quality.

Regards, -Rich

shaykh-qasim commented 4 years ago

@richgel999 hi, any update on using video textures in glTF, or on playing video in glTF some other way?