KhronosGroup / glTF

glTF – Runtime 3D Asset Delivery

Better define supported JPEG images #2471

Open lexaknyazev opened 3 months ago

lexaknyazev commented 3 months ago

The glTF 2.0 specification requires that implementations support JPEG images compatible with the JFIF format.

The normative definition of JPEG compression is ITU-T Rec. T.81 (or the identical ISO/IEC 10918-1). This standard defines various lossy and lossless compression techniques for abstract "image components", i.e., it does not say how these components should be mapped (or converted) to the regular RGBA color channels that glTF expects. It also does not require every implementation to support every encoding method.

The JFIF standard is defined in ITU-T Rec. T.871 (or the identical ISO/IEC 10918-5). It imposes the following restrictions on JPEG data:

The JFIF standard also suggests use of JPEG baseline process for maximum compatibility.


The suggestion to use only the baseline process is not very practical. For instance, the Sample-Assets repo has 585 JPEG images, and 125 of them use progressive encoding instead of baseline without any issues. That said, certain JPEG encoding modes have very little application support. As of today, the glTF 2.0 specification does not explicitly disallow them, so an asset with such images is formally valid but unusable in practice.

Problematic features:

  1. Arithmetic coding. Not supported in Chrome and Firefox; some variants work in Safari.
  2. Lossless encoding. Not supported in Chrome and Safari; some configurations may work in Firefox.
  3. Hierarchical mode and DNL markers. Not supported in any browser, rarely supported by libraries.

Proposed glTF 2.0 update (for Section 2.6):

  • JPEG images MUST be compatible with JPEG File Interchange Format, namely:
    • JPEG images MUST have a JFIF APP0 marker right after the SOI marker.
    • JPEG images MUST have 1 or 3 image components with 8 bits per component.
  • Additionally, for application compatibility:
    • JPEG images MUST use Huffman coding.
    • JPEG images MUST use Baseline DCT, Extended sequential DCT, or Progressive DCT signalled by SOF0, SOF1, or SOF2 markers respectively.
    • JPEG images MUST NOT use the hierarchical mode of operation.
    • JPEG images MUST NOT contain a DNL marker.
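
The rules above can be checked purely at the marker level, without decoding any pixels. Below is a rough Python sketch of such a check. It is not how glTF-Validator does it; the names `iter_segments` and `check_jpeg` are made up for illustration, and the marker codes are the ones from Table B.1 of T.81. It assumes a well-formed marker stream and does only minimal error handling.

```python
import struct

# Frame markers allowed by the proposal (SOF0, SOF1, SOF2).
ALLOWED_SOF = {0xC0, 0xC1, 0xC2}
# All other SOF markers defined in T.81, Table B.1.
REJECTED_SOF = {
    0xC3: "lossless (Huffman)",
    0xC5: "differential sequential DCT (hierarchical)",
    0xC6: "differential progressive DCT (hierarchical)",
    0xC7: "differential lossless (hierarchical)",
    0xC9: "extended sequential DCT, arithmetic coding",
    0xCA: "progressive DCT, arithmetic coding",
    0xCB: "lossless, arithmetic coding",
    0xCD: "differential sequential DCT, arithmetic coding",
    0xCE: "differential progressive DCT, arithmetic coding",
    0xCF: "differential lossless, arithmetic coding",
}


def iter_segments(data: bytes):
    """Yield (marker, payload) for every marker segment up to EOI."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream: missing SOI marker")
    pos, end = 2, len(data)
    while pos + 1 < end:
        if data[pos] != 0xFF:
            pos += 1  # entropy-coded data between markers
            continue
        marker = data[pos + 1]
        if marker == 0xFF:
            pos += 1  # fill byte, look at the next byte pair
            continue
        pos += 2
        # Stuffed zero byte, TEM, RSTn, and SOI carry no length field.
        if marker in (0x00, 0x01, 0xD8) or 0xD0 <= marker <= 0xD7:
            continue
        if marker == 0xD9:  # EOI
            return
        if pos + 2 > end:
            raise ValueError("truncated marker segment")
        (length,) = struct.unpack_from(">H", data, pos)
        if length < 2:
            raise ValueError("invalid segment length")
        yield marker, data[pos + 2 : pos + length]
        pos += length


def check_jpeg(data: bytes) -> list[str]:
    """Return a list of violations of the proposed rules (empty if none)."""
    problems = []
    segments = list(iter_segments(data))
    first = segments[0] if segments else (None, b"")
    if first[0] != 0xE0 or not first[1].startswith(b"JFIF\x00"):
        problems.append("no JFIF APP0 marker right after SOI")
    for marker, payload in segments:
        if marker in ALLOWED_SOF:
            if len(payload) < 6:
                problems.append("truncated frame header")
                continue
            # Frame header layout: P (1 byte), Y (2), X (2), Nf (1).
            precision, components = payload[0], payload[5]
            if precision != 8:
                problems.append(f"{precision} bits per component, expected 8 bits")
            if components not in (1, 3):
                problems.append(f"{components} image components, expected 1 or 3")
        elif marker in REJECTED_SOF:
            problems.append(f"unsupported frame type: {REJECTED_SOF[marker]}")
        elif marker == 0xCC:
            problems.append("DAC marker present (arithmetic coding)")
        elif marker == 0xDE:
            problems.append("DHP marker present (hierarchical mode)")
        elif marker == 0xDC:
            problems.append("DNL marker present")
    return problems
```

Usage would look like `check_jpeg(open("texture.jpg", "rb").read())`, where `texture.jpg` is a placeholder file name; an empty list means none of the proposed rules were violated.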

/cc @javagl @emackey

emackey commented 3 months ago

I like it.

Is there a way to know how many of the JPGs in glTF-Sample-Assets break the proposed rules here?

javagl commented 3 months ago

I see the point of trying to maximize compatibility. And there are no strong objections from my side to the proposed change. But as usual, this may be due to a lack of familiarity with the technical details.

On this level, it raises two related questions:

  • Which tools are even capable of generating JPEG files that are "invalid"?
  • How can users ensure that these constraints are met, and the files are valid?

One example is that I have to assume that "Hierarchical" is what is commonly referred to as "Progressive" (right?), and looking at something like the export dialog of IrfanView, for example...

[Image: screenshot of the IrfanView JPEG export dialog]

that seems to be the only thing that ~"common image readers/writers" seem to offer that could make a file "invalid"...

lexaknyazev commented 3 months ago

> Which tools are even capable of generating JPEG files that are "invalid"?
>
> How can users ensure that these constraints are met, and the files are valid?

glTF-Validator will be able to trivially check for that. Besides, we could update the tutorials/guidelines with recommended options for popular image editors.

> "Hierarchical" is what is commonly referred to as "Progressive" (right?)

No. JPEG compression has three relevant concepts: image, frame, and scan. There is always one image. It contains one or more frames. Each frame contains one or more scans. Progressive mode refers to multiple scans per frame; this is supported everywhere. Hierarchical mode refers to multiple frames per image; this is generally not supported.
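
To make the frame/scan distinction concrete, here is a small Python sketch that counts frame headers (SOFn markers) and scan headers (SOS markers) in a JPEG stream. The function name `count_frames_and_scans` is made up for illustration, and the code assumes a well-formed marker stream. Note that a scan count above one only hints at progressive encoding (a sequential file may also store its components in separate scans), while a frame count above one means hierarchical mode.

```python
import struct


def count_frames_and_scans(data: bytes) -> tuple[int, int]:
    """Return (number of SOFn frame headers, number of SOS scan headers)."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream: missing SOI marker")
    frames = scans = 0
    pos = 2
    while pos + 1 < len(data):
        # Skip entropy-coded bytes, stuffed zero bytes (FF 00), and fill bytes.
        if data[pos] != 0xFF or data[pos + 1] in (0x00, 0xFF):
            pos += 1
            continue
        marker = data[pos + 1]
        pos += 2
        if marker == 0xD9:  # EOI: end of image
            break
        if marker == 0x01 or 0xD0 <= marker <= 0xD8:
            continue  # TEM, RSTn, SOI: no length field
        if pos + 2 > len(data):
            break  # truncated stream
        (length,) = struct.unpack_from(">H", data, pos)
        # SOFn markers are 0xC0..0xCF except DHT (C4), JPG (C8), and DAC (CC).
        if 0xC0 <= marker <= 0xCF and marker not in (0xC4, 0xC8, 0xCC):
            frames += 1
        elif marker == 0xDA:  # SOS: start of scan
            scans += 1
        pos += max(length, 2)
    return frames, scans
```

For a typical progressive file this would return one frame and several scans; a hierarchical file would report more than one frame.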

emackey commented 3 months ago

> > How can users ensure that these constraints are met, and the files are valid?
>
> glTF-Validator will be able to trivially check for that.

Can you make a dev branch for Validator that does precisely that, before the spec change becomes official? It would be illuminating to run a branch of the validator on a bunch of preexisting assets and see what kind of trouble we find. It would give us some tangible clues as to how well the existing ecosystem would coexist with an updated specification.

lexaknyazev commented 3 months ago

Sure, that's WIP.