cICP chunk specification might be in disagreement with H273

veluca93 commented 2 years ago

The cICP chunk spec specifies 7 bytes in total, in particular 2 for each of Matrix Coefficients, Primaries and Transfer Curve, but H.273 mentions that one byte should be sufficient.

Is this just an editorial issue, or is there some deeper reason behind this choice?

svgeesus commented 2 years ago

The original cICP proposal specified these fields as one byte each. The issue also notes range 0 .. 255.

I couldn't find any discussion about extending these fields to 2 bytes each.

ProgramMax commented 2 years ago

IIRC, we originally proposed 1 byte because that is what JXL did. But I recall learning (perhaps incorrectly) that CICP would require two and JXL wasn't actually following CICP. Rather, it was close to CICP but different.

I was rereading H.273 yesterday to see if I missed anything when my power went out. It just came back. I'll continue reading and tell you what I find.

veluca93 commented 2 years ago

FWIW AVIF seems to use 8 bits too: https://github.com/AOMediaCodec/libavif/blob/dd2ef69cc98b5dc7430da2580a97389a6d9bf12f/src/obu.c#L263

Also it seems that their syntax does not allow expressing limited range for sRGB data (if I read the code right), but that depends on the specific primaries.

svgeesus commented 2 years ago

we originally proposed 1 byte because that is what JXL did.

I thought it was because that is what H.273 specifies?

The JPEG-XL library seems to accept a small set of the H.273 values, as enumerations, but then extends that set with values greater than 255:

JXL_TRANSFER_FUNCTION_GAMMA = 65535,

veluca93 commented 2 years ago

we originally proposed 1 byte because that is what JXL did.

I thought it was because that is what H.273 specifies?

The JPEG-XL library seems to accept a small set of the H.273 values, as enumerations, but then extends that set with values greater than 255:
JXL_TRANSFER_FUNCTION_GAMMA = 65535,

That representation does not correspond to the bitstream implementation though :)

wantehchang commented 2 years ago

CICP (H.273 and ISO/IEC 23091-2:2021) specifies that the range of Colour primaries, Transfer characteristics, and Matrix coefficients (excluding the associated VideoFullRangeFlag) is all 0 - 255. So these three values can all be stored in one-byte fields.

ISO BMFF (ISO/IEC 14496-12:2022) stores these three values in two-byte fields in the 'colr' box of the 'nclx' type. See Section 12.1.5.2:

class ColourInformationBox extends Box('colr'){
   unsigned int(32) colour_type;
   if (colour_type == 'nclx') /* on-screen colours */
   {
      unsigned int(16) colour_primaries;
      unsigned int(16) transfer_characteristics;
      unsigned int(16) matrix_coefficients;
      unsigned int(1) full_range_flag;
      unsigned int(7) reserved = 0;
   }
   else if (colour_type == 'rICC')
   {
      ICC_profile; // restricted ICC profile
   }
   else if (colour_type == 'prof')
   {
      ICC_profile; // unrestricted ICC profile
   }
}

So using two bytes for these three fields in https://w3c.github.io/PNG-spec/#11cICP most likely was influenced by the 'colr' box of ISO BMFF.

ProgramMax commented 2 years ago

I finally have power again and am able to reread H.273. Some things I found:

§ 7.2 says "The definition of their encoding and representation (e.g., as a binary number) is the responsibility of the specification using the code point..." and "It is also possible for external specifications to use a mapping to values defined here if they wish to preserve identical semantics, but different code point assignments." So I suppose any size and mapping is allowed.
§ 7.2 also says "Guidance is given for each code point with regard to a suitable type (e.g., unsigned integer) and a suitable value range (e.g., 0–63)". I would prefer to stick with these. Given this issue was even raised I assume all the rest of us do, too.
§ 8.1 Colour primaries -- "Range: 0-255" and "An 8-bit field should be adequate"
§ 8.2 Transfer characteristics -- "Range: 0-255" and "An 8-bit field should be adequate"
§ 8.3 Matrix coefficients -- "Range: 0-255, plus associated flag". The flag is the full/limited range flag. "An 8-bit field should be adequate"

All of this confirms what Wan-Teh Chang said. I don't have a copy of ISO/IEC 14496-12:2022. I suppose I'll get a copy. Even if we completely intend to ignore it and only follow CICP, it would be good to have read publications adjacent to what we're trying to accomplish. But I'll do that later.

Then I wondered "Where did we get the idea it was something other than a byte each?" I found this issue which mentions 7 bytes is required. But that is all I found. @palemieux you were mentioned there. Do you know how we can track it further back?

For now, I'm happy making all of these one byte.

palemieux commented 2 years ago

Do you know how we can track it further back?

It looks like several ISO/ITU image standards have gotten into the habit of using 2 bytes, e.g. ITU-T T.814 | ISO/IEC 15444-15:

It is not clear why.

I think the argument for 2 byte is probably that it enhances compatibility with ISO/ITU standards. Perhaps we should NOTE that explicitly in the document.

ProgramMax commented 2 years ago

Do we even want to do 2 bytes though?

palemieux commented 2 years ago

Do we even want to do 2 bytes though?

I would not object to a single byte.

veluca93 commented 2 years ago

I think I'd personally prefer 1 byte, but no strong opinion.

That said, I find the "decimal" value in the current wording ("The cICP chunk contains four decimal values [...]") to be somewhat confusing -- as far I can see it is only used in the spec to refer to numbers represented in the text, and never for the bitstream itself. It also makes someone wonder about whether the values are supposed to be stored as base-10 ASCII ;)

palemieux commented 2 years ago

I took a stab at cleanin-up the cICP chunk, including constraining the fields to 1 byte each, per the discussion above.

svgeesus commented 2 years ago

I find the "decimal" value in the current wording ("The cICP chunk contains four decimal values [...]") to be somewhat confusing -- as far I can see it is only used in the spec to refer to numbers represented in the text, and never for the bitstream itself. It also makes someone wonder about whether the values are supposed to be stored as base-10 ASCII ;)

What it is trying to say is that those four bytes contain ASCII characters, and here are their decimal values (to make it clear how they are encoded; in the early days of the Web there were all sorts of text encodings flying around).

svgeesus commented 2 years ago

So using two bytes for these three fields in https://w3c.github.io/PNG-spec/#11cICP most likely was influenced by the 'colr' box of ISO BMFF.

I expect @cconcolato would know why ISO BMFF uses 2 bytes for these 0..255 enumeration values.

svgeesus commented 2 years ago

Puzzled by this from the abstract of RP 2077:2013:

These mappings are not generally applicable to mapping of source narrow-range images, e.g. studio range images, to full-range images.

Isn't that exactly what we are citing it for?

(I don't have a spec purchase budget for paywalled specs, so haven't read the actual spec)

veluca93 commented 2 years ago

I find the "decimal" value in the current wording ("The cICP chunk contains four decimal values [...]") to be somewhat confusing -- as far I can see it is only used in the spec to refer to numbers represented in the text, and never for the bitstream itself. It also makes someone wonder about whether the values are supposed to be stored as base-10 ASCII ;)

What it is trying to say is that those four bytes contain ASCII characters, and here are their decimal values (to make it clear how they are encoded; in the early days of the Web there were all sorts of text encodings flying around).

That works for the cICP characters, and it makes sense there; I was referring to...

The cICP chunk contains four decimal values corresponding to the colour primaries, transfer function, matrix coefficients and video signal range flag for the source imagery.

palemieux commented 2 years ago

Isn't that exactly what we are citing it for?

The idea is to cite SMPTE RP 2077 since the mapping between narrow- and full-range images is complex.

For example, it is generally possible to map a narrow-range image to a full-range image only if the narrow-range image came from a full-range image.

The note should probably say: "see SMPTE RP 2077 for additional information on mapping between narrow- and full-range images."

w3c / png

cICP chunk specification might be in disagreement with H273 #129