Closed veluca93 closed 2 years ago
The original cICP
proposal specified these fields as one byte each. The issue also notes range 0 .. 255.
I couldn't find any discussion about extending these fields to 2 bytes each.
IIRC, we originally proposed 1 byte because that is what JXL did. But I recall learning (perhaps incorrectly) that CICP would require two and JXL wasn't actually following CICP. Rather, it was close to CICP but different.
I was rereading H.273 yesterday to see if I missed anything when my power went out. It just came back. I'll continue reading and tell you what I find.
FWIW AVIF seems to use 8 bits too: https://github.com/AOMediaCodec/libavif/blob/dd2ef69cc98b5dc7430da2580a97389a6d9bf12f/src/obu.c#L263
Also it seems that their syntax does not allow expressing limited range for sRGB data (if I read the code right), but that depends on the specific primaries.
we originally proposed 1 byte because that is what JXL did.
I thought it was because that is what H.273 specifies?
The JPEG-XL library seems to accept a small set of the H.273 values, as enumerations, but then extends that set with values greater than 255:
JXL_TRANSFER_FUNCTION_GAMMA = 65535,
we originally proposed 1 byte because that is what JXL did.
I thought it was because that is what H.273 specifies?
The JPEG-XL library seems to accept a small set of the H.273 values, as enumerations, but then extends that set with values greater than 255:
JXL_TRANSFER_FUNCTION_GAMMA = 65535,
That representation does not correspond to the bitstream implementation though :)
CICP (H.273 and ISO/IEC 23091-2:2021) specifies that the range of Colour primaries, Transfer characteristics, and Matrix coefficients (excluding the associated VideoFullRangeFlag) is all 0 - 255. So these three values can all be stored in one-byte fields.
ISO BMFF (ISO/IEC 14496-12:2022) stores these three values in two-byte fields in the 'colr' box of the 'nclx' type. See Section 12.1.5.2:
class ColourInformationBox extends Box('colr'){
unsigned int(32) colour_type;
if (colour_type == 'nclx') /* on-screen colours */
{
unsigned int(16) colour_primaries;
unsigned int(16) transfer_characteristics;
unsigned int(16) matrix_coefficients;
unsigned int(1) full_range_flag;
unsigned int(7) reserved = 0;
}
else if (colour_type == 'rICC')
{
ICC_profile; // restricted ICC profile
}
else if (colour_type == 'prof')
{
ICC_profile; // unrestricted ICC profile
}
}
So using two bytes for these three fields in https://w3c.github.io/PNG-spec/#11cICP most likely was influenced by the 'colr' box of ISO BMFF.
I finally have power again and am able to reread H.273. Some things I found:
All of this confirms what Wan-Teh Chang said. I don't have a copy of ISO/IEC 14496-12:2022. I suppose I'll get a copy. Even if we completely intend to ignore it and only follow CICP, it would be good to have read publications adjacent to what we're trying to accomplish. But I'll do that later.
Then I wondered "Where did we get the idea it was something other than a byte each?" I found this issue which mentions 7 bytes is required. But that is all I found. @palemieux you were mentioned there. Do you know how we can track it further back?
For now, I'm happy making all of these one byte.
Do you know how we can track it further back?
It looks like several ISO/ITU image standards have gotten into the habit of using 2 bytes, e.g. ITU-T T.814 | ISO/IEC 15444-15:
It is not clear why.
I think the argument for 2 byte is probably that it enhances compatibility with ISO/ITU standards. Perhaps we should NOTE that explicitly in the document.
Do we even want to do 2 bytes though?
Do we even want to do 2 bytes though?
I would not object to a single byte.
I think I'd personally prefer 1 byte, but no strong opinion.
That said, I find the "decimal" value in the current wording ("The cICP chunk contains four decimal values [...]") to be somewhat confusing -- as far I can see it is only used in the spec to refer to numbers represented in the text, and never for the bitstream itself. It also makes someone wonder about whether the values are supposed to be stored as base-10 ASCII ;)
I took a stab at cleanin-up the cICP chunk, including constraining the fields to 1 byte each, per the discussion above.
I find the "decimal" value in the current wording ("The cICP chunk contains four decimal values [...]") to be somewhat confusing -- as far I can see it is only used in the spec to refer to numbers represented in the text, and never for the bitstream itself. It also makes someone wonder about whether the values are supposed to be stored as base-10 ASCII ;)
What it is trying to say is that those four bytes contain ASCII characters, and here are their decimal values (to make it clear how they are encoded; in the early days of the Web there were all sorts of text encodings flying around).
So using two bytes for these three fields in https://w3c.github.io/PNG-spec/#11cICP most likely was influenced by the 'colr' box of ISO BMFF.
I expect @cconcolato would know why ISO BMFF uses 2 bytes for these 0..255 enumeration values.
Puzzled by this from the abstract of RP 2077:2013:
These mappings are not generally applicable to mapping of source narrow-range images, e.g. studio range images, to full-range images.
Isn't that exactly what we are citing it for?
(I don't have a spec purchase budget for paywalled specs, so haven't read the actual spec)
I find the "decimal" value in the current wording ("The cICP chunk contains four decimal values [...]") to be somewhat confusing -- as far I can see it is only used in the spec to refer to numbers represented in the text, and never for the bitstream itself. It also makes someone wonder about whether the values are supposed to be stored as base-10 ASCII ;)
What it is trying to say is that those four bytes contain ASCII characters, and here are their decimal values (to make it clear how they are encoded; in the early days of the Web there were all sorts of text encodings flying around).
That works for the cICP characters, and it makes sense there; I was referring to...
The cICP chunk contains four decimal values corresponding to the colour primaries, transfer function, matrix coefficients and video signal range flag for the source imagery.
Isn't that exactly what we are citing it for?
The idea is to cite SMPTE RP 2077 since the mapping between narrow- and full-range images is complex.
For example, it is generally possible to map a narrow-range image to a full-range image only if the narrow-range image came from a full-range image.
The note should probably say: "see SMPTE RP 2077 for additional information on mapping between narrow- and full-range images."
The cICP chunk spec specifies 7 bytes in total, in particular 2 for each of Matrix Coefficients, Primaries and Transfer Curve, but H.273 mentions that one byte should be sufficient.
Is this just an editorial issue, or is there some deeper reason behind this choice?