Closed wantehchang closed 3 years ago
More important, the new text seems partially incorrect. Specifically, if "the horizontal and vertical cropped offsets" refers to
horizOff
andvertOff
, then it is incorrect to require that they shall be integers when the image is 4:0:0 or 4:4:4. The reason is thathorizOff
andvertOff
" may need to be half-values, for example, to crop 8x8 to 7x7.
The original text had the same intention of disallowing this: "the cropping shall select an integer number of samples for all planes"
Hi David,
Thanks a lot for the reply. I suspect I misunderstood what the term "horizontal cropped offset" means. This term is used without definition in MIAF, and it is not used in HEIF or ISOBMFF. I would appreciate it if you could clarify it for me.
Please refer to ISOBMFF (ISO/IEC 14496-12:2020), Section 12.1.4.1, page 164. Does the term "horizontal cropped offset" in MIAF mean
horizOff
, orpcX - (cleanApertureWidth - 1)/2
, i.e., the leftmost pixel of the clean aperture?Thanks, Wan-Teh Chang
I did some archeology to dig where the issue came from. Here are the results:
In July 2019, contribution m49250 raised an issue saying:
Clause 7.3.6.7 about clean aperture property first states that cropping shall select an integer number of samples for all planes. In the bullet list it is then stated that there is no restrictions if image is 4:0:0 or 4:4:4.
I agree with this, there was a discrepancy between first part of the spec text at that time that said "cropping shall select an integer number of samples for all planes" and the second part "when the image is 4:0:0 (monochrome) or 4:4:4, there is no restriction;". But then, the contribution said:
As CleanApertureBox dimensions and offsets are fractions, it is proposed to emphasize that all values shall be integer numbers, also in case of 4:0:0 and 4:4:4 images.
This is wrong as @wantehchang's example demonstrates. You sometimes need half-pixels. The contribution ended in the "Technologies under Consideration document" N18689 which wrongly says:
Agreed, we rephrase to say that integers are used for 4:0:0 and 4:2:0
Then it carried over from meetings to meetings, until it was decided to move it to the WD of Amd2 N19272 in April 2020, and I traced it back to me providing this proposed text based on the above :( I don't remember where the changes to the other bullets came from. Note that the terms "horizontal cropped offset" existed already in the previous text.
I'm tempted to not be so explicit and simply say:
the cropping shall select an integer number of samples for all planes
without other details (i.e removing the bullet point list), but maybe that's not clear enough for validating purposes for example. In that case, I would say:
when the image is 4:0:0 (monochrome) or 4:4:4,
cleanApertureWidth
andcleanApertureHeight
shall be integers; andhorizOff
(respectivelyvertOff
) shall be an integer, unlesscleanApertureWidth
(respectivelycleanApertureHeight
) is odd in which case, whenhorizOff
(respectivelyvertOff
) is multiplied by 2, the result shall be an integer.when the image is 4:2:2,
cleanApertureHeight
andvertOff
shall follow the same constraints as in the 4:4:4 case. However,cleanApertureWidth
shall be even; andhorizOff
shall be even, unlesscleanApertureWidth
is not a multiple of 4, in which case 'horizOff` shall be an integer.when the image is 4:2:0, the constraints that apply to
cleanApertureWidth
(respectivelyhorizOff
) in the 4:2:2 case, apply to bothcleanApertureHeight
andcleanApertureWidth
(respectivelyhorizOff
andvertOff
).
Is this correct?
Hi Cyril,
Thank you very much for looking into this.
Dirk Farin, Joe Drago, and Jon Bauman have all worked out the exact conditions under which horizOff
and vertOff
need to be half-values. So they are more qualified than I am to review that part of your suggested wording.
Having said that, I think it would be better to write requirements on the leftmost pixel and topmost line of the clean aperture, instead of requirements on horizOff
and vertOff
. Let me explain why.
Ultimately we need the leftmost pixel and the topmost line of the clean aperture to be integers.
Note: ISOBMFF defines these two values using the following formulas:
pcX - (cleanApertureWidth - 1)/2
pcY - (cleanApertureHeight - 1)/2
In addition, if chroma is subsampled horizontally, the leftmost pixel of the clean aperture needs to be an even number. Similarly, if chroma is subsampled vertically, the topmost line of the clean aperture needs to be an even number. This is why I suggest we write requirements on the leftmost pixel and topmost line instead of horizOff
and vertOff
.
Note: Both Dirk Farin and I also noticed that cleanApertureWidth
and cleanApertureHeight
can be odd numbers when chroma is subsampled. (Similarly, output_width
and output_height
of an image grid can be odd numbers when chroma is subsampled.) But it is not necessary to address this issue in MIAF Amendment 2.
I believe this snippet is incomplete:
unless
cleanApertureWidth
(respectivelycleanApertureHeight
) is odd
The image's dimensions are a part of the calculation of the aperture's center, so they must be taken into consideration here for even/odd. For example, a 5x5
image with a crop rectangle of 2x2
at a top/left coord of (1,1) would create clap offsets of -1/2
. This is a legal, integer crop rect in which cleanApertureWidth
and cleanApertureHeight
are even, and yet the clap offsets are half-pixels.
I believe the correct check for whether or not horizOff
and vertOff
can be half-pixels (and still produce an integer crop rect) would be something like:
unless
imageWidth+cleanApertureWidth
(respectivelyimageHeight+cleanApertureHeight
) is odd
I agree with @wantehchang's idea of writing the requirements in terms of the clean aperture. To me, the clearest way to state the requirements for validity are (generalizing width
and height
to length
since they apply similarly in each dimension):
lengthOff
is a whole number (denominator == 1), length
and cleanApertureLength
must have the same parity (both even or both odd)lengthOff
is a half value (denominator == 2), length
and cleanApertureLength
must have the opposite parity (one even, the other odd)With this update, other denominators of lengthOff
should be invalid.
And then on top of that, the additional requirements for chroma subsampling still apply. I have nothing to add there.
Thanks for the feedback. I'm convinced that @wantehchang 's text is the best:
cleanApertureWidth
and cleanApertureHeight
shall be integers;horizOff
and vertOff
shall be integers or half-values;Joe, Jon: Thank you both for your comments.
Cyril: I have since realized that the second requirement "horizOff
and vertOff
shall be integers or half-values" is redundant. If we want a minimal set of requirements, we can omit "horizOff
and vertOff
shall be integers or half-values".
Also, the first requirement "cleanApertureWidth
and cleanApertureHeight
shall be integers" allows those two values to be odd numbers when the image is 4:2:2 or 4:2:0.
How do you think about rewording the first two items in @wantehchang 's proposal like this:
cleanApertureWidth
and cleanApertureHeight
shall be integers and shall be stored with the denominator equal to 1;horizOff
and vertOff
shall be integers or half-values, and shall be stored with the denominator equal to 1 or 2;The background is that these values currently may be stored as arbitrary fractions (e.g. 4096/8192 instead of 1/2). When a reader has to do the computation on the fractions, it has to bring fractions to the same denominator (e.g. 66666/33333 + 4096/8192), which can easily lead to integer overflows. This could even lead to security issues when exploiting this to give negative or huge image sizes.
With the proposed restriction on the denominator, reader implementation could be simplified as the reader can easily validate the denominator to prevent overflows.
@farindk
How do you think about rewording the first two items in @wantehchang 's proposal like this:
cleanApertureWidth
andcleanApertureHeight
shall be integers and shall be stored with the denominator equal to 1;horizOff
andvertOff
shall be integers or half-values, and shall be stored with the denominator equal to 1 or 2;The background is that these values currently may be stored as arbitrary fractions (e.g. 4096/8192 instead of 1/2). When a reader has to do the computation on the fractions, it has to bring fractions to the same denominator (e.g. 66666/33333 + 4096/8192), which can easily lead to integer overflows. This could even lead to security issues when exploiting this to give negative or huge image sizes.
With the proposed restriction on the denominator, reader implementation could be simplified as the reader can easily validate the denominator to prevent overflows.
While I agree in principle that this would be a nice simplifying restriction to have in the file, it will be a breaking change that would make existing files invalid and make existing MIAF writers potentially incorrect. We would need to investigate if this could cause problems for existing implementations before restricting it.
I understand the issue but we only want to fix the current amendment, with the same intent. We should not be introducing further restrictions at this stage.
Perhaps we should state the basic principle first, and then explore the consequences: the clap must select, on sample boundaries, an integer number of samples in all planes.
Perhaps we should state the basic principle first, and then explore the consequences: the clap must select, on sample boundaries, an integer number of samples in all planes.
That's what we had initially (see my comment for details), except that the detailed part after the basic principle was wrong.
Thanks all. This has been integrated in FDAM2 as indicated in https://github.com/MPEGGroup/MIAF/issues/8#issuecomment-844581748 with the removal of the unnecessary sentence indicated in https://github.com/MPEGGroup/MIAF/issues/8#issuecomment-844591326
I would like to comment on a change to 7.3.6.7 "Clean aperture, rotation and mirror" in MIAF ISO/IEC 23000-22:2019 DAmd 2.
The file has Reference number ISO/IEC 23000-22:2019/DAM 2:2021(E). On page 6, it says:
I found several issues in this change while reviewing it. Please consider removing this change from Amendment 2 if there is not enough time to resolve the issues. It would be bad if an amendment needed to be amended.
Details:
The new text has some editorial issues:
horizOff
andvertOff
. (The original text has the same issue.)More important, the new text seems partially incorrect. Specifically, if "the horizontal and vertical cropped offsets" refers to
horizOff
andvertOff
, then it is incorrect to require that they shall be integers when the image is 4:0:0 or 4:4:4. The reason is thathorizOff
andvertOff
" may need to be half-values, for example, to crop 8x8 to 7x7.Also, I believe that
cleanApertureWidth
andcleanApertureHeight
should be allowed to be odd numbers, even when the image is 4:2:2 or 4:2:0.Finally, I think we should add the requiement that the leftmost pixel and the topmost line of the clean aperture shall be integers.
In summary, here are my proposed alternative requirements:
cleanApertureWidth
andcleanApertureHeight
shall be integers;horizOff
andvertOff
shall be integers or half-values[1];[1] For example, to crop 8x8 to 7x7,
horizOff
andvertOff
must be half-values. [2] Defined at the end of ISO/IEC 14496-12:2020, Section 12.1.4.1.Wan-Teh Chang wtc@google.com