MPEGGroup / MIAF

Official MPEG repository to discuss issues on MIAF (ISO/IEC 23000-22)
2 stars 0 forks source link

Change to 7.3.6.7 "Clean aperture, rotation and mirror" in MIAF ISO/IEC 23000-22:2019 DAmd 2 #8

Closed wantehchang closed 3 years ago

wantehchang commented 3 years ago

I would like to comment on a change to 7.3.6.7 "Clean aperture, rotation and mirror" in MIAF ISO/IEC 23000-22:2019 DAmd 2.

The file has Reference number ISO/IEC 23000-22:2019/DAM 2:2021(E). On page 6, it says:

In 7.3.6.7 “Clean aperture, rotation and mirror” replace The clean aperture (cropping) property may be associated with any image and shall be supported by the MIAF reader. The clean aperture property is restricted according to the chroma sampling format of the input image (4:4:4, 4:2:2:, 4:2:0, or 4:0:0); the cropping shall select an integer number of samples for all planes. In effect this means that: • when the image is 4:0:0 (monochrome) or 4:4:4, there is no restriction; • when the image is 4:2:2 the horizontal cropped offset and width shall be even numbers; • when the image is 4:2:0 both the horizontal and vertical cropped offsets and widths shall be even numbers. With The clean aperture (cropping) property may be associated with any image and shall be supported by the MIAF reader. The clean aperture property is restricted according to the chroma sampling format of the input image (4:4:4, 4:2:2:, 4:2:0, or 4:0:0) as follows: • when the image is 4:0:0 (monochrome) or 4:4:4, the horizontal and vertical cropped offsets and widths shall be integers; • when the image is 4:2:2 the horizontal cropped offset and width shall be even numbers and the vertical values shall be integers; • when the image is 4:2:0 both the horizontal and vertical cropped offsets and widths shall be even numbers.

I found several issues in this change while reviewing it. Please consider removing this change from Amendment 2 if there is not enough time to resolve the issues. It would be bad if an amendment needed to be amended.

Details:

The new text has some editorial issues:

More important, the new text seems partially incorrect. Specifically, if "the horizontal and vertical cropped offsets" refers to horizOff and vertOff, then it is incorrect to require that they shall be integers when the image is 4:0:0 or 4:4:4. The reason is that horizOff and vertOff" may need to be half-values, for example, to crop 8x8 to 7x7.

Also, I believe that cleanApertureWidth and cleanApertureHeight should be allowed to be odd numbers, even when the image is 4:2:2 or 4:2:0.

Finally, I think we should add the requiement that the leftmost pixel and the topmost line of the clean aperture shall be integers.

In summary, here are my proposed alternative requirements:

[1] For example, to crop 8x8 to 7x7, horizOff and vertOff must be half-values. [2] Defined at the end of ISO/IEC 14496-12:2020, Section 12.1.4.1.

Wan-Teh Chang wtc@google.com

dwsinger commented 3 years ago

More important, the new text seems partially incorrect. Specifically, if "the horizontal and vertical cropped offsets" refers to horizOff and vertOff, then it is incorrect to require that they shall be integers when the image is 4:0:0 or 4:4:4. The reason is that horizOff and vertOff" may need to be half-values, for example, to crop 8x8 to 7x7.

The original text had the same intention of disallowing this: "the cropping shall select an integer number of samples for all planes"

wantehchang commented 3 years ago

Hi David,

Thanks a lot for the reply. I suspect I misunderstood what the term "horizontal cropped offset" means. This term is used without definition in MIAF, and it is not used in HEIF or ISOBMFF. I would appreciate it if you could clarify it for me.

Please refer to ISOBMFF (ISO/IEC 14496-12:2020), Section 12.1.4.1, page 164. Does the term "horizontal cropped offset" in MIAF mean

  1. horizOff, or
  2. pcX - (cleanApertureWidth - 1)/2, i.e., the leftmost pixel of the clean aperture?

Thanks, Wan-Teh Chang

cconcolato commented 3 years ago

I did some archeology to dig where the issue came from. Here are the results:

In July 2019, contribution m49250 raised an issue saying:

Clause 7.3.6.7 about clean aperture property first states that cropping shall select an integer number of samples for all planes. In the bullet list it is then stated that there is no restrictions if image is 4:0:0 or 4:4:4.

I agree with this, there was a discrepancy between first part of the spec text at that time that said "cropping shall select an integer number of samples for all planes" and the second part "when the image is 4:0:0 (monochrome) or 4:4:4, there is no restriction;". But then, the contribution said:

As CleanApertureBox dimensions and offsets are fractions, it is proposed to emphasize that all values shall be integer numbers, also in case of 4:0:0 and 4:4:4 images.

This is wrong as @wantehchang's example demonstrates. You sometimes need half-pixels. The contribution ended in the "Technologies under Consideration document" N18689 which wrongly says:

Agreed, we rephrase to say that integers are used for 4:0:0 and 4:2:0

Then it carried over from meetings to meetings, until it was decided to move it to the WD of Amd2 N19272 in April 2020, and I traced it back to me providing this proposed text based on the above :( I don't remember where the changes to the other bullets came from. Note that the terms "horizontal cropped offset" existed already in the previous text.

I'm tempted to not be so explicit and simply say:

the cropping shall select an integer number of samples for all planes

without other details (i.e removing the bullet point list), but maybe that's not clear enough for validating purposes for example. In that case, I would say:

  • when the image is 4:0:0 (monochrome) or 4:4:4, cleanApertureWidth and cleanApertureHeight shall be integers; and horizOff (respectively vertOff) shall be an integer, unless cleanApertureWidth (respectively cleanApertureHeight) is odd in which case, when horizOff (respectively vertOff) is multiplied by 2, the result shall be an integer.

  • when the image is 4:2:2, cleanApertureHeight and vertOff shall follow the same constraints as in the 4:4:4 case. However,cleanApertureWidth shall be even; and horizOff shall be even, unless cleanApertureWidth is not a multiple of 4, in which case 'horizOff` shall be an integer.

  • when the image is 4:2:0, the constraints that apply to cleanApertureWidth (respectively horizOff) in the 4:2:2 case, apply to both cleanApertureHeight and cleanApertureWidth (respectively horizOff and vertOff).

Is this correct?

wantehchang commented 3 years ago

Hi Cyril,

Thank you very much for looking into this.

Dirk Farin, Joe Drago, and Jon Bauman have all worked out the exact conditions under which horizOff and vertOff need to be half-values. So they are more qualified than I am to review that part of your suggested wording.

Having said that, I think it would be better to write requirements on the leftmost pixel and topmost line of the clean aperture, instead of requirements on horizOff and vertOff. Let me explain why.

Ultimately we need the leftmost pixel and the topmost line of the clean aperture to be integers.

Note: ISOBMFF defines these two values using the following formulas:

In addition, if chroma is subsampled horizontally, the leftmost pixel of the clean aperture needs to be an even number. Similarly, if chroma is subsampled vertically, the topmost line of the clean aperture needs to be an even number. This is why I suggest we write requirements on the leftmost pixel and topmost line instead of horizOff and vertOff.

Note: Both Dirk Farin and I also noticed that cleanApertureWidth and cleanApertureHeight can be odd numbers when chroma is subsampled. (Similarly, output_width and output_height of an image grid can be odd numbers when chroma is subsampled.) But it is not necessary to address this issue in MIAF Amendment 2.

joedrago commented 3 years ago

I believe this snippet is incomplete:

unless cleanApertureWidth (respectively cleanApertureHeight) is odd

The image's dimensions are a part of the calculation of the aperture's center, so they must be taken into consideration here for even/odd. For example, a 5x5 image with a crop rectangle of 2x2 at a top/left coord of (1,1) would create clap offsets of -1/2. This is a legal, integer crop rect in which cleanApertureWidth and cleanApertureHeight are even, and yet the clap offsets are half-pixels.

I believe the correct check for whether or not horizOff and vertOff can be half-pixels (and still produce an integer crop rect) would be something like:

unless imageWidth+cleanApertureWidth (respectively imageHeight+cleanApertureHeight) is odd

baumanj commented 3 years ago

I agree with @wantehchang's idea of writing the requirements in terms of the clean aperture. To me, the clearest way to state the requirements for validity are (generalizing width and height to length since they apply similarly in each dimension):

With this update, other denominators of lengthOff should be invalid.

And then on top of that, the additional requirements for chroma subsampling still apply. I have nothing to add there.

cconcolato commented 3 years ago

Thanks for the feedback. I'm convinced that @wantehchang 's text is the best:

wantehchang commented 3 years ago

Joe, Jon: Thank you both for your comments.

Cyril: I have since realized that the second requirement "horizOff and vertOff shall be integers or half-values" is redundant. If we want a minimal set of requirements, we can omit "horizOff and vertOff shall be integers or half-values".

Also, the first requirement "cleanApertureWidth and cleanApertureHeight shall be integers" allows those two values to be odd numbers when the image is 4:2:2 or 4:2:0.

farindk commented 3 years ago

How do you think about rewording the first two items in @wantehchang 's proposal like this:

The background is that these values currently may be stored as arbitrary fractions (e.g. 4096/8192 instead of 1/2). When a reader has to do the computation on the fractions, it has to bring fractions to the same denominator (e.g. 66666/33333 + 4096/8192), which can easily lead to integer overflows. This could even lead to security issues when exploiting this to give negative or huge image sizes.

With the proposed restriction on the denominator, reader implementation could be simplified as the reader can easily validate the denominator to prevent overflows.

leo-barnes commented 3 years ago

@farindk

How do you think about rewording the first two items in @wantehchang 's proposal like this:

  • cleanApertureWidth and cleanApertureHeight shall be integers and shall be stored with the denominator equal to 1;
  • horizOff and vertOff shall be integers or half-values, and shall be stored with the denominator equal to 1 or 2;

The background is that these values currently may be stored as arbitrary fractions (e.g. 4096/8192 instead of 1/2). When a reader has to do the computation on the fractions, it has to bring fractions to the same denominator (e.g. 66666/33333 + 4096/8192), which can easily lead to integer overflows. This could even lead to security issues when exploiting this to give negative or huge image sizes.

With the proposed restriction on the denominator, reader implementation could be simplified as the reader can easily validate the denominator to prevent overflows.

While I agree in principle that this would be a nice simplifying restriction to have in the file, it will be a breaking change that would make existing files invalid and make existing MIAF writers potentially incorrect. We would need to investigate if this could cause problems for existing implementations before restricting it.

cconcolato commented 3 years ago

I understand the issue but we only want to fix the current amendment, with the same intent. We should not be introducing further restrictions at this stage.

dwsinger commented 3 years ago

Perhaps we should state the basic principle first, and then explore the consequences: the clap must select, on sample boundaries, an integer number of samples in all planes.

cconcolato commented 3 years ago

Perhaps we should state the basic principle first, and then explore the consequences: the clap must select, on sample boundaries, an integer number of samples in all planes.

That's what we had initially (see my comment for details), except that the detailed part after the basic principle was wrong.

cconcolato commented 3 years ago

Thanks all. This has been integrated in FDAM2 as indicated in https://github.com/MPEGGroup/MIAF/issues/8#issuecomment-844581748 with the removal of the unnecessary sentence indicated in https://github.com/MPEGGroup/MIAF/issues/8#issuecomment-844591326