w3c / png

Maintenance of the PNG specification
https://w3c.github.io/png/

MaxCLL and MaxFALL have no references #337

Closed svgeesus closed 1 year ago

svgeesus commented 1 year ago

Shouldn't we add a reference to CTA 861.3-A-2016 which is freely downloadable. It even has normative calculation methods in an annex.

fintelia commented 1 year ago

That annex is almost entirely just devoted to pseudo code for calculating averages and maximums. As far as how to calculate the light level values, the annex literally just says:

convert the pixel’s non-linear (R’,G’,B’) values to linear values (R,G,B) calibrated to cd/m2

My understanding is that the non-linear to linear conversion can be done based on the contents of the cICP, sRGB, or gAMA chunks (perhaps with a fallback to gamma=2.2 or sRGB if none are present).

But the proper strategy for then scaling the output from 0-1 range into a cd/m2 value is less clear. The mDCv chunk is the only one other than cLLi mentioning cd/m2, so presumably the range is sourced from there. But do you convert by linearly interpolating between min and max mastering display luminance? Or perhaps the minimum luminance is only informational and you actually just multiply by the max display luminance? And what if that other chunk isn't present?

michaeldsmith commented 1 year ago

Our open-access SMPTE article [1] "On the Calculation and Usage of HDR Static Content Metadata" has some good discussion of how to set MaxCLL and MaxFALL

[1] https://ieeexplore.ieee.org/document/9508136

michaeldsmith commented 1 year ago

In general, if you don't know how to set MaxCLL and MaxFALL, they should not be set. Default values aren't helpful to the metadata ecosystem. Any values that could be guessed at the file creation stage from other file properties could also likely be guessed in a decoding process from file properties. If a value must be set and the setter does not know the correct value, there is a special value, 0, that is reserved to mean "unknown". If a default value were used instead, then a receiver of a file containing the default value won't know whether that value is actually correct or whether the creator didn't know what value to use and fell back to the default. This creates an unreliable metadata ecosystem, which is avoided by using the value 0 to mean "unknown".

fintelia commented 1 year ago

For a PNG encoder, I agree that it is worth emphasizing that saying "unknown" is far better than making up values.

But for a PNG viewer, the plan is to say the contents of the cLLi chunk SHALL be used for tone mapping (#319). Which means that authors of decoders/viewers probably do need to know the meanings of the fields in the chunk if they want to follow that recommendation. But since the fields are defined in terms of how they're calculated, those authors need to understand the calculation process!

Even a short non-normative description and/or a link to an (ideally freely available) reference summarizing the full conversion process between integer pixel values and light levels could go a long way.

svgeesus commented 1 year ago

the proper strategy for then scaling the output from 0-1 range into a cd/m2 value is less clear.

It is only defined for color models that use an absolute luminance level. That means BT.2100 PQ (and Jzazbz and ICtCp, but those are not RGB models so can't be used in PNG anyway). So maybe we should be more explicit that these are for PQ content.

Most RGB models use relative luminance, and so they should not have MaxCLL and MaxFALL anyway.
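For PQ content, the step from a non-linear sample to cd/m2 can be sketched as below. This is a minimal illustration, not text from the PNG specification: the constants are the published SMPTE ST 2084 / BT.2100 PQ EOTF constants, while the function names `pq_eotf` and `sample_to_signal` are hypothetical, and the full-range integer normalization is an assumption.

```python
# PQ EOTF constants as published in SMPTE ST 2084 / BT.2100.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def sample_to_signal(sample: int, bit_depth: int) -> float:
    """Normalize an integer sample to [0, 1] (assumes full-range coding)."""
    return sample / ((1 << bit_depth) - 1)


def pq_eotf(signal: float) -> float:
    """Map a non-linear PQ signal in [0, 1] to an absolute level in cd/m2."""
    if not 0.0 <= signal <= 1.0:
        raise ValueError("PQ signal must lie in [0, 1]")
    p = signal ** (1 / M2)
    return 10000.0 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1 / M1)


if __name__ == "__main__":
    # A 16-bit sample at full scale corresponds to the PQ peak.
    print(pq_eotf(sample_to_signal(65535, 16)))
```

Because PQ is an absolute encoding, no mastering display range is needed for this step, which is why the scaling question above does not arise for PQ content.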

svgeesus commented 1 year ago

@michaeldsmith PNG has a bunch of informative references to tutorial material and your article would make an excellent addition to that!

svgeesus commented 1 year ago

But the proper strategy for then scaling the output from 0-1 range into a cd/m2 value is less clear. The mDCv chunk is the only one other than cLLi mentioning cd/m2, so presumably the range is sourced from there. But do you convert by linearly interpolating between min and max mastering display luminance? Or perhaps the minimum luminance is only informational and you actually just multiply by the max display luminance? And what if that other chunk isn't present?

This is valuable feedback and we should ensure that other readers of the PNG specification don't have to similarly try to read between the lines to figure out how to do this. So, for a PNG decoder:

A validator like PNGcheck could usefully give a warning if SDR or relative HDR content (BT.2100 HLG) contains cLLi.
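A check of this kind might look like the following sketch. It is not PNGcheck code; `clli_warnings` and its parameters are hypothetical. It assumes the validator has already collected the set of chunk types and, if a cICP chunk is present, its transfer function code point (in ITU-T H.273, 16 is PQ and 18 is HLG).

```python
from typing import Iterable, List, Optional

TRANSFER_PQ = 16   # H.273 transfer characteristics: SMPTE ST 2084 (PQ)
TRANSFER_HLG = 18  # H.273 transfer characteristics: ARIB STD-B67 (HLG)


def clli_warnings(chunk_types: Iterable[str],
                  cicp_transfer: Optional[int] = None) -> List[str]:
    """Return warnings about a cLLi chunk on content that is not PQ.

    chunk_types: chunk type names found in the file (hypothetical input).
    cicp_transfer: transfer function from cICP, or None if cICP is absent.
    """
    warnings: List[str] = []
    if "cLLi" not in set(chunk_types):
        return warnings
    if cicp_transfer is None:
        # Without cICP the image is treated here as SDR (an assumption).
        warnings.append("cLLi present but no cICP chunk signals PQ content")
    elif cicp_transfer == TRANSFER_HLG:
        warnings.append("cLLi present on HLG content, which is relative")
    elif cicp_transfer != TRANSFER_PQ:
        warnings.append("cLLi present on content that is not PQ")
    return warnings


if __name__ == "__main__":
    print(clli_warnings({"IHDR", "cICP", "cLLi", "IDAT"}, TRANSFER_HLG))
```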

OK let's do advice for a PNG encoder:

The spec has advice for when only one of MaxCLL or MaxFALL is known (write zero). I would like to check that knowing only one is actually useful. We should also say that if you don't know both, don't write the chunk at all rather than writing two zeroes.
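The encoder advice above could be sketched as follows. This is an illustration, not spec text: `clli_payload` is a hypothetical helper, and it assumes the cLLi payload is two big-endian unsigned 32-bit integers (MaxCLL, then MaxFALL) in units of 0.0001 cd/m2. Treat that layout as an assumption to verify against the specification.

```python
import struct
from typing import Optional

UNITS_PER_CD_M2 = 10000  # assumed cLLi unit: 0.0001 cd/m2


def clli_payload(max_cll: Optional[float],
                 max_fall: Optional[float]) -> Optional[bytes]:
    """Build a cLLi payload, or return None if the chunk should be omitted.

    None for an argument means "unknown". If both are unknown, no chunk
    is written at all; if only one is unknown, it is stored as 0.
    """
    if max_cll is None and max_fall is None:
        return None

    def encode(value: Optional[float]) -> int:
        return 0 if value is None else round(value * UNITS_PER_CD_M2)

    return struct.pack(">II", encode(max_cll), encode(max_fall))


if __name__ == "__main__":
    print(clli_payload(1000.0, None))  # MaxCLL known, MaxFALL unknown
    print(clli_payload(None, None))    # nothing known: omit the chunk
```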

digitaltvguy commented 1 year ago

Shouldn't we add a reference to CTA 861.3-A-2016 which is freely downloadable. It even has normative calculation methods in an annex.

Created Pull request to address #337 (see #339)

simontWork commented 1 year ago

@svgeesus, did you mean: "you must write a cICP chunk because iCCP can't usefully encode this"?

simontWork commented 1 year ago

Just checking: is a scenario possible where the encoder can't access every frame in the PNG to calculate MaxFALL and MaxCLL, e.g. a live video to PNG converter?

If it is, does this need the use of 0 for MaxFALL and MaxCLL mandated?

svgeesus commented 1 year ago

If it is, does this need the use of 0 for MaxFALL and MaxCLL mandated?

Is there a benefit in this rather than just not writing the chunk? Although perhaps, such a streaming encoder could go back and edit the values once the stream is ended.

simontWork commented 1 year ago

When discussing pixel luminance values in the specification, it is not obvious if the achromatic Y or the maximum of R, G or B for that pixel is meant.

svgeesus commented 1 year ago

In general luminance in the PNG spec should mean Y (calculated from linear-light R G B) as it does everywhere else in color science; and if we mean max(R' G' B') we should say so explicitly (and not call it luminance).

michaeldsmith commented 1 year ago

@michaeldsmith PNG has a bunch of informative references to tutorial material and your article would make an excellent addition to that!

@svgeesus - please feel free to reference the article

michaeldsmith commented 1 year ago

In general luminance in the PNG spec should mean Y (calculated from linear-light R G B) as it does everywhere else in color science; and if we mean max(R' G' B') we should say so explicitly (and not call it luminance).

@svgeesus - the MaxCLL and MaxFALL definitions use the max() operator over the linear-light values R G B, not the non-linear values R' G' B'. But since max() is robust to monotonically increasing non-linearities like the PQ EOTF, I believe it is equivalent to compute max(R,G,B) or max( PQ_EOTF(R'), PQ_EOTF(G'), PQ_EOTF(B') ) or PQ_EOTF( max(R',G',B') ).

Instead of calling the resulting value a "luminance" we called the result a "light level", since it is not a luminance except when the pixel lies on the gray axis (R=G=B). I think "MaxRGB light level" is the clearest term when talking about MaxCLL and MaxFALL values.
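As a rough sketch of the calculation described above: each pixel contributes its MaxRGB light level, MaxCLL is the largest such level over all frames, and MaxFALL is the largest per-frame average. The function `max_rgb_light_levels` and its input layout are hypothetical, and this is not the CTA 861.3 annex pseudocode. The PQ constants are those of SMPTE ST 2084.

```python
from typing import Iterable, Sequence, Tuple

M1, M2 = 2610 / 16384, 2523 / 4096 * 128
C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

Pixel = Tuple[float, float, float]  # non-linear PQ (R', G', B') in [0, 1]


def pq_eotf(signal: float) -> float:
    """PQ EOTF: non-linear signal in [0, 1] to light level in cd/m2."""
    p = signal ** (1 / M2)
    return 10000.0 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1 / M1)


def max_rgb_light_levels(frames: Iterable[Sequence[Pixel]]) -> Tuple[float, float]:
    """Return (MaxCLL, MaxFALL) in cd/m2 for a sequence of PQ frames."""
    max_cll = 0.0
    max_fall = 0.0
    for frame in frames:
        # The EOTF is monotonic, so it can be applied after max().
        levels = [pq_eotf(max(r, g, b)) for r, g, b in frame]
        if not levels:
            continue
        max_cll = max(max_cll, max(levels))
        max_fall = max(max_fall, sum(levels) / len(levels))
    return max_cll, max_fall


if __name__ == "__main__":
    frames = [[(0.0, 0.0, 0.0), (1.0, 0.2, 0.1)], [(0.5, 0.5, 0.5)] * 2]
    print(max_rgb_light_levels(frames))
```

Applying the EOTF once per pixel, after max(), is only a cost saving; applying it to each component first should give the same result up to floating-point rounding.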

@simontWork @svgeesus - What are the units of the result of PQ_EOTF(R'), PQ_EOTF(G') or PQ_EOTF(B')? I think it is a bit awkward to talk about.

In BT.2100, see Footnote 4b

Note 4b – In this Recommendation, when referring to the luminance of a single colour component (RD, GD, BD), it means the luminance of an equivalent achromatic signal with all three colour components having that same value.

In SMPTE ST2084, there is a whole section 4.3 about it


michaeldsmith commented 1 year ago

The spec has advice for when only one of MaxCLL or MaxFALL is known (write zero). I would like to check that knowing only one is actually useful. We should also say that if you don't know both, don't write the chunk at all rather than writing two zeroes.

I think knowing MaxCLL but not knowing MaxFALL is still useful, as the MaxCLL value can be used to set an upper bound for tonemapping input values. For example, see the last section of the paper that I linked to above, Figure 14 shows how MaxCLL can be used for tonemapping.
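One way a viewer might use MaxCLL as that upper bound is sketched below. This is not the method from the paper's Figure 14: the function names, the fallback peak and the deliberately naive linear compression are all assumptions made for illustration.

```python
PQ_PEAK_CD_M2 = 10000.0  # largest level PQ can encode


def tone_map_input_peak(max_cll: float,
                        fallback_peak: float = PQ_PEAK_CD_M2) -> float:
    """Pick the upper bound for tone-mapping input, in cd/m2.

    A MaxCLL of 0 means "unknown", so a fallback peak is used instead.
    """
    if max_cll > 0.0:
        return min(max_cll, PQ_PEAK_CD_M2)
    return fallback_peak


def tone_map(level: float, source_peak: float, display_peak: float) -> float:
    """Naive placeholder curve: pass through if the content fits the
    display, otherwise scale linearly so source_peak maps to display_peak."""
    if source_peak <= display_peak:
        return min(level, display_peak)
    return min(level, source_peak) * display_peak / source_peak


if __name__ == "__main__":
    peak = tone_map_input_peak(1000.0)    # MaxCLL known: 1000 cd/m2
    print(tone_map(1000.0, peak, 500.0))  # brightest pixel hits display peak
```

With a known MaxCLL the curve only has to cover the range the content actually uses, rather than the full 10000 cd/m2 range that PQ can encode.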


svgeesus commented 1 year ago

Fixed by https://github.com/w3c/PNG-spec/pull/351