Closed dzung-hoang closed 7 years ago
The problem with the three-channel method of Rec.2100 is that the inverse equation has no analytical solution.
Rd = a*Ys^(c-1)*Rs+b
Gd = a*Ys^(c-1)*Gs+b
Bd = a*Ys^(c-1)*Bs+b
Ys = Kr*Rs+Kg*Gs+Kb*Bs
Solving this non-linear system for Rs/Gs/Bs requires approximating Ys, since Ys is a function of Rs/Gs/Bs and not Rd/Gd/Bd. In my experiments, estimating Ys from the non-linear RGB components requires at least 3 iterations to converge at single precision.
As a lesser issue, the full method cannot be implemented with 1-D LUTs and requires more complex code to implement (and optimize).
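For reference, the forward system above can be sketched in a few lines. This is only an illustration, not the library's code: the function name `hlg_ootf` and the default parameter values are assumptions, and Kr/Kg/Kb are taken to be the BT.2020 luma coefficients. It also shows why 1-D LUTs are not enough: every output channel depends on Ys, which mixes all three input channels.

```python
# Assumed: BT.2020 luma coefficients stand in for Kr, Kg, Kb.
KR, KG, KB = 0.2627, 0.6780, 0.0593


def hlg_ootf(rs, gs, bs, a=1.0, b=0.0, c=1.2):
    """Forward three-channel OOTF: scene light (Rs, Gs, Bs) -> display light.

    a, b, c follow the notation of the equations above (gain, black
    level, gamma). Inputs are assumed to be non-negative.
    """
    ys = KR * rs + KG * gs + KB * bs
    # The same luminance-derived factor scales all three channels, so the
    # ratios between channels are preserved. 0.0 ** (c - 1) is 0.0 for c > 1.
    scale = a * ys ** (c - 1.0)
    return (scale * rs + b, scale * gs + b, scale * bs + b)
```

For a neutral grey (Rs = Gs = Bs) this reduces to the per-channel form a*Rs^c+b, which is where the two methods agree.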
EDIT: Note 5d of BT.2100 mentions the per-channel gamma as a "legacy display" system.
@dzung-hoang Do you have any further remarks on this subject? If the true Rec.2100 HLG transform is required, it can be implemented.
The point you raised about being able to compute the HLG inverse EOTF is valid. I haven't had time to look into exactly how to do it. Annex 2 of BT.2100 states:
The HLG inverse EOTF is the HLG inverse OOTF followed by the HLG OETF. For the HLG inverse OOTF, black level should be zero, and the gamma parameter is determined by the peak level of the PQ signal.
The HLG OETF is easy to compute, but the HLG inverse OOTF and the determination of the gamma parameter are not obvious, as you pointed out.
This paper gives step-by-step instructions.
Do you need help to implement the BBC process?
I am reviewing the provided document. Can you provide reference images in linear (display) light and HLG for verification?
I was able to confirm the equation Yd = Ys^c and will be implementing the process.
EDIT: But it is not clear to me how to determine the "nominal peak luminance" for calculating gamma. EDIT2: Maybe I can assume the HLG peak luminance is 12 times the SDR peak luminance.
The nominal peak luminance as per BT.2100 is 1000 cd/m², which as far as I can tell is what should be assumed when objective conversions are needed or in the absence of an actual display device. (This means the gamma value is 1.2)
Interestingly, this is not 12 times the SDR peak luminance (100 cd/m²), which is what you would expect.
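The relation between nominal peak luminance and gamma given in BT.2100 is gamma = 1.2 + 0.42 * log10(Lw / 1000). A minimal sketch follows; the function name is made up for illustration.

```python
import math


def hlg_system_gamma(peak_cd_m2):
    """HLG system gamma for a display with nominal peak luminance Lw (cd/m²)."""
    return 1.2 + 0.42 * math.log10(peak_cd_m2 / 1000.0)


# At the 1000 cd/m² reference peak the log term vanishes, leaving 1.2.
reference_gamma = hlg_system_gamma(1000.0)
```

Brighter displays get a larger gamma and dimmer ones a smaller gamma, which is why the same HLG signal does not map to one fixed display-light curve.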
@dzung-hoang the provided PDF link is a 404, do you have a recent version or can you upload the PDF elsewhere? I am also interested in implementing the HLG inverse OOTF.
I've looked into this a bit more, and it seems the 50% code value for HLG does not map to 1/12th of the peak luminance at the display. After applying the gamma function, it is actually 5% of peak luminance. I am currently reconsidering how SDR-HLG conversions should work in z.lib. Since HLG is also a relative function, perhaps SDR content should remap to occupy the entire HLG range.
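The 5% figure can be reproduced with a short calculation. The sketch below assumes a neutral grey signal (so Ys equals the per-channel scene value), zero black level, and gamma 1.2; the function name is illustrative only.

```python
import math

# HLG OETF constants from BT.2100.
A = 0.17883277
B = 1.0 - 4.0 * A
C = 0.5 - A * math.log(4.0 * A)


def hlg_inverse_oetf(ep):
    """Map a non-linear HLG signal E' in [0, 1] to normalised scene light."""
    if ep <= 0.5:
        return ep * ep / 3.0
    return (math.exp((ep - C) / A) + B) / 12.0


scene = hlg_inverse_oetf(0.5)  # 1/12 of scene peak
display = scene ** 1.2         # roughly 0.05 of display peak after the OOTF
```

So the 1/12 relationship holds only in scene light; the gamma of the OOTF pushes the 50% code value down to about 5% of the display peak.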
Since HLG is also a relative function, perhaps SDR content should remap to occupy the entire HLG range.
What would the use case of such a conversion be? You're essentially proposing mapping the SDR 1.0 to HLG 1.0 which is by definition “brightest value the display can present”, but that only ever makes sense for SDR content when your display is actually a standard range display. But if you're assuming your display is standard range, then it doesn't make sense to be sending it HLG to begin with, and you'd gain nothing from this conversion.
As a further downside, if somebody then naively takes the output of this conversion and displays it on an actual HDR display (expecting such a conversion to be sensible), you would burn out their retinas (and maybe even their backlights) because a pure white screen (e.g. a title screen or fade-to-white) would now display at 1000 cd/m² everywhere, which is never the case for actual HDR content and certainly not what these displays are intended to do.
I think the only sane way to do conversions between SDR and HDR is to try and ensure they would generate the same output (i.e., if your clip shows up as 80 cd/m² on an SDR display, then it should show up as 80 cd/m² on an HDR display). Unfortunately, for HLG, this requires knowing the exact peak of the display in advance in order for this to round-trip. It also technically relies on the assumption that all SDR displays are calibrated to some fixed value like 100 cd/m², which is simply not the case. (Or, in the absence of this assumption, on knowing what brightness the target display is calibrated to, so you can do the inverse mapping exactly.) And of course, it's further complicated by the fact that HLG is not easily reversible.
tl;dr there's no really objective way to convert SDR to HDR and it always relies on knowing more metadata about your target environment. So maybe the correct conclusion is that SDR->HDR conversion should not be allowed?..
The underlying issue is really that HLG has no well-defined diffuse white when display-referred. The 50% code-value where it transitions from a power function to a logarithm is only well-defined at the scene. We can try to match a physical luminance, but then the conversion fails when the same content is presented on a system with a different peak brightness, and hence a different gamma function. Alternatively, we can use a scene-referred conversion, but then we get the result that the same content gets darker if used on a brighter display. Perhaps this is the intended behaviour of HLG?
The underlying issue is really that HLG has no well-defined diffuse white when display-referred.
That's true. Of course, PQ has the exact same problem.
We can try to match a physical luminance, but then the conversion fails when the same content is presented on a system with a different peak brightness, and hence a different gamma function.
Yep, basically HLG is not really a transfer function as much as it is a family of transfer functions indexed by the display's parameters (sort of like BT.1886, which has the exact same issue)
Alternatively, we can use a scene-referred conversion, but then we get the result that the same content gets darker if used on a brighter display.
In other words, instead of using the EOTF as the basis for conversion (E_sig --EOTF--> O_display --EOTF^-1--> E'_sig), you could use the OETF as the basis of conversion (E_sig --OETF^-1--> O_scene --OETF--> E'_sig).
This has the benefit that the HLG OETF is well-defined, and the PQ OETF is well-defined (note: it includes the OOTF which is basically equivalent to the BT.1886/BT.709 interaction for a monitor at 100 cd/m²), and as a bonus, it also has a well-defined form for BT.709/BT.2020 instead of having to rely on the implicit BT.1886.
Unfortunately, while this path of conversion sounds nice on paper and preserves the round-tripping properties, it means that converting from one curve to another and then displaying the result on a non-ideal / non-reference monitor will end up looking different for each curve. More importantly, artist adjustments to the OOTF during grading will not be preserved as intended, since these are assumed in practice to be done as part of the EOTF.
Perhaps this is the intended behaviour of HLG?
Well, the only really intended behavior of HLG is to “bake” a naive form of (poor quality) tone mapping into the transfer function to make it both backwards compatible with SDR displays and easy to implement on HDR displays of different capabilities. It seems pretty evident that the developers of HLG were perfectly willing to give up the ability to “reverse” HLG in order to accomplish this goal. So I think in practice, we probably just have to respect this limitation and not try to upconvert SDR->HDR unless we know the target display's peak.
Incidentally, here are the details of such an “OETF-based” conversion scheme:
This 59.5208 figure comes from the fact that the EOTF[OETF[1/59.5208]] = 100 cd/m² (using the definition of OETF based on the range 0-1 as given in the spec), which we assume maps to the same value as 1.0 in SDR/HLG systems. (More specifically, this constant shows up in the OOTF definition: It's multiplied into the incoming signal to scale it up to the value range of the BT.709/BT.1886 functions)
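The role of the constant can be checked numerically against the PQ reference OOTF from BT.2100 (a BT.709-style OETF followed by a BT.1886-style EOTF). The sketch below is illustrative; the function name is an assumption.

```python
def pq_ootf(e):
    """PQ reference OOTF: normalised scene light E in [0, 1] -> cd/m²."""
    if e > 0.0003024:
        # BT.709-style curve applied to the scaled-up signal.
        ep = 1.099 * (59.5208 * e) ** 0.45 - 0.099
    else:
        ep = 267.84 * e
    # BT.1886-style display function at a 100 cd/m² reference white.
    return 100.0 * ep ** 2.4


sdr_white = pq_ootf(1.0 / 59.5208)  # approximately 100 cd/m²
pq_peak = pq_ootf(1.0)              # approximately 10000 cd/m²
```

At E = 1/59.5208 the scaled signal is 1.0, the BT.709-style term evaluates to 1.0, and the output lands on 100 cd/m².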
The problem with the three-channel method of Rec.2100 is that the inverse equation has no analytical solution.
Rd = a*Ys^(c-1)*Rs+b
Gd = a*Ys^(c-1)*Gs+b
Bd = a*Ys^(c-1)*Bs+b
Ys = Kr*Rs+Kg*Gs+Kb*Bs
I think we can use the following method. Let us denote
R1 = (Rd - b) / a
G1 = (Gd - b) / a
B1 = (Bd - b) / a
Y1 = Kr*R1 + Kg*G1 + Kb*B1
Let Y2 = Y1^((c - 1) / c). In this case
Rs = R1 / Y2
Gs = G1 / Y2
Bs = B1 / Y2
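The method above can be tried out with a quick round-trip check. This is a sketch under assumptions: the function names are invented, Kr/Kg/Kb are taken to be the BT.2020 luma coefficients, and the guard for Y1 <= 0 is an addition that is not part of the proposed method.

```python
KR, KG, KB = 0.2627, 0.6780, 0.0593  # assumed BT.2020 luma coefficients


def ootf(rs, gs, bs, a=1.0, b=0.0, c=1.2):
    """Forward three-channel OOTF (scene light -> display light)."""
    ys = KR * rs + KG * gs + KB * bs
    scale = a * ys ** (c - 1.0)
    return (scale * rs + b, scale * gs + b, scale * bs + b)


def inverse_ootf(rd, gd, bd, a=1.0, b=0.0, c=1.2):
    """Closed-form inverse: display light -> scene light."""
    r1, g1, b1 = (rd - b) / a, (gd - b) / a, (bd - b) / a
    y1 = KR * r1 + KG * g1 + KB * b1
    if y1 <= 0.0:
        # Added guard (not in the method above): avoids dividing by zero at black.
        return (0.0, 0.0, 0.0)
    y2 = y1 ** ((c - 1.0) / c)
    return (r1 / y2, g1 / y2, b1 / y2)


scene = (0.8, 0.2, 0.1)
recovered = inverse_ootf(*ootf(*scene, a=1000.0), a=1000.0)
```

Here `recovered` matches `scene` up to floating-point rounding, with no iteration involved.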
@rusxg Can you prove that this is the inverse? It seems to succeed on unit tests, but I'm struggling to understand how taking the dot product to the power (c - 1) / c somehow magically multiplies out the coefficient Ys^(c-1).
@haasn I'll try.
From Rd = a*Ys^(c-1)*Rs+b we may express
Rs = (Rd - b) / a / Ys^(c-1) = R1 / Ys^(c-1). On the other hand, Rs = R1 / Y2. So our goal is to prove that Ys^(c - 1) = Y2.
Ys = Kr*Rs + Kg*Gs + Kb*Bs = Kr*R1 / Y2 + Kg*G1 / Y2 + Kb*B1 / Y2 = (Kr*R1 + Kg*G1 + Kb*B1) / Y2 = Y1 / Y2
Ys^(c - 1) = (Y1 / Y2)^(c - 1) = Y1^(c - 1) / Y2^(c - 1) = Y1^(c - 1) / (Y1^((c - 1) / c))^(c - 1) = Y1^(c - 1 - (c - 1)^2 / c) = Y1^((c^2 - c - c^2 + 2c - 1) / c) = Y1^((c - 1) / c) = Y2
@rusxg Can you repost your proof with proper formatting (perhaps as an image)? I am interested in your findings.
@sekrit-twc I'm not the best TeX writer, but the attached doc will probably make the text more readable: HLG OOTF Inverse - main.pdf
@rusxg In the last step, you seem to expand (c - 1)² as c²+2c-1, shouldn't it be c²-2c+1?
Oh, never mind, the sign got flipped because of the - in front of the (c-1)².
@rusxg 's math seems to work out. I created a PR for this with the following decisions:
(1) HLG will be treated as an absolute transfer function based on 1000 cd/m^2 in the display-referred case. (2) The 3-channel method will only be used when fast gamma operations are disabled.
Need to find test content to evaluate this.
It seems the most recent edition of BT.2100 confirms the inverse equation: https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.2100-1-201706-I!!PDF-E.pdf
For HLG (ARIB-STD-B67) transfer function, you did not implement BT.2100 OOTF when the scene_referred flag is false. Instead, you use arib_b67_eotf() with the comment "Applies a per-channel correction instead of the iterative method in Rec.2100."
Can you justify this choice? Can you cite a reference for the per-channel correction? The "iterative method" that you refer to is not really iterative in my view. My interpretation is that BT.2100 recommends first converting R', G', B' to R_S, G_S, B_S and then converting to R_D, G_D, B_D. This makes the conversion truly display-referred, which is the opposite of scene-referred. The key difference between scene-referred and display-referred is that Y_S is used to model the non-linear display processing intended by HLG. This intended non-linearity of HLG-compatible HDR displays is there to achieve partial backwards compatibility with SDR displays.
In Annex 2 of BT.2100, it is explicitly stated that conversions between HLG and PQ, in either direction, need to go through display light and not scene light.