mpv-player / mpv

🎥 Command line video player
https://mpv.io

gAMA png chunk for BT.1886 screenshots incompatible outside Apple ecosystem #13438

Open mightyhuhn opened 7 months ago

mightyhuhn commented 7 months ago

Important Information

mpv version: mpv-x86_64-v3-20231203-git-f551a9d
Windows version: 10.0.19044 Build 19044
Source of the mpv binary: updater.bat
GPU model, driver and version: 4060 551.23
[image]

Reproduction steps

Take a screenshot with vo=gpu and see a sane result like this: [image]

Switch to vo=gpu-next and see things like a gamma of 1.961: [image]

Expected behavior

The same metadata.

Actual behavior

gpu-next writes an odd gamma value

Log file

output.txt

If a program actually takes these numbers seriously, the results could be catastrophic.

ghost commented 7 months ago

--screenshot-tag-colorspace=no

mightyhuhn commented 7 months ago

Yes, that does remove the fields, but that doesn't make the default setting or the difference between the two outputs "sane".

How does it even come up with 1.961? Is that the effective gamma of BT.1886 at a contrast ratio (CR) of 1000? And if that's the case, what does it have to do with a screenshot?

ghost commented 7 months ago

The "1.961" value is an approximation of a BT.1886 curve which is the EOTF that mpv defaults to. If you open the images in mpv and check the info screen for each of those images, you'll see that the image that was generated with vo_gpu_next is tagged as Transfer: bt.1886 and the vo_gpu generated image is tagged as Transfer: sRGB.

vo_gpu basically just converts it to sRGB. If you want to convert pngs to sRGB in vo_gpu_next, you'll have to disable colorspace tagging, which is mostly important for preserving HDR in screenshots without tonemapping them.

mightyhuhn commented 7 months ago

So it's intentionally doing it wrong?

If this PNG is really supposed to be BT.1886, then there are only two possible values for gamma: the correct one, BT.1886 itself, which is most likely not possible because the field needs to be a number, or the alternative, 2.4. Everything else is wrong in this case.

The next issue is that effective gamma != gamma, meaning the number is out of context.

Let's now assume I have a BT.1886-calibrated device with a CR of 1000 here, and I take this 1.961 gamma seriously. To display the image "correctly" I now need to convert it to 2.4, while the correct response is to do nothing, because BT.1886 is not 1.961 and has nothing to do with it. A screenshot for a BT.1886 device is 2.4 and nothing else; whether the device has a CR of 100 or infinity doesn't matter. And the screenshot is not changed in terms of transfer, so it is 2.4.

your manual agrees:

auto
    Disable any adaptation, except for atypical transfers. Specifically, HDR or linear light source material gets automatically converted to gamma 2.2, while SDR content is not touched. (default)
bt.1886
    ITU-R BT.1886 curve (assuming infinite contrast)

So "1.961" is just nonsense.

vo_gpu writing 2.2 for sRGB also doesn't make sense, because gamma 2.2 and sRGB are mutually exclusive. sRGB is the BT.709 primaries (ignoring a lot of the sRGB details here) with a special transfer: a linear segment up to 0.04045 followed by a gamma 2.4 segment. That gives an effective gamma of about 2.2, but that doesn't mean 2.2 is the gamma. See above.
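The piecewise sRGB curve described here can be sketched in Python to see how it differs from a pure power law. This is only an illustration: the function names are made up for this example, and the constants are the standard sRGB (IEC 61966-2-1) decoding constants.

```python
import math

def srgb_eotf(v: float) -> float:
    """Decode an sRGB-encoded value in [0, 1] to linear light."""
    if v <= 0.04045:
        return v / 12.92                    # linear segment near black
    return ((v + 0.055) / 1.055) ** 2.4     # power segment, exponent 2.4

def effective_gamma(v: float) -> float:
    """Exponent g such that v ** g equals srgb_eotf(v), for 0 < v < 1."""
    return math.log(srgb_eotf(v)) / math.log(v)

for v in (0.1, 0.25, 0.5, 0.75):
    print(f"v={v:.2f}  sRGB={srgb_eotf(v):.5f}  pure 2.2={v ** 2.2:.5f}  "
          f"effective gamma={effective_gamma(v):.3f}")
```

The effective exponent changes with the signal level and lands close to 2.2 around mid-gray, which is why 2.2 gets used as a single-number stand-in even though the formula itself only contains 2.4.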

The vo_gpu output is not total nonsense, because it actually changes the image, though I have not checked it in detail.

Personal opinion: it shouldn't change the transfer curve by default, but the image is not "wrong", except for the 2.2 metadata, I guess.

Traneptora commented 7 months ago

> so it's intentionally doing it wrong?

The metadata is not incorrect. You need to look at the chunks themselves. vo_gpu converts the pixel data to sRGB, and it writes an sRGB chunk. The gamma tag of 2.2 is from the PNG gAMA chunk, which exists solely as a fallback for PNG viewers that don't understand sRGB. It isn't used by any viewer that understands sRGB, and the specification requires that it be ignored by viewers that understand sRGB.

The same is true about the vo_gpu_next screenshot, although it doesn't convert to sRGB first. Instead, it writes a cICP chunk, which contains the color data, and the transfer function, which in this case is BT.709. It also writes a gAMA chunk, as a fallback, which is also required to be ignored by any viewer that understands cICP.

The way it comes up with 1.961 is it's an approximation of BT.709. See the table: https://github.com/FFmpeg/FFmpeg/blob/3372876888db8bc8dd27350549654d11d5bb40a6/libavutil/csp.c#L135

If I had to guess, your metadata viewer is simply reporting the number in the gAMA tag without providing important information like "this will be ignored because sRGB, cICP, or iCCP are present in the file."

Your metadata viewer may also simply not understand cICP as it was a relatively recent addition to the PNG specification.
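To check which color chunks a screenshot actually carries, the file can be walked chunk by chunk. The following is a minimal sketch, not a full PNG reader: the function names are invented for this example, CRCs are not verified, and the precedence order among cICP, iCCP and sRGB is an assumption based on a reading of the PNG specification (the thread itself only states that gAMA is the fallback).

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def list_chunks(data: bytes):
    """Yield (type, payload) for every chunk in an in-memory PNG file."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        payload = data[pos + 8:pos + 8 + length]
        yield ctype.decode("ascii"), payload
        pos += 12 + length  # length + type + payload + CRC
        if ctype == b"IEND":
            break

def describe_color_chunks(data: bytes) -> dict:
    """Collect color-related chunks and note which one should win."""
    found = {}
    for ctype, payload in list_chunks(data):
        if ctype == "gAMA":
            found["gAMA"] = struct.unpack(">I", payload)[0]  # stored integer
        elif ctype == "cICP":
            # primaries, transfer, matrix, full-range flag (H.273 code points)
            found["cICP"] = tuple(payload[:4])
        elif ctype == "sRGB":
            found["sRGB"] = payload[0]  # rendering intent
        elif ctype == "iCCP":
            found["iCCP"] = True
    # Assumed precedence: gAMA is consulted only if nothing else is present.
    for name in ("cICP", "iCCP", "sRGB", "gAMA"):
        if name in found:
            found["effective"] = name
            break
    return found
```

For a file on disk, something like `describe_color_chunks(open("shot.png", "rb").read())` would do, where `shot.png` is a placeholder name. A metadata viewer that prints only the `gAMA` entry without the `effective` one is giving exactly the misleading picture described above.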

mightyhuhn commented 7 months ago

which is all fine and well.

But that doesn't change that this number is wrong. It's a gamma field, not an approximation field. And yes, an approximation is the closest you can get to the real transfer for sRGB and such, I get it, but here we have the real number...

BT.709 gamma is 2.22. There is also a camera transfer similar to sRGB (which is not used for video). The old spec for video is 2.2, with 2.4 for dark rooms. Then came 2.4, and now BT.1886, which is still 2.4; only the display will change that, according to its own capability.

If you throw a gamma 2.4 image at a BT.1886-calibrated display, you will get accurate BT.1886 no matter the CR. If I take that 1.961 and correct it to my display, I get garbage.
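To make that concrete: a viewer that falls back to gAMA re-encodes each sample with an exponent of file gamma divided by display gamma (the PNG specification expresses the same thing as 1 / (gAMA × display exponent)). A minimal sketch, with a function name invented for illustration:

```python
def fallback_correct(sample: float, file_gamma: float, display_gamma: float) -> float:
    """Re-encode a [0, 1] sample so that a display with `display_gamma`
    shows it as if it had been authored for `file_gamma`."""
    return sample ** (file_gamma / display_gamma)

# The same mid-gray sample on a gamma 2.4 display, under two different tags.
print(fallback_correct(0.5, 2.4, 2.4))     # tagged 2.4: sample is left alone
print(fallback_correct(0.5, 1.961, 2.4))   # tagged 1.961: sample is raised
```

With the 1.961 tag the sample value goes up, so the image is displayed brighter than on a player that leaves the pixels untouched.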

mightyhuhn commented 7 months ago

OK, this is getting really out of hand: [image]

Is the player just guessing? https://www.color.org/chardata/rgb/bt601.xalter

This DVD from 2006 is reported with BT.1886 gamma, which is from 2011, based on a BT.601 flag.

ghost commented 7 months ago

BT.1886 is the correct EOTF for BT.601, BT.709 and BT.2020

mightyhuhn commented 7 months ago

Just because you can use this EOTF doesn't mean it is the one it was mastered with... And just because BT.709 and BT.601 had the same EOTF at one point doesn't mean that changing the BT.709 EOTF also changes the EOTF of BT.601.

Here is the spec: https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.1886-0-201103-I!!PDF-E.pdf

Where is 1.961 in there?

There is this in there, an OETF, not an EOTF:

$$
V = \begin{cases}
1.099\,L^{0.45} - 0.099 & \text{for } 1 \ge L \ge 0.018 \\
4.500\,L & \text{for } 0.018 > L \ge 0
\end{cases}
$$

where:
L: luminance of the image, 0 ≤ L ≤ 1
V: corresponding electrical signal

Edit: I finally found the BS number: https://en.wikipedia.org/wiki/Rec._709#Transfer_characteristics
"Rec. 709 OETF is as follows, close to 1/1.9 – 1/2.0 pure gamma"

And that is used as an EOTF...
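The OETF from the spec can be evaluated directly to see what single exponent would undo it at a given level. This is a rough sketch with hypothetical function names; it only illustrates where a figure in the 1.9 to 2.0 range comes from and does not claim to reproduce how FFmpeg arrived at 1.961.

```python
import math

def bt709_oetf(light: float) -> float:
    """BT.709 opto-electronic transfer: scene light L in [0, 1] -> signal V."""
    if light < 0.018:
        return 4.5 * light
    return 1.099 * light ** 0.45 - 0.099

def equivalent_decoding_gamma(light: float) -> float:
    """Exponent g such that bt709_oetf(light) ** g == light, for 0 < light < 1."""
    return math.log(light) / math.log(bt709_oetf(light))

for light in (0.2, 0.5, 0.8):
    print(f"L={light:.1f}  V={bt709_oetf(light):.4f}  "
          f"gamma={equivalent_decoding_gamma(light):.3f}")
```

The exponent drifts with the light level and sits roughly between 1.9 and 2.0 for mid-tones, so any single number is a compromise over the whole curve.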

ghost commented 7 months ago

> just because you can use this EOTF doesn't mean it is the one it was mastered with...

The transfer characteristics for the BT.601 spec are the same transfer characteristics as BT.709. Anything else is either historic or outside of spec. If they share the same transfer characteristics, it only makes sense to decode both of them with BT.1886.

https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.709-6-201506-I!!PDF-E.pdf
https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.601-7-201103-I!!PDF-E.pdf
https://www.itu.int/rec/T-REC-H.273-202107-S

> Edit: I finally found the BS number: https://en.wikipedia.org/wiki/Rec._709#Transfer_characteristics
> "Rec. 709 OETF is as follows, close to 1/1.9 – 1/2.0 pure gamma"
>
> And that is used as an EOTF...

As mentioned before in this thread by @Traneptora , the 1.961 value is a simple gamma curve approximation of the Rec. 709 encoding curve, which is what is used for decoding the video. This is essentially what BT.1886 is. It just describes a reference curve for decoding content so that it looks as close as possible to the encoded source (which in this case means you're trying to get as close as possible to the film grade as captured by the camera).

Anyways, I'm not really sure what any of this has to do with the original issue. If you want mpv to convert pngs to sRGB like with vo_gpu (and many other video players), the option already exists.

mightyhuhn commented 7 months ago

> > just because you can use this EOTF doesn't mean it is the one it was mastered with...
>
> The transfer characteristics for the BT.601 spec are the same transfer characteristics as BT.709. Anything else is either historic or outside of spec. If they share the same transfer characteristics, it only makes sense to decode both of them with BT.1886.

Which is wrong, because they don't have an EOTF; they have the same OETF, which is completely irrelevant. BT.709 got an official EOTF with BT.1886. Before that it was the wild west, and later calibrators defined 2.4 for a batcave.

> https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.709-6-201506-I!!PDF-E.pdf
> https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.601-7-201103-I!!PDF-E.pdf
> https://www.itu.int/rec/T-REC-H.273-202107-S

> > Edit: I finally found the BS number: https://en.wikipedia.org/wiki/Rec._709#Transfer_characteristics
> > "Rec. 709 OETF is as follows, close to 1/1.9 – 1/2.0 pure gamma"
> > And that is used as an EOTF...
>
> As mentioned before in this thread by @Traneptora , the 1.961 value is a simple gamma curve approximation of the Rec. 709 encoding curve, which is what is used for decoding the video. This is essentially what BT.1886 is. It just describes a reference curve for decoding content so that it looks as close as possible to the encoded source (which in this case means you're trying to get as close as possible to the film grade as captured by the camera).

The OETF is not how a video is encoded; it is how a camera signal is to be interpreted (optical to electrical, and we are already electrical). It can also be linear, which alone should tell you to ignore it. For video playback it is irrelevant, because we need an EOTF, not an OETF. No gamma is used for decoding a video image. The decoded image expects you to have a certain gamma at your display device, which is not 1.961 in any spec. With BT.1886 it is 2.4 on a perfect-black device and lower on others, but both get the same 2.4 image. In other words: you decode, upscale chroma to 4:4:4, do the proper YCbCr-to-RGB conversion, and you're done. Gamma is not a variable at this point.

> Anyways, I'm not really sure what any of this has to do with the original issue. If you want mpv to convert pngs to sRGB like with vo_gpu (and many other video players), the option already exists.

To my knowledge, no video player other than mpv converts gamma to sRGB, and no video player will create a PNG and write a gamma of 1.961 into it. That only happens because the difference between OETF and EOTF is completely lost.

The original issue is 1.961, because it does not exist. It is not a video gamma, it is a camera gamma; please understand the difference. Mastering studios master using an EOTF, not an OETF.

I'm really lost here that I even have to say that SDR video uses a pure gamma curve, usually 2.4, and that BT.1886 is 2.4. You can prove that yourself: what does the math say when you have perfect black?

Traneptora commented 7 months ago

The PNG specification recommends that PNG writers write gAMA chunks even if they write other color chunks that override it, as fallback metadata for poorly color-managed PNG viewers. The actual file is tagged correctly with a cICP chunk, which overrides the fallback gAMA chunk.

It's also worth mentioning that 1.961 (and also, 2.2) are not encoded in the file. The reciprocal of these numbers is encoded, i.e. when your metadata viewer reports "gamma = 2.2" it's actually reading a gAMA chunk with 0.45455.
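As a small worked example of that reciprocal encoding: the gAMA chunk stores the value multiplied by 100000 as an integer. The helper names below are made up for this sketch.

```python
def gamma_to_gama_field(viewer_gamma: float) -> int:
    """Integer stored in a PNG gAMA chunk for a gamma as viewers report it.

    The chunk holds the reciprocal of that number, scaled by 100000.
    """
    return round(100000 / viewer_gamma)

def gama_field_to_gamma(field: int) -> float:
    """Number a metadata viewer typically shows for a stored gAMA integer."""
    return 100000 / field

print(gamma_to_gama_field(2.2))     # 45455
print(gamma_to_gama_field(1.961))   # 50994
```

So a viewer showing "gamma = 1.961" is reading a stored integer of about 50994, i.e. a gAMA value of roughly 0.50994.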

You seem so fixated on the gAMA chunk saying it's "wrong" when the spec literally tells you to write it anyway even if there's other color chunks that override it.

mightyhuhn commented 7 months ago

You write an OETF into an EOTF field, the gAMA, and say it is fine because we also write the cICP, where you write the EOTF and not the OETF, which is BT.1886, which is 2.4.

Or in other words: "can't read the cICP 2.4? Please use 1.961." Which is just wrong.

1.961 does not exist; it's made up. Let me repeat yet again: it is an opto-electronic transfer. Where is the light sensor to get anything "opto"? It's not even that; it is an approximation of it, while video uses pure gamma. So yes, please write the gAMA, but don't write made-up stuff in it. There is no camera involved here.

Traneptora commented 7 months ago

> you write a OETF into a EOTF field the gAMA and say it is fine

Because the spec tells you to write 45455, not 2.2. If you think this is wrong you should take it up with the PNG specification.

> because we also write the cICP

Indeed, the PNG specification tells you that cICP overrides gAMA.

> which is bt1886 which is 2.4.

No, neither BT.709 nor BT.1886 are pure gamma curves, and can't be described by a single gamma value.

> or with other words: can't read the cICP

Then you need to find a PNG viewer that can (I hear Chromium does, as does ffmpeg/mpv). Alternatively, you could use --screenshot-tag-colorspace=no, which will coerce the image into sRGB before writing the PNG.

kasper93 commented 4 months ago

I'll try to keep this brief, as enough has been said on this topic.

The BT.709 OETF has an approximate gamma of ~1.961, but when displaying it, no one actually uses the inverse OETF because BT.1886 exists. However there is one bad apple... Apple, which continues to use 1.961 for historical reasons, even after BT.1886 was standardized.

Typically, BT.1886 should be approximated by 2.4 or 2.2 when considering typical non-perfect black environments.
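For reference, here is a sketch of the BT.1886 reference EOTF from Annex 1 of the recommendation, with the black level as a parameter. The function names and the example luminances (100 cd/m² white, 0.1 cd/m² black) are assumptions chosen for illustration only.

```python
import math

def bt1886_eotf(v: float, lw: float = 100.0, lb: float = 0.0) -> float:
    """BT.1886 reference EOTF: signal V in [0, 1] -> luminance in cd/m^2.

    lw is the display's white luminance, lb its black luminance.
    """
    g = 2.4
    a = (lw ** (1 / g) - lb ** (1 / g)) ** g
    b = lb ** (1 / g) / (lw ** (1 / g) - lb ** (1 / g))
    return a * max(v + b, 0.0) ** g

def effective_gamma(v: float, lw: float, lb: float) -> float:
    """Pure-power exponent giving the same luminance relative to white at v."""
    return math.log(bt1886_eotf(v, lw, lb) / lw) / math.log(v)

print(effective_gamma(0.5, 100.0, 0.0))   # perfect black
print(effective_gamma(0.5, 100.0, 0.1))   # contrast ratio 1000:1
```

With perfect black the curve collapses to a pure 2.4 power law, while with the 1000:1 display in this example the effective exponent at mid-gray comes out nearer 2.2.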

If the gAMA chunk is used as a fallback when an image viewer does not understand cICP, it should produce similar results as an image viewer that understands cICP, simple as.

We need to be pragmatic here. I don't think 1.961 is correct except in the context of Apple. In DaVinci Resolve, Rec 709-A was specifically introduced to conform to Apple's behavior; it is a separate option. I don't think we should force this in lavc by default either. Instead, we should conform to what the rest of the world is doing and use the BT.1886 gamma approximation instead of the BT.709 inverse OETF.

This issue isn't critical enough to gain much traction, but it surfaces in waves. It even has a nice round number on FFmpeg's trac https://trac.ffmpeg.org/ticket/10000.

I believe FFmpeg should use the 2.2 approximation for BT.1886 because it will never be exact, but it will also be the best fallback value that reflects real BT.1886 on typical hardware. It's also the value that libplacebo currently uses as an approximation.

diff --git a/libavutil/csp.c b/libavutil/csp.c
index 7ef822c60b..c2a98cc4f2 100644
--- a/libavutil/csp.c
+++ b/libavutil/csp.c
@@ -133,12 +133,12 @@ enum AVColorPrimaries av_csp_primaries_id_from_desc(const AVColorPrimariesDesc *
 }

 static const double approximate_gamma[AVCOL_TRC_NB] = {
-    [AVCOL_TRC_BT709] = 1.961,
-    [AVCOL_TRC_SMPTE170M] = 1.961,
-    [AVCOL_TRC_SMPTE240M] = 1.961,
-    [AVCOL_TRC_BT1361_ECG] = 1.961,
-    [AVCOL_TRC_BT2020_10] = 1.961,
-    [AVCOL_TRC_BT2020_12] = 1.961,
+    [AVCOL_TRC_BT709] = 2.2,
+    [AVCOL_TRC_SMPTE170M] = 2.2,
+    [AVCOL_TRC_SMPTE240M] = 2.2,
+    [AVCOL_TRC_BT1361_ECG] = 2.2,
+    [AVCOL_TRC_BT2020_10] = 2.2,
+    [AVCOL_TRC_BT2020_12] = 2.2,
     [AVCOL_TRC_GAMMA22] = 2.2,
     [AVCOL_TRC_IEC61966_2_1] = 2.2,
     [AVCOL_TRC_GAMMA28] = 2.8,

I'm reopening this, because I don't think the conclusion was satisfying for any party and maybe it will never be resolved, but we can track any progress here. But if someone starts spamming here, I won't hesitate to nuke this issue.