mpv-player / mpv

🎥 Command line video player
https://mpv.io
Other
28.06k stars 2.88k forks source link

gpu-next: ASS subtitles should not be assumed to be sRGB #13381

Closed arch1t3cht closed 6 months ago

arch1t3cht commented 8 months ago

First of all, I tagged this issue as gpu-next: since right now it mainly comes up on gpu-next, the core issue is a deeper one and also affects some configurations of gpu (see below).

Bug report formalities

Important Information

Provide following Information:

Reproduction steps

The attached zip file contains a video with a rectangle for every possible (limited) 8-bit brightness value (plus a few black frames at the start so it's easy to see that the subtitle file is indeed loaded and visible), and a subtitle file that draws a smaller square into each of those rectangles. The subtitle file was generated in Aegisub in such a way that the square in each rectangle matches the color of that rectangle. When viewing the subtitles on that video file inside of Aegisub, the colors of the squares match the colors of the rectangles.

Steps to reproduce: Open the given video (on an SDR display) with the given subtitles in mpv with --vo=gpu-next or a different --transfer-trc and seek past the initial black frames.

Expected behavior

Just like in Aegisub, the colors of the subtitle squares match the video's colors, so that the squares are not actually distinguishable on screen.

Actual behavior

Screenshot (done with mpv's screenshot command; screenshots of the actual window look similar) of mpv --no-config brightness_test.mkv --sub-file=brightness_test.ass --screenshot-format=png --start=2 --pause --vo=gpu-next

mpv-shot0001

One can see that, in the dark areas (especially the top right corner), the squares are visible, i.e. their colors do not exactly match the video.

Compare this to the same command on --vo=gpu: mpv --no-config brightness_test.mkv --sub-file=brightness_test.ass --screenshot-format=png --start=2 --pause --vo=gpu

mpv-shot0002

Here, the squares are almost completely invisible, except for some very subtle differences due to dithering.

However, gpu-next is not the only place where colors differ: For example, the following happens with mpv --no-config brightness_test.mkv --sub-file=brightness_test.ass --screenshot-format=png --start=2 --pause --vo=gpu --target-trc=bt.1886.

mpv-shot0003

More generally, any value of target-trc other than auto causes a mismatch.

Note also that with --vo=gpu-next, this behavior is independent of the value of --blend-subtitles, even though the documentation states that --blend-subtitles=yes "makes the subtitles be treated as being in the video's color space".

Log file

output_gpu.txt output_gpu_next.txt

Sample files

brightness_test.zip

Discussion

This is a fairly complex topic so I'll detail everything that I believe to be relevant for this below. There were also some past discussions about similar questions, in particular this issue at libass.

But why is gpu-next wrong here?

mpv assumes that subtitles are sRGB both in gpu and in gpu-next:

https://github.com/mpv-player/mpv/blob/57d6826249dee3f9ea26937fb58ba2b8e2d3eae8/video/out/gpu/video.c#L2901-L2911

https://github.com/mpv-player/mpv/blob/57d6826249dee3f9ea26937fb58ba2b8e2d3eae8/video/out/vo_gpu_next.c#L316-L319

The only reason why the colors match in gpu is that gpu doesn't do conversions between SDR transfers when target-trc is not set:

https://github.com/mpv-player/mpv/blob/57d6826249dee3f9ea26937fb58ba2b8e2d3eae8/video/out/gpu/video.c#L2621-L2627

(This is probably intended to some degree, but I'd argue that it is quite surprising behavior that setting target-trc from auto to bt.1886 on a normal SDR display affects subtitle colors but not video colors)

wm4 also said that "Subtitles are sRGB" in a related issue at libass, and this can be assumed to be the case for subtitle formats like SRT. But for ASS subtitles it's more complicated.

To my knowledge, the color space of ASS subtitles is not specified anywhere (just like almost everything else about ASS subtitles). The only thing that's specified is the YCbCr Matrix wrangling, but that only concerns R'G'B' -> Y'CbCr -> R'G'B' conversion (though it exists for a reason related to this issue).

So, what color space should ASS subtitles have? Well, in the absence of an authoritative specification the reference for ASS rendering is VSFilter. Classic VSFilter drew its subtitles straight onto the Y'CbCr video stream. The Y'CbCr colors it used were computed from the given RGB triples via the BT.601 matrix (and the fact that that matrix was originally hardcoded led to the YCbCr Matrix header being created later). Equivalently, one can say that the subtitles were blended directly onto the video's R'G'B' colors (after converting the video from Y'CbCr to R'G'B' via its color matrix) without doing any gamma correction to either the subtitles or the video. Newer renderers, including authoring tools like Aegisub, then matched this behavior, and needless to say many existing subtitle files rely on this behavior (and note that wanting to exactly match the video's colors is a common use case for ASS subtitles, since subtitle authors want to create translations for on-screen text that blend in seamlessly).

So, with this justification, one can say that ASS subtitles should not be sRGB, and instead be in whatever color space the video is, i.e. that no color correction should be applied to ASS subtitles. (I also brought up this question on the libass IRC before opening this issue, where this was confirmed by the maintainers.)

But it gets more complicated

Unfortunately, it's clear that while the philosophy of "ASS subtitles should be in whatever color space the video is in" works for ordinary SDR BT.601/BT.709 video (possibly also for BT.2020 though things may already get tricky there), this can clearly cause problems for HDR. While sometimes subtitle authors want to exactly match the video's colors, often they also just want to create some simple color - the most common case being white dialogue subtitles. Taking an average subtitle file with 255,255,255 white subtitles and applying them onto an HDR video in the video's color space would put that white at 10,000 nits, which is clearly not desirable by anybody. It's also not possible to say "Well, people that want to author subtitles for HDR content should account for that and use darker colors for their white" since libass is also used to render various other subtitle formats like SRT via ffmpeg auto-conversion - whether on the fly or not. Also, this philosphy would lock subtitle files to a single video (in a case where there exist SDR and HDR versions of the same video) even when the file does not actually contain any complex typesetting that needs to match video colors. Finally, it would also make subtitle style overrides harder, since different video color spaces would need different style overrides.

And what about primaries?

More or less the same issues can be raised about differing primaries, i.e. on how to play ASS subtitles on content with BT.2020 or DCI-P3 content, although they're less critical here. Still, a subtitle author that wants to color their subtitles in red may be surprised when that red suddenly shows up as much deeper color because the subtitles were suddenly assumed to have BT.2020 primaries by the player, but typesetters will still want to be able to match the video's BT.2020 colors (so in particular it needs to be possible to draw BT.2020 colors at all).

So, what should we do?

The above was a more or less objective description of all the factors involved in deciding how to interpret ASS colors. The following will be my suggestions on how to fix (or at least improve) the current logic, but naturally it's a bit more opinionated.

First of all, the issues with HDR notwithstanding, one can still safely say that for SDR content and primaries like BT.709 and BT.601, ASS subtitles should be taken to be in the video's color space. This would cover the vast majority of files that are broken with the current handling, and it's what I'd recommend to do as an immediate "fix" for this issue.

For HDR files or bigger gamuts, it's clear that there is no simple solution covering all use cases. For now this is not a big issue since (to my knowledge) next to no subtitle files exist that aim to do typesetting on HDR/WCG content, but this might change in the future. There, a solution is needed that allows for easily matching video colors without the problems mentioned above.

Spelling it out more explicitly, there are two cases that need to be covered:

In pretty much all cases I can think of from the perspective of a subtitle author, this distinction between typesetting and dialogue matches the distinctions made in two other subjects:

The question of what color space to blend subtitles in aligns very closely with these two questions. Thus, a possible future solution to these three problems could be to implement such a typesetting/dialogue split and to then split subtitle rendering into two phases:

This would be a comparatively simple (as opposed to something like allowing arbitrary color spaces to be specified per style or per event) solution that still covers the vast majority of use cases. (Though it does make some things harder or impossible: It would no longer be possible to draw something above the dialogue subtitles in the video's colors (think of a timing meme where dialogue is supposed to disappear behind an object in the scene). This would already be harder with a rendering resolution split, but a color space split would make it impossible to work correctly in all cases. But this feels like a necessary trade-off.)

This solution would sidestep the issue of tonemapping and any dependencies on whether the viewer has an HDR or SDR display, and it'd require fairly little cooperation from subtitle authors or authoring tools: Authoring tools do not strictly need to apply any gamma correction and could just pretend that all video is BT.709 or sRGB (of course it would be nice if they properly supported HDR and bigger color spaces, but they wouldn't need to to output working files), and subtitle authors do not need to manually map colors as long as they correctly tag their subtitles as dialogue or typesetting. The only problem is that this first requires cooperation from the subtitle renderer, so this can only be implemented once support for a typesetting/dialogue distinction is made possible in libass.

So, for now this remains a hypothetical solution. Still, I wanted to lay out all my thoughts here because they may help with decisions on how the more short-term fixes should look. Again, my main recommendation for now is to blend subtitles in the video's color space when the video is BT.709/BT.601. (For HDR, assuming them to be sRGB is probably fine for now.)

Finally, a note about --blend-subtitles: I'm aware that --blend-subtitles=yes already blends subtitles in the video's color space on gpu (but it doesn't on gpu-next and I believe that this is a bug). I guess with that in mind the issue is that --blend-subtitles=no should not be the default for ASS subtitles on SDR video. Of course it does not make sense to change the default value in such a conditional way. (I guess an additional --blend=subtitles=auto value could be added and made the default, but that has its own issues.) I don't really know of a simple way to make the effect on --blend-subtitles on this behavior make more sense, and I don't know enough how color spaces are/should be handled for non-ASS subtitles, but I wanted to bring up the fact that the behavior of this option is also involved here.

kasper93 commented 8 months ago

Seems to have been introduced in https://github.com/mpv-player/mpv/commit/704afb89681bc6d7fb87cce0a389444de3fac2a0, though I didn't confirm that myself yet and the issue is more nuanced than that

I think this is good tl;dr of this whole situation and why I didn't really have much motivation to go through the mud of it. It is late I will not go through everything you said, but just few remarks.

Before commit 704afb89681bc6d7fb87cce0a389444de3fac2a0 every subtitle were rendered in the target color space, not video one, your display one. I wasn't even interested in ASS in this commit, I was fixing PGS subtitle rendering and copied the sRGB logic from vo_gpu, which in lack of authoritative documentation was the closest thing that "should" work correctly. Because apparently it has been tested for years.

To be honest to not dwell to much into it, current status quo is that everything is bt.709, both video and your display. That's why it silently works for seemingly everyone.

The main issue with "just use video colorspace duh" is that it works now, because like said above everything is bt.709, but I assure you no one wants to create HDR subtitles. But since there is no HDR anime, the problem didn't arise yet. Frankly it is shortsighted from libass devs that the subtitle colorspace is still not defined. I don't want to make this thread about what should be and fixing the world. I will just patch mpv to whatever you guys want and leave others with the unsolved problem.

Note that "just use video colorspace" also makes subtitles tied to specific video, one cannot use SDR subtitles on HDR video (or even on WCG), because... they are not tagged. And I don't know how you roll in typesetting community, but having to redo all subtitles is not the best approach. Sure there will be color difference and if you need perfect accuracy make a new subs, but if slight offset is acceptable just use existing ones. (also good luck creating subtitles for DV P5 and alike)

There are currently two main issues:

Note that up until recently libplacebo didn't even apply ICC profile to subtitles, so everything looked also different.

Traneptora commented 8 months ago

It's worth mentioning that ASS itself has a display-matrix tag. I don't know if avcodec exports it. I also don't know if it supports other colorspace tags.

ghost commented 8 months ago

The main issue with "just use video colorspace duh" is that it works now, because like said above everything is bt.709, but I assure you no one wants to create HDR subtitles. But since there is no HDR anime, the problem didn't arise yet.

Unfortunately we already have faced this problem as there is a number of HDR anime at this point. The decision people have taken is to just forego typesetting entirely, but this is not a compromise that can last as typesetting is essential to fansubbing.

Note that "just use video colorspace" also makes subtitles tied to specific video, one cannot use SDR subtitles on HDR video (or even on WCG), because... they are not tagged. And I don't know how you roll in typesetting community, but having to redo all subtitles is not the best approach. Sure there will be color difference and if you need perfect accuracy make a new subs, but if slight offset is acceptable just use existing ones. (also good luck creating subtitles for DV P5 and alike)

The process of typesetting generally requires an exact match for masking purposes so assuming the subtitle is in the same colorspace as the video makes the most sense, however as already noted this presents a problem for HDR sources and dialogue. Having to redo typesetting for sources with different colorimetry has been the status quo for years anyway.

haasn commented 8 months ago

I largely agree here that the problem is the lack of metadata on the ASS side. That should be addressed first. The established practice in all other domains of video is that things should never be assumed to be HDR/WCG unless explicitly tagged as such.

In any case, something we could do to address this in the short term would be to introduce a new option to control how subtitles colorspaces are handled, e.g.: --subtitle-colorspace=<auto/safe/srgb/bt709/bt601/...>

It may also be needed to treat PGS metadata separately, see https://github.com/mpv-player/mpv/pull/12405

arch1t3cht commented 8 months ago

It's worth mentioning that ASS itself has a display-matrix tag. I don't know if avcodec exports it. I also don't know if it supports other colorspace tags.

The YCbCr Matrix: header in ASS files is mostly unrelated to this, since it concerns color matrix mangling to be applied to the R'G'B' colors right after getting the images from the renderer. It was added in order to be backwards compatible to existing files for old VSFilter that assumed a BT.601 matrix (even on BT.709 video) but still allow new files to use other matrices (or opt out of color matrix wrangling entirely via 'YCbCr Matrix: None`). For more information, read the long explanatory comment in the corresponding libass header.

Other headers for transfer characteristics or primaries do not exist at the moment, and as described above adding them would not solve all relevant use cases. (And in the hypothetical typesetting/dialogue split solution I described above, it would only complicate the situation further with fairly little gain. That's why I described this proposal even if it cannot be implemented at the moment, since it may help in understanding what other actions would be helpful or unhelpful.)

As for kasper93's comment, I agree with wiwaz's response. Typesetting is bound to a specific videos already - in extreme cases even differences in encoding settings not related to color (e.g. bitrate or debanding/denoising) can affect how well typesetting blends in. A few more responses:

Frankly it is shortsighted from libass devs that the subtitle colorspace is still not defined.

So far, libass has mostly avoided giving detailed specifications and refers to VSFilter as the effective reference (which would clearly imply that ASS subtitles should always be in the video's color space. The main problem here is not that the colorspace is not properly defined, it's that the existing definition is not feasible for HDR video.). Recently the ASS File Format Guide outlining some best practices was created, and after I brought up the color space issue on IRC it was proposed to add the conclusion of that conversation to that page.

I don't want to make this thread about what should be and fixing the world. I will just patch mpv to whatever you guys want and leave others with the unsolved problem.

So, to spell it out once more, the immediate solution at the moment should be to use the video's color space (and either apply the exact same tone/gamut mapping to subtitle colors or to blend subtitles before tone/gamut mapping[^alpha]) for ASS subtitles when the video is SDR and has primaries like BT.709 or BT.601.

I do not know enough about other subtitle formats like PGS to say what should be done there. As for ASS subtitles on HDR, assuming them to be sRGB or BT.1886 probably works for most current cases, but maybe there should be an option to blend in the video's color space instead (i.e. making --blend-subtitles=yes work for gpu-next).

[^alpha]: Note that these two approaches are not completely equivalent. They have the same effect on the subtitle colors, but subtitles with alpha will behave differently (unless alpha is also mapped in some way). The question of whether to alpha-blend in linear or gamma light is whole other can of worms (though, again, VSFilter compatibility would dictate that it should be done in gamma, even if linear light would be more visually correct) but it's less pressing than the question of color spaces.

kasper93 commented 8 months ago

The established practice in all other domains of video is that things should never be assumed to be HDR/WCG unless explicitly tagged as such.

Exactly. But in the same time, I don't like the idea of mpv making a precedence for ASS and deciding something like that. We actually do it now, but frankly maybe if we don't we wouldn't have pressure on mpv to "fix" anything.

In any case, something we could do to address this in the short term would be to introduce a new option to control how subtitles colorspaces are handled

While this would work, I would rather have it currently all subtitles are assumed to be in video colorspace and like said before leave the problem to solve by the people in typesetting community. There are a lot of issues with this approach, but I don't feel like it is an mpv issue if upstream format doesn't care about color matching in practice.

The main issue of providing an option and different behaviors is that this invites different ASS to exist and what we should aim is to unify and have one approach.

As for kasper93's comment, I agree with wiwaz's response. Typesetting is bound to a specific videos already - in extreme cases even differences in encoding settings not related to color (e.g. bitrate or debanding/denoising) can affect how well typesetting blends in.

Sure, if you never want it work independent of video. One could imagine their subtitles to have proper colors regardless of the video they are playing with. But anyway, I'm not here to argue that. I don't feel it is mpv issue to solve.

So, to spell it out once more, the immediate solution at the moment should be to use the video's color space (and either apply the exact same tone/gamut mapping to subtitle colors or to blend subtitles before tone/gamut mapping1) for ASS subtitles when the video is SDR and has primaries like BT.709 or BT.601.

Yes, I will try to force myself to work on this soon(TM) subtitles in video colorspace with the same processing. Should make everyone "happy".

witchymary commented 8 months ago

Sure, if you never want it work independent of video. One could imagine their subtitles to have proper colors regardless of the video they are playing with. But anyway, I'm not here to argue that. I don't feel it is mpv issue to solve.

This may be true for simpler subtitles, sure, but it's certainly not for "high-level" typesetting. The goal is to match and "blend in" with the video as much as possible, making the subs—to an extent—kinda dependent on the source video used. It's a wanted behavior.

kasper93 commented 8 months ago

I will leak relevant log:

\<haasn> I still want to live in a world in which heavy typesetting is just encoded as a second video layer

witchymary commented 8 months ago

This is something that is also argued among fansubbers, lol. With that said, I do like to maintain the typesetting as reusable and tweakable as possible, which any method of hardsubbing doesn't easily allow.

Also, ASS typesetting has evolved quite a bit since the 2000s! Worth taking a look at some newer fansub releases sometime.

kasper93 commented 8 months ago

Here is my dump of changes that I had locally for some time. I need to remind myself what cases were remained to be fixed, IIRC HDR in general.

libplacebo: https://code.videolan.org/videolan/libplacebo/-/merge_requests/630 mpv: https://github.com/mpv-player/mpv/pull/13388

EDIT: If you want to help, testing is appreciated.

mightyhuhn commented 8 months ago

for SDR this has been fixed ages ago. cyberbeing is the person to ask about this. the actual issue was always aegisub and that it was always using bt 601 for the video not reading the video matrix. but the ass subtitles are RGB or in short you use the video matric and be done is older cases.

newer releases have proper matrix and aegisub also supports bt 709 now.

how srgb got into this is beyond me. video isn't sRGB and even if the ass sub are sRGB they are not rendered in sRGB.

and HDR as a reminder: https://github.com/libass/libass/issues/297

arch1t3cht commented 8 months ago

Once again, this is a completely different issue from the color matrix one (which, by the way, is also said in the issue you linked).

the actual issue was always aegisub and that it was always using bt 601 for the video not reading the video matrix.

This is not really true. Forcing decoding with BT.601 was a deliberate decision on Aegisub's part in order to match VSFilter's hardcoded BT.601 matrix for converting ASS colors into Y'CbCr. It's true that that option is still on by default in the latest official Aegisub release (but newer forks removed it a long time ago) but even then this would not cause color mismatches for subtitles whose colors match in Aegisub, since Aegisub would then also insert the respective YCbCr Matrix header.

mightyhuhn commented 8 months ago

but aegi also writes the meta data where the issue comes from. again this was fixed by cyberbeing ages ago.

your issue is that libplacebo isn't merging the subtitles before applying changes to the image in terms of gamma and such mismatching them in the process with HDR that get's quite complicated with SDR it should be no issue.

cubicibo commented 8 months ago

It may also be needed to treat PGS metadata separately, see https://github.com/mpv-player/mpv/pull/12405

HDMV PGS does not have any relevant metadata. The Blu-ray Disc Association documents says that the palette 8-bit YCrCb values must be scaled by 4 if the video is HDR10. Nothing else. The actual YCrCb values are down to the authorer doing things right. Wrong colours and brightness with PGS is generally an authoring error (generated with SDR white at (235, 128, 128)) or missmatched PGS/video assets.

kasper93 commented 6 months ago

Should be better now, after https://github.com/mpv-player/mpv/commit/9325ebe8173823fec00bf630780f5d8cf656b2b5 and https://github.com/mpv-player/mpv/commit/cbe30f614d878c2622e86d9c5cd8da558ef16936. Feel free to reopen if you want something else to be fixed. The remaining issues with HDR has to be resolved upstream first, see https://github.com/libass/libass/issues/297. Remaining libplacebo issue is tracked in https://github.com/mpv-player/mpv/issues/12610 and require this https://code.videolan.org/videolan/libplacebo/-/merge_requests/630 to be merged.