Closed by NSQY 2 years ago.
That actually looks like the subtitles are being blended onto the image with the wrong alpha value. If you look closely, you can see the lines from behind the text showing through.
Can reproduce/confirm.
Note @bloc97 that this Anime4K shader has a bug, it handles alpha incorrectly:

```glsl
return HOOKED_tex(HOOKED_pos) - (current_luma - new_luma);
```

should be something like:

```glsl
vec4 color = HOOKED_tex(HOOKED_pos);
color.rgb -= (current_luma - new_luma) * color.a;
return color;
```
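To see why scaling by alpha matters, here's a toy model in plain C++ (the premultiplied-alpha behaviour is my assumption for illustration, not something the shader documents): with the unscaled subtraction, a fully transparent pixel picks up a nonzero colour, which later shows through once the frame is alpha-blended onto other layers.

```cpp
#include <cassert>

struct Px { float r, g, b, a; }; // toy premultiplied-alpha pixel

// Buggy variant: subtracts the full luma delta even where a == 0.
Px clamp_buggy(Px p, float delta) {
    return {p.r - delta, p.g - delta, p.b - delta, p.a};
}

// Fixed variant (mirrors the corrected GLSL): scale the delta by alpha,
// so fully transparent pixels are left untouched.
Px clamp_fixed(Px p, float delta) {
    return {p.r - delta * p.a, p.g - delta * p.a, p.b - delta * p.a, p.a};
}
```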
This is not relevant here (the file has no alpha channel anyway), but it's still something that would blow up if you try loading Anime4K on any file with an alpha channel.
Yeah, I pretty much assumed that videos would have no alpha values, ever. I will see if there is a better way to support alpha without slowing down the clamping "hack" that I implemented.
Okay, so, this one is caused by an unfortunate difference in the way libplacebo and mpv handle subtitle blending onto video frames. Basically, in libplacebo, `MAIN` was changed to be before blending subtitles, whereas in mpv, `MAIN` is done after blending subtitles. So this works with `--vo=gpu`, because it sees the pre-blended subtitles as part of the video frame, but in libplacebo it does not. This is an unfortunate change that was mandated by a difference in the internal architecture.
Possible work-arounds:

1. Anime4K could hook `PREKERNEL` for the stats pass as well.
2. Add a new hook point `MAINPOSTSUB` to both mpv and libplacebo as a work-around, and Anime4K could use it instead of `MAIN`. Note, however, that this shader will still badly break if you combine it with e.g. `--sigmoid-upscaling` or `--linear-scaling`, because by design it's trying to do operations on entirely different colorspaces! This is a clear no-go, and this hack doesn't really fix that much.
3. Simply drop this shader.

I think the best course of action would be 3, as this shader only corrects for small highlights that get over-brightened when upscaling (highlights have meaning in art, and should not be modified (un)willingly). Removing the shader has no other adverse consequence if you don't mind the unusual bright spots that appear rarely here and there...
1 would slow down the shader considerably when upscaling to 4K; this would be unacceptable for what should basically be a "no performance impact" shader. 2 would work, but it introduces a lot of work for little benefit. I am also thinking of deprecating this shader in the future in favour of a better upscaler architecture that can handle global style in upcoming versions, which would not have this problem.
> 1 would slow down the shader considerably when upscaling to 4K; this would be unacceptable for what should basically be a "no performance impact" shader.
Sounds like you need the ability to have hook priorities? So e.g. you can hook `MAIN 0` for your stats pass, then hook `MAIN 10` for your upscaling etc., then finally hook `MAIN 99` for the clamp pass. (Or whatever numbers you make up)
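mpv has no such priority argument today; purely as a sketch of the idea (all names and numbers hypothetical), prioritized hooks boil down to running every pass registered on the same texture in ascending priority order:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical hook registry: passes hooked on the same texture run in
// ascending priority order, so a stats pass can run before any upscaler
// and a clamp pass after everything else.
struct HookRegistry {
    std::multimap<int, std::function<void()>> hooks; // priority -> pass
    void hook(int priority, std::function<void()> pass) {
        hooks.emplace(priority, std::move(pass));
    }
    void run_all() {
        // std::multimap iterates keys in ascending order.
        for (auto &entry : hooks) entry.second();
    }
};
```

Registering passes at priorities 0, 10 and 99 and then calling `run_all()` executes them in that order, regardless of registration order.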
> I am also thinking of deprecating this shader in the future in favour of a better upscaler architecture that can handle global style in upcoming versions, which would not have this problem.
I'm actually thinking of scrapping the entire GLSL shader system in favor of something entirely new for `--vo=gpu-next`, which would be built on top of libplacebo native APIs (`pl_hook`, `pl_shader_custom` etc.), and allow users to write shader modules as mpv plugins (C or Lua).

Not sure how performant it would be to have a Lua callback run on every frame, but if we can pull it off that would be an interesting way forwards, I think.
> Sounds like you need the ability to have hook priorities? So e.g. you can hook `MAIN 0` for your stats pass, then hook `MAIN 10` for your upscaling etc., then finally hook `MAIN 99` for the clamp pass. (Or whatever numbers you make up)
Yeah, it's basically the lack of granular control over which section of which shader should be run first, plus the fact that it's not feasible (to my knowledge) to pass uniforms across shaders. (Otherwise I would just pass the native resolution as a uniform and compute the stats on it.) What's really causing the issue here is that Anime4K shaders are modular and there's no guarantee users won't use them in weird ways; I have to make sure the shader works as intended with anything else that might be put in the pipeline.
> I'm actually thinking of scrapping the entire GLSL shader system in favor of something entirely new for `--vo=gpu-next`, which would be built on top of libplacebo native APIs (`pl_hook`, `pl_shader_custom` etc.), and allow users to write shader modules as mpv plugins (C or Lua). Not sure how performant it would be to have a Lua callback run on every frame, but if we can pull it off that would be an interesting way forwards, I think.
I would be a huge supporter of this, but considering it would take a significant amount of work, it should be planned ahead carefully. Am I correct in the assumption that this would allow people to write a plugin integrating mpv with ML inference libraries like ncnn? That would make mpv support all ML libraries (e.g. TensorFlow, PyTorch) out of the box on most desktop/mobile GPUs/CPUs. Also the performance benefits that come with it...
> Am I correct in the assumption that this would allow people to write a plugin integrating mpv with ML inference libraries like ncnn?
Maybe, but then only in C. Does ncnn support Vulkan?
ncnn only supports Vulkan as far as I know. My guess would be that it's much easier to support a wide range of GPUs for compute tasks on Vulkan. There are also many other libraries for ML, but I don't know whether they are in C or C++, or sometimes how they use both and whether we can only use the C part of the code...
The rough first plan would be to expose the ability to load any `pl_hook`, and attach it like an ordinary user shader, to the mpv plugin API.

That way you could already, with current APIs, write your own C plugin that defines a `pl_hook` which uses `pl_vulkan_get`/`pl_vulkan_unwrap` to unwrap the `pl_tex` into their corresponding raw Vulkan API objects. Then you could do whatever ncnn requires to process these textures in whatever way you want, and give it back to the `vo` as a new texture object.
Actually, if you want to experiment with this currently, you can; the hard part will be wrapping the ncnn API into a `pl_hook`, and you can start working on that inside the mpv code base already. (Just add the resulting hook to `p->params.hooks` the same way any parsed user shader would.)
Forgive me for asking: the ncnn library is written in C++ and the API is C++; would it even be possible to integrate it with mpv (which is written in C)? Or do I absolutely need to find a pure C library?
You can trivially write C plugins that use C++ APIs internally. libplacebo even does the same (for glslang).
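As a minimal sketch of that pattern (names are mine, nothing to do with ncnn's actual API): a translation unit compiled as C++ can use any C++ API internally while exporting a plain C ABI via `extern "C"`, which is exactly what a C host like mpv needs.

```cpp
#include <cassert>
#include <string>

// Compiled as C++, but the exported symbol has C linkage, so a C plugin
// (or a C host like mpv) can call it like any other C function.
extern "C" int count_chars(const char *s, char c) {
    std::string str(s); // a C++ API, invisible to the C caller
    int n = 0;
    for (char ch : str)
        if (ch == c) n++;
    return n;
}
```

The C side only ever sees the declaration `int count_chars(const char *s, char c);` and links against the compiled object; the C++ runtime is pulled in behind the scenes (e.g. via `-lstdc++`), which is the same trick libplacebo uses for glslang.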
I've thought more about the scripting project and I think I've come to the conclusion that I don't want to introduce a new scripting platform that depends somehow on libmpv. So the new thing, whatever it is, will be entirely done inside libplacebo. Otherwise it would fork the ecosystem into something where v2 shaders can only be used with mpv and vlc/ffmpeg get left out.
So my current plan is, actually, to integrate some sort of scripting language into libplacebo itself (e.g. lua or js). Then we can write our v2 shaders in those scripting languages directly.
The open question there would be, however, how to do that in a way that lets us use C plugins as well (for ncnn). Should I add the ability to `dlopen()` files into libplacebo??
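A hedged sketch of what that `dlopen` route could look like (the symbol name `pl_hook_entry` below is made up for illustration, not a real libplacebo API):

```cpp
#include <cassert>
#include <cstdio>
#include <dlfcn.h> // POSIX dynamic loading

// Load a plugin file and resolve a well-known entry-point symbol.
// Returns nullptr (with a message on stderr) if either step fails.
void *load_plugin_symbol(const char *path, const char *symbol) {
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return nullptr;
    }
    void *sym = dlsym(handle, symbol);
    if (!sym)
        std::fprintf(stderr, "dlsym failed: %s\n", dlerror());
    return sym;
}
```

The host would then cast the returned pointer to the agreed-upon entry-point type and register the resulting hook, the same way a parsed user shader is registered.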
You wait until "vo_gpu-next: subtitles seem to render at low res" (#9399) is fixed. I plan on changing subtitles. I think I'm ultimately leaning towards never blending subtitles onto video frames, period, and I will rewrite the overlay mechanism with that in mind.
Would this cause issues with fansub projects? .ass overlays for signs and the like.
> Would this cause issues with fansub projects? .ass overlays for signs and the like.
Yes and no. Basically, here are the design constraints we're trying to satisfy simultaneously:
The issue with resolving all of these constraints simultaneously is that ASS gives us no way of distinguishing between signs and text (or even between dialogue text and scenery text). So we're forced to lump all subtitles together (dialogue, signs, etc.) and decide on a single place to apply them:

- onto the video frame, before scaling (`--blend-subtitles=video`)
- onto the scaled frame, before interpolation/output passes (`--blend-subtitles=yes`)
- at the very end, on the output (`--blend-subtitles=no`)

Because these have different tradeoffs, this is exactly why mpv invented the `--blend-subtitles` option. But in libplacebo I want to do better. The current compromise in libplacebo is actually not even on the above list: it satisfies 3 and 4 (partially) but violates 1, 2, 4 (sometimes) and 5.
After thinking about it more, I decided that constraint 1 is the least important tradeoff here: the downside is that users with very fancy shaders will not see softsubbed signs affected by those shaders, but since most shaders try to preserve colors anyway, it shouldn't be a huge deal. At worst you might get a seam or two. I also thought about a way to implement all of these constraints (sans 1) simultaneously: render the subtitles at the end of the video, but interpolate the subtitles "manually". This is basically like `--blend-subtitles=no` but without violating constraint 3.
In the process, it would also allow me to simplify and unify the subtitle handling code, and also fix issues with e.g. rotated videos. So I'm hugely in favor of doing it the way I have in mind now.
> I also thought about a way to implement all of these constraints (sans 1) simultaneously: render the subtitles at the end of the video, but interpolate the subtitles "manually". This is basically like `--blend-subtitles=no` but without violating constraint 3.
This doesn't work :( No way to have interpolation and alpha-blending interact cleanly. At least not without an extra FBO indirection. Still debating what to do here.
Edit: I think I came up with a way around this problem - draw the overlays first and then alpha-blend the image onto it. That way I can use additive blending for the overlays and alpha blending for the image.
Edit 2: That still breaks if you have multiple regions from the same overlay intersecting... grr
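A toy numeric model (plain C++, premultiplied alpha, my simplification of the actual blend states) of why the additive trick works for disjoint regions but breaks when regions of the same overlay intersect: accumulating overlays additively only matches proper over-compositing where at most one region covers a given pixel.

```cpp
#include <cassert>

struct Px { float r, g, b, a; }; // premultiplied-alpha pixel

// src OVER dst: premultiplied-alpha Porter-Duff "over" operator.
Px over(Px s, Px d) {
    return {s.r + d.r * (1 - s.a), s.g + d.g * (1 - s.a),
            s.b + d.b * (1 - s.a), s.a + d.a * (1 - s.a)};
}

// Additive accumulation, as in the proposed draw-overlays-first pass.
Px add(Px x, Px y) { return {x.r + y.r, x.g + y.g, x.b + y.b, x.a + y.a}; }
```

With two intersecting half-transparent regions, `add(ov1, ov2)` diverges from `over(ov2, ov1)`: for `ov1 = {0.5, 0, 0, 0.5}` and `ov2 = {0, 0.5, 0, 0.5}`, additive blending gives red 0.5 and alpha 1.0, while correct compositing gives red 0.25 and alpha 0.75.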
What would be the implications of rendering the subtitles in a framebuffer, treating them independently from the video with interpolation (and even shaders), and then compositing them at the end? Is it bad for performance, or is there some other limitation that I'm unaware of?

Edit: The only disadvantage of this method should be partially violating 1, as the softsubs will be rendered at screen resolution, but shaders that change style (i.e. color) could specify whether they should be run twice: once for the video and a second time for the subtitles. This is bad for performance but satisfies 2, 3, 4 and 5.
That would be the "extra FBO indirection" I mentioned in my previous post. It's not optimal for performance/VRAM. I'm not happy about it. :(
I see... Isn't there already a buffer for mpv's UI? Would it be possible to render subs on that? Edit: Never mind, we wouldn't want to temporally interpolate the UI, or would we? (Back to square one)
No. There might be a buffer in RAM, but there might also not be, and anyway libplacebo has no way of knowing that.
One plan would be to use an indirection only to handle the case where two parts intersect. That way we can get away without the indirection for the hot path. (Another extremely exotic idea would be to use a compute shader to loop through the parts, perform the necessary intersection tests, and render the correctly blended result. But I'm not a huge fan of that because it's not very backwards-compatible, and also probably overkill given the relatively low cost of an FBO indirection anyway)
> ncnn only supports Vulkan
Does this mean that if one wants to use the new plugin integrating mpv with ncnn, one can only choose `gpu-api=vulkan`?
> Does this mean that if one wants to use the new plugin integrating mpv with ncnn, one can only choose `gpu-api=vulkan`?
Very probably, yes. There is some theoretical interop, but if you have interop support that means you have Vulkan support, so you might as well use it.
- Arch
- mpv-build
- Unsure
- Gnome 41
- Nvidia proprietary 495.44
**Reproduction steps**

Shader file: https://github.com/bloc97/Anime4K/blob/master/glsl/Restore/Anime4K_Clamp_Highlights.glsl

```
mpv --no-config --glsl-shader="Anime4K_Clamp_Highlights.glsl" --vo=gpu-next foo.mkv
```
**Expected behavior**

Subtitles to be displayed as with `--vo=gpu`.
**Actual behavior**

Subtitles are distorted (clamped?). I did not expect the shader to impact subtitles in this case.
**Log file**

mpv-log.txt