Add scalable linear light texture for shaders

agyild commented 2 years ago

Certain scaling filters are designed to operate in non-gamma (i.e. linear) color space. Manually trying to calculate linear light in shader code interferes with mpv's color conversions and might break things. I want to be able to request a scalable linear texture from mpv, process it, and give it back to mpv for necessary color conversions.

Alternatively, if you can suggest me a proper shader code that won't break things, I could use that. I tried implementing mpv's sRGB to linear and vice-versa conversions in shader code, and it mostly worked fine, but I have encountered some cases where it causes artifacts. If I hook and process MAIN with the code below, I get artifacts. I don't understand much of the color theory, but I presume it is because MAIN is not sRGB but RGB. Is that correct?

vec3 srgb_to_linear(vec3 col) {
    return mix(col * 1.0 / 12.92,  pow((col + 0.055) / 1.055, vec3(2.4)), ivec3(lessThan(vec3(0.04045), col)));
}

vec3 linear_to_srgb(vec3 col) {
    return mix(col * 12.92, 1.055 * pow(col, vec3(1.0 / 2.4)) - 0.055, ivec3(lessThanEqual(vec3(0.0031308), col)));
}

igv commented 2 years ago

Certain scaling filters are designed to operate in non-gamma (i.e. linear) color space.

FSR is not one of them. It's designed to upscale images that haven't been downscaled (in linear space in this case). Otherwise gamma light is preferred.

haasn commented 2 years ago

Shader math seems fine to me. But you probably want ivec not bvec.

I can certainly think about making LINEAR resizable. It doesn't seem too difficult.

agyild commented 2 years ago

presume it is because MAIN is not sRGB but RGB

That does not make sense. sRGB is R'G'B', not sYCC, so does not have a matrix. Also you should clip right here to from 0 to 1, because YCbCr can have outside 0-1 float, unless you want to do sYCC to HDR. Would be nice.

BT.709 limited range matrix, x is from 16 to 235, y z are from 16 to 240.

Thank you for your reply. However, since I have a very basic understanding of color theory, I don't know what to make of your reply. I am not hooking NATIVE nor any of the raw source planes, instead I am hooking MAIN so I assume it should give me a full range color texture.

haasn commented 2 years ago

Actually, what type of artifacts do you get? Something I just noticed is that mix, especially with ivec (instead of bvec), can happily carry over NaNs even from the branch not chosen.

So yes, you do what to make sure you clamp the incoming values. Try adding col = max(col, 0.0); to your functions. Incidentally, I'm not sure how much sense it makes to assume sRGB here since most content is not sRGB. You probably want something more like return pow(max(col, 0.0), vec3(2.4)); (and vice versa for 1.0/2.4).

agyild commented 2 years ago

Actually, what type of artifacts do you get? Something I just noticed is that mix, especially with ivec (instead of bvec), can happily carry over NaNs even from the branch not chosen.

So yes, you do what to make sure you clamp the incoming values. Try adding col = max(col, 0.0); to your functions. Incidentally, I'm not sure how much sense it makes to assume sRGB here since most content is not sRGB. You probably want something more like return pow(max(col, 0.0), vec3(2.4)); (and vice versa for 1.0/2.4).

I am currently working on porting FidelityFX CAS and FSR to mpv. And yeah, sRGB assumption comes from the fact those filters are designed for video games which use sRGB textures most of the time. But when it comes to video content, you can no longer assume a source colorspace. That's why I don't really want to assume anything, and do any in-shader conversions because it's not as simple as in video games. As a best practice, mpv should take care of all the color conversion stuff.

The manual states the following:

MAIN (resizable) The main image, after conversion to RGB but before upscaling.

May I learn what kind of RGB is this? Is this already linear? If so what's difference between MAIN and LINEAR? Because I believe the gamma curves are applied at a later stage (and therefore RGB becomes sRGB or whatever the target medium is), right? Sorry, like I said my understanding of color theory is pretty limited.

agyild commented 2 years ago

Yeah, I am just not going to try doing it in-shader. Because even if I did then it is not going to work with HDR content, then I will have to create another version for HDR which is a code smell and bad practice. I am surprised LINEAR and SIGMOID is not resizable already, hopefully those features get added soon.

haasn commented 2 years ago

That's why I don't really want to assume anything, and do any in-shader conversions because it's not as simple as in video games. As a best practice, mpv should take care of all the color conversion stuff.

I appreciate the sentiment but does it really matter? as long as it's vaguely linear it should be good enough to do the job, right? These are subjective shaders after all. As long as you exactly inverse it on the other end it still ends up decoding to the right thing. (Though I can definitely see that logic breaking down for PQ-encoded content)

May I learn what kind of RGB is this? Is this already linear?

No, it absolutely isn't. (*) It depends on the source tagging. So you are definitely correct in needing to linearize it yourself if you want linear light content.

If it really matters, for the time being, you could make both a HDR and SDR version of the shader (with PQ and pure power gamma respectively) and choosing the right one with an auto-profile or something. Though I will warn you that scaling HDR things in linear light will probably lead to very bad results. It might be best to just throw a gamma 2.4 curve on the content regardless of the source transfer.

But I definitely agree that LINEAR can be made resizable. There's no really good reason not to. The only reason it isn't, is because the decision to linearize or not is done as a result of the scaling direction. So we lock down the scaling parameters before we know if there even is a linear hook or not.

But for libplacebo at least I can change this a bit more easily, to make it also force a linear path (independent of scaling) if there is any shader hook on it. Though that does raise the question of what should happen if you end up transforming the texture in such a way that it no longer wants linearization, and especially e.g. for HDR scaling.

(*) Unless you're opening a linear light floating point EXR image or something...

agyild commented 2 years ago

@haasn What about an alternative? Instead of playing with the pipeline, can you make certain preprocessor macros available about the nature of the source content? So I can read which gamma curve to apply/inverse, if the source is SDR/HDR etc. I think that's a better practice in the long term since it has more applications besides just this single use case.

haasn commented 2 years ago

@agyild Yeah that's a good solution, actually.

I pushed something like this to libplacebo master which should work for vo_gpu_next. I'll try seeing if I can port it to vo_gpu if you need it.

igv commented 2 years ago

Release you FSR already. Gamma or linear it doesn't matter anyway. Just hook MAIN. You will add the linearize macro later.

agyild commented 2 years ago

As for the time being until the source colorspace preprocessor macros become available, I have decided to use the following the color conversion algorithms from AMD's ffx_a.h library, with Rec709 being the default. It can be disabled or changed via a preprocessor variable.

vec3 From709(vec3 c) {
    return max(min(c * vec3(1.0 / 4.5), vec3(0.081)), pow((c + vec3(0.099)) * (vec3(1.0) / (vec3(1.099))), vec3(1.0 / 0.45)));
}

vec3 FromPq(vec3 x) {
    vec3 p = pow(x, vec3(0.0126833));
    return pow(clamp(p - vec3(0.835938), 0.0, 1.0) / (vec3(18.8516) - vec3(18.6875) * p), vec3(6.27739));
}

vec3 FromSrgb(vec3 c) {
    return max(min(c * vec3(1.0 / 12.92), vec3(0.04045)), pow((c + vec3(0.055)) * (vec3(1.0) / vec3(1.055)), vec3(2.4)));
}

vec3 To709(vec3 c) {
    return max(min(c * vec3(4.5), vec3(0.018)), vec3(1.099) * pow(c, vec3(0.45)) - vec3(0.099));
}

vec3 ToPq(vec3 x) {
    vec3 p = pow(x, vec3(0.159302));
    return pow((vec3(0.835938) + vec3(18.8516) * p) / (vec3(1.0) + vec3(18.6875) * p), vec3(78.8438));
}

vec3 ToSrgb(vec3 c) {
    return max(min(c * vec3(12.92), vec3(0.0031308)), vec3(1.055) * pow(c, vec3(0.41666)) - vec3(0.055));
}

In this case PQ has the same bug as the one in in the previous sRGB implementation, where there is this color crush artifact when black and white edges appear next to each other such as in a movie credits. Other than that, how do they look? Any thoughts? @haasn And yes, the artifacts in the previous sRGB to linear implementation were probably caused because of how mix() operates in the previous implementation, since this min(), max() based implementation don't have such artifacts. I have switched to assuming Rec709 from sRGB, since mpv is mostly used for video playback with Rec709 primaries. It's not the best solution, but it's better than creating multiple files.

Release you FSR already. Gamma or linear it doesn't matter anyway. Just hook MAIN. You will add the linearize macro later.

CAS with scaling is nearly finished. I will start FSR after this one. Shouldn't take long.

Edit: So the color crushing bug in PQ was apparently due to not clamping the normalized input color values during the relinearization pass. Apparently, it is needed because that is also how mpv does it. So, the problem seems to be solved for now.

agyild commented 2 years ago

I have just re-released CAS along with scaling support, and further AMD optimizations. FYI.

igv commented 2 years ago

(floor(HOOKED_pos target_size) + 0.5) (input_size / target_size) - 0.5;

Should be HOOKED_pos * input_size - 0.5 + tex_offset.

agyild commented 2 years ago

So I am in the process of porting FSR, and it is nearly finished. However, I have two questions: FSR requires an additional sharpening pass on the upscaled image. Considering the first pass uses hooks MAIN and upscales it to OUTPUT, what should I hook for the second sharpening pass if I want it immediately to be applied on the first pass' output? And my second question: I am using the following WHEN directive in the first pass to make it only run if scaling factor is between 1.0 and 4.0: OUTPUT.w OUTPUT.h * MAIN.w MAIN.h * / 4.0 > ! OUTPUT.w OUTPUT.h * MAIN.w MAIN.h * / 1.0 > *. However, if I also use this directive in the second pass, it no longer runs since I assume MAIN is now at the size of the OUTPUT. Is there a way to reach prescaled size of the MAIN in the second pass' WHEN directive?

@igv @haasn

igv commented 2 years ago

FSR requires an additional sharpening pass on the upscaled image.

It's basically a dumbed down version of CAS with optional denoising (which you can port into CAS).

second question

Just port only EASU part of FSR, and use it like this: glsl-shaders="\~~/EASU.glsl;~~/CAS.glsl". If you want CAS (hook MAIN in this case) be applied only when upscaling is needed - save output of EASU to temporary texture and then bind it to CAS (this pass will be skipped if that texture is unavailable).

agyild commented 2 years ago

FSR has been ported as well. FYI.

igv commented 2 years ago

I recommend to enable denoise by default.

//!SAVE MAIN

Redundant.

agyild commented 2 years ago

//!SAVE MAIN

Redundant.

Thanks. It was a leftover from debugging code.

I recommend to enable denoise by default.

From my testing, visually I did not notice any positive benefits besides reducing the effect. But that might be just me. Did you test it?

igv commented 2 years ago

Yes.

agyild commented 2 years ago

@igv Do you think that porting FSR and CAS as compute shaders as they are originally intended would improve performance? I have never programmed computer shaders before, but made some research and people say that they are better suitable for image processing algorithms such as sharpening, resizing etc.. And if I wanted to create their compute shader versions, how would I proceed? I imagine it's just some syntax and texture coordinate modifications, right? Thanks for assisting me by the way.

igv commented 2 years ago

Do you think that porting FSR and CAS as compute shaders as they are originally intended would improve performance?

No, these shaders are very simple, there is not much to optimize with CS (you can think of them as a low level PS).

And if I wanted to create their compute shader versions, how would I proceed? I imagine it's just some syntax and texture coordinate modifications, right?

Take a look at this version of FSRCNN, where I merged all the "expansion" and "sub-pixel convolution" passes + "aggregation" into 1 "reconstruction" pass (compare to PS version).

haasn commented 2 years ago

They should be somewhat faster as compute shaders if you use local shared memory to share data between adjacent pixels, which is basically what mpv does to speed up normal convolutions (e.g. ewa_lanczos).

igv commented 2 years ago

Kernel size is too small, speed up will be like 5% maybe.

agyild commented 2 years ago

NVIDIA Image Scaling has also been ported as well.

agyild commented 2 years ago

@haasn According to ITU-R BT.709, colors should be linearized before luma calculation. Does that mean LUMA texture is already linearized?

haasn commented 2 years ago

@haasn According to ITU-R BT.709, colors should be linearized before luma calculation. Does that mean LUMA texture is already linearized?

BT.709 only describes how to encode video. To do the opposite (e.g. what mpv needs) it's implicitly assumed that we're reversing the order of operations from the spec. That means we decode YCbCr first, and then linearize. (Compare to first delinearizing, then encoding into YCbCr which is what BT.709 specifies)

And no, LUMA is always non-linear - even for color spaces in which the order of operations is different (e.g. BT.2020-CL). LUMA literally contains the encoded source texture, bit-for-bit.

mpv-player / mpv

Add scalable linear light texture for shaders #9818