RavuxCFL variant testing

Jules-A commented 7 months ago

Since the discussion in the previous issue isn't about the downscaler, we should probably move it here to avoid spamming Artoriuz.

I tested the latest AR changes. It's better than the previous version but not better than the one without the 2, except that it's slightly better on the Junji Ito sample and on samples with ringing.

The one I posted in the previous issue may look good but it's pretty damn heavy. I also forgot there aren't any RGB versions of the LITE shaders. Maybe just doing CHROMA 2 * would be better anyway for speed reasons.

EDIT: Actually, I think the CfL_Prediction_Ravu2 version may be better at high scaling factors, as it looks notably better on the 480 sample.

deus0ww commented 7 months ago

In case you didn't see the edit in the other thread: You may want to extract the ravu-zoom-ar-r2 chroma pass from cfl+ravu2 and test that separately. Unlike the normal one, which uses luma to upscale chroma, mine upscales both chroma planes independently without using luma at all (that's why it's slower).

The only thing left I want to try is adapting over ravu-zoom-r3... but that's a lot of work.

deus0ww commented 7 months ago

CfL+Ravu:

* https://github.com/deus0ww/mpv-conf/blob/master/shaders/cfl/CfL_Prediction_Ravu_R2.glsl
* https://github.com/deus0ww/mpv-conf/blob/master/shaders/cfl/CfL_Prediction_Ravu_R2X.glsl
* https://github.com/deus0ww/mpv-conf/blob/master/shaders/cfl/CfL_Prediction_Ravu_R3.glsl
* https://github.com/deus0ww/mpv-conf/blob/master/shaders/cfl/CfL_Prediction_Ravu_R3X.glsl

Ravu-zoom-ar Chroma:

* https://github.com/deus0ww/mpv-conf/blob/master/shaders/ravu/zoom/ravu-zoom-ar-r2-chroma.hook
* https://github.com/deus0ww/mpv-conf/blob/master/shaders/ravu/zoom/ravu-zoom-ar-r2x-chroma.hook
* https://github.com/deus0ww/mpv-conf/blob/master/shaders/ravu/zoom/ravu-zoom-ar-r3-chroma.hook
* https://github.com/deus0ww/mpv-conf/blob/master/shaders/ravu/zoom/ravu-zoom-ar-r3x-chroma.hook

R2/R3 - Updated old ravu-zoom-chroma with current AR code and LUT + gather.

R2X/R3X - My variants that do not use luma for upscaling. Both chroma planes are scaled independently.

Personally, I'm using CFL+R3X for 4K->4K, and CFL+R2X for everything else.

Jules-A commented 7 months ago

Okay, I finally got around to properly testing the CFL hybrids, and I can say that the X variants were a waste of time... They are pretty much worse in every way possible. The non-X variants were pretty good; however, the R3's AR code was broken. After fixing that it was fine. The R3 version was significantly better than R2. At first I thought it was the Ravu variants removing large chunks of outlines and making the image look terrible, but it was your downscaling code: after replacing it with my own, it no longer did that. If you're wondering what I meant:

Before: (image: before)

After: (image: after)

Performance.... Ughh... They were pretty damn slow lol, but that's to be expected. R3 is ~6000 vs ~2500 for my version, for the upscaling part at 1080p -> 6400x3600 (for testing speed). R2 is around ~5200; at that point you might as well just use R3.

EDIT: Sorry if you read this before the edit, I haven't tested the Ravu-only shaders yet.

EDIT2: Actually it looks like it wasn't my AR change that fixed the ringing but swapping out your DS with my own (I did both at the same time, so I assumed it was the AR change).

deus0ww commented 6 months ago

Downscaler: The content I watch doesn't contain a lot of black outlines (not anime...). We seem to be trading off different kinds of artifacts, and unless I find another weight that doesn't fail the chroma subsampling pattern (the worst-case scenario), I'm probably going to stick with this one. If you want to help test, try quadratic again but change the 1.5 in `x = 1.5 * abs(d)` in both passes. I have a feeling there's an optimal value somewhere between 1.0 and 2.0...

Upscaler: The X variants, in theory, should perform better, but the LUTs are obviously inappropriate for them. I'm now using R2/R3. If you still have performance headroom, try `//!BIND LUMA` instead of `LUMA_LOWRES` for the ravu pass.

Jules-A commented 6 months ago

> Downscaler: The content I watch doesn't contain a lot of black outlines (not anime...). We seem to be trading off different kinds of artifacts, and unless I find another weight that doesn't fail the chroma subsampling pattern

What about this as a middleground? https://raw.githubusercontent.com/Jules-A/glsl-chroma-from-luma-prediction/main/CfLP_Ravu_QD

It's more expensive speed-wise, but overall not by that much. It scores better in metrics (including the rtings sample) than my downscaler, and it doesn't remove as many outlines as your previous versions, though it still removes a bit too much for my liking (I'm using Ravu for luma, which already removes outlines, so combining the two is a bit too much).

EDIT: Normalising with a jinc window keeps even more of the borders and fixes some red issues. It looks subjectively better on the rtings sample but scores lower :/

EDIT2: Using a radius of 2 for quadratic when combined with the jinc window seems to further improve things.

EDIT3: Scaling to CHROMA 2 * with the above combination seems to work pretty well, though it's slightly better on the rtings sample but slightly worse on every other sample without jinc.

EDIT4: Only for larger factors ofc.

> Upscaler: If you still have performance headroom, try `//!BIND LUMA` instead of `LUMA_LOWRES` for the ravu pass.

Doesn't help as much as I was expecting but the speed cost is massive... With the new downscaler variant I linked I had it do half luma instead.

deus0ww commented 6 months ago

Downscale:

* I have given up on multi-stage downscale, as all my attempts failed at 1.5x luma upscale (720p->1080p, 1440p->4k). I haven't tried this one, but I predict it will fail too, as texOff(0) does badly with fractional downscale.
* Quadratic with `x = 2.0 * abs(d)` works. It doesn't fail, but it's not as good on the test pattern as box. I'm still testing quality on real content. Note that this is a decrease in radius.
* Why window it at all? All the weights we're testing naturally reach zero at some radius, so we might as well modify the weight function directly.
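
To make the "decrease in radius" remark concrete, here is a minimal sketch. It assumes the weight is the common quadratic kernel with support 1.5; the shader's actual weight function is not shown in this thread, and `weight` and `effective_radius` are hypothetical helper names. Under that assumption, multiplying the distance by a larger factor shrinks the distance at which the weight reaches zero.

```python
def quadratic(x: float) -> float:
    """Quadratic kernel with support |x| < 1.5 (assumed form, not taken from the shader)."""
    x = abs(x)
    if x < 0.5:
        return 0.75 - x * x
    if x < 1.5:
        return 0.5 * (x - 1.5) ** 2
    return 0.0


def weight(d: float, scale: float) -> float:
    """Weight of a sample at distance d, mirroring x = scale * abs(d)."""
    return quadratic(scale * abs(d))


def effective_radius(scale: float) -> float:
    """Distance d at which the weight reaches zero for a given scale."""
    return 1.5 / scale


for scale in (1.0, 1.5, 2.0):
    # A larger scale gives a smaller radius in units of d.
    print(f"scale={scale}: radius={effective_radius(scale)}")
```

With these assumptions, going from 1.5 to 2.0 narrows the footprint from 1.0 to 0.75 in units of `d`, so a sample at `d = 0.8` contributes with 1.5 but gets zero weight with 2.0.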

Upscale:

* For me, `//!BIND LUMA` is only slightly slower for R2. For R3, it's significantly faster. I don't understand it at all. It's probably dependent on driver/hardware. Quality-wise, they're subjectively similar enough that I'm just picking the faster one, i.e. R2 w/ LUMA_LOWRES and R3 w/ LUMA.

Jules-A commented 6 months ago

> * I have given up on multi-stage downscale, as all my attempts failed at 1.5x luma upscale (720p->1080p, 1440p->4k). I haven't tried this one, but I predict it will fail too, as texOff(0) does badly with fractional downscale.

If you try my suggestion of scaling until CHROMA 2 *, it will force quadratic to do the fractional downscaling, and texOff(0) will be a clean 2x, assuming 4:2:0 input and luma upscaling.
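
The arithmetic behind that suggestion can be sketched as follows, assuming 4:2:0 input (chroma height is half the source luma height) and a two-stage downscale whose intermediate size is twice the chroma size. `two_stage_factors` is a hypothetical helper for illustration, not something taken from either shader.

```python
from fractions import Fraction


def two_stage_factors(luma_h: int, chroma_h: int) -> tuple:
    """Split the luma -> chroma downscale into a fractional stage and a final stage.

    Stage 1 takes luma down to twice the chroma height (the fractional part).
    Stage 2 takes that intermediate down to the chroma height (always 2x).
    """
    intermediate_h = 2 * chroma_h
    stage1 = Fraction(luma_h, intermediate_h)
    stage2 = Fraction(intermediate_h, chroma_h)
    return stage1, stage2


# 720p 4:2:0 source (chroma height 360) with luma upscaled 1.5x to 1080.
print(two_stage_factors(1080, 360))
# 1440p 4:2:0 source (chroma height 720) with luma upscaled 1.5x to 2160.
print(two_stage_factors(2160, 720))
```

In both of the 1.5x cases mentioned above, the first stage absorbs the 3/2 factor and the second stage is an exact 2x.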

> Quadratic with `x = 2.0 * abs(d)` works

Umm.... I changed all values from 1.5 to 2.0, assuming that would increase the radius :/

> Why window it at all? All the weights we're testing naturally reach zero at some radius, so we might as well modify the weight function directly.

I'm not great at maths... So it's often easier/quicker to try a whole heap of things than calculate the actual value

> * For me, `//!BIND LUMA` is only slightly slower for R2. For R3, it's significantly faster.

That doesn't make any sense... The cost comes from working with the larger-resolution texture. At lower resolutions I guess it's possible for the processing cost to be similar (maybe slower if the gather isn't working well), but going to 4k it SHOULD be slower...

Jules-A commented 6 months ago

I've given up on using Ravu for the entirety of Chroma upscaling and instead swapped back to using it to scale to output, then using my modified version of CFL to Luma. It fixes a whole slew of bugs I was experiencing with CFL without thinning too much (as happens when using Ravu entirely), while still benefiting from its better red handling and anti-aliasing effect and retaining most of the sharpness of CFL. With your chroma-dedicated version of Ravu and my replacement of CFL's upscaler, it's way better than the Ravu mix I had before.

Even scaling Chroma by as little as 1.5x before using CFL is enough to get rid of most CFL bugs in my experience so you can use it to scale to a resolution that will allow CFL to linearly scale afterwards.

This is similar to what I'm doing: https://raw.githubusercontent.com/Jules-A/glsl-chroma-from-luma-prediction/cflexperiments/RavuxCflpo.glsl although that assumes scaling luma > output res.

This is scaling Chroma 1.5x with Ravu though it assumes using a luma scaler (doesn't seem to apply with inbuilt scaling but I probably just messed up the RPN): https://raw.githubusercontent.com/Jules-A/glsl-chroma-from-luma-prediction/cflexperiments/RavuxCflp.glsl

Here are the 1.5x version's test results vs CFL master, with 2x ArtCNN_C4F32 Luma for comparison (the difference is rather large):

RavuxCflp1.5x: MAE: 626.912 (0.00956607), PSNR: 33.4104 (0.334104), DSSIM: 0.138167

![mpv-shot0002](https://github.com/deus0ww/mpv-conf/assets/1760158/42628d6b-d105-4b8b-be49-e8a38d9f600b)

CFLP (Master): MAE: 643.078 (0.00981274), PSNR: 33.1649 (0.331649), DSSIM: 0.138946

![mpv-shot0001](https://github.com/deus0ww/mpv-conf/assets/1760158/a6010a06-639d-43db-9118-cd4af35d90ab)
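
For readers unfamiliar with these metrics, here is a minimal sketch of MAE and PSNR on flat lists of pixel values. The tool that produced the numbers above is not named in the thread; the parenthesised MAE values are consistent with dividing the raw value by 65535, so a 16-bit peak is assumed here. DSSIM is left out because it needs a windowed structural comparison, and the helper names are hypothetical.

```python
import math

PEAK_16BIT = 65535  # assumed scale of the un-normalised values


def mae(a, b):
    """Mean absolute error between two equally sized pixel sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def psnr(a, b, peak=PEAK_16BIT):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# Recompute the normalised MAE values from the raw ones quoted above.
for name, raw in (("RavuxCflp1.5x", 626.912), ("CFLP (Master)", 643.078)):
    print(name, raw / PEAK_16BIT)
```

Lower MAE and DSSIM and higher PSNR are better, so on this sample the 1.5x variant comes out ahead on all three.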

It does better than my previous non-Ravu version in tests where there were visual glitches, like in the test image, and is only slightly worse otherwise, while still being better than CFLP master.

deus0ww commented 6 months ago

For completeness, I added cfl+fsr_easu (and fsr_easu chroma). Different artifact tradeoffs from Ravu.

Jules-A commented 6 months ago

> For completeness, I added cfl+fsr_easu (and fsr_easu chroma). Different artifact tradeoffs from Ravu.

I already tried doing it ages ago and it doesn't look like you've changed anything so I doubt it would be any different but honestly it's been so long I forgot what was wrong with it, just that ravu was significantly better overall. Honestly after fiddling around for so long with Chroma I'm really happy with the variant I'm using now, well at least quality wise, I just wish it was slightly faster as it only just fits in budget and opening other apps that use the GPU can cause a few frames to be delayed. I'm probably not going to fiddle around much more with Ravu variants, I could give FSR mix a shot but I'm not too excited. I think next I'll probably wait until the ArtCNN variants are a little more mature (currently I'm not seeing very good results when mixing with Luma scalers, not even with the ArtCNN luma scalers) before spending any more time testing.

deus0ww commented 6 months ago

Where did you get FSR for chroma?

NeilTohno commented 6 months ago

Well, I'll try it like this,

[high_quality2]
vo=gpu-next
gpu-api=opengl
gpu-context=win
opengl-swapinterval=0
profile=gpu-hq
fbo-format=rgba32f
vd-lavc-threads=16
#scale=ewa_lanczos
scale=ewa_lanczossharp
dscale=mitchell
#cscale=ewa_lanczos
cscale=sinc
cscale-window=blackman
cscale-radius=3
glsl-shaders-append="~~/shaders/CfL_Prediction_FSR.glsl"
glsl-shaders-append="~~/shaders/AMD_FSR_nxt.glsl"
glsl-shaders-append="~~/shaders/AMD_FSR_rgb_nxt.glsl"

Jules-A commented 6 months ago

> Where did you get FSR for chroma?

It was just an old version of the RGB version by agyild (forgot where I got it from), which I only used for chroma. I tried the newest RGB version and it didn't really change anything.

> Well, I'll try it like this,

I have no idea what you are trying to do. It looks like you are scaling luma and chroma but then scaling both with an RGB shader. Honestly, I'm not even sure the last shader will activate unless you remove the WHEN conditions.

deus0ww commented 6 months ago

The old version of FSR that hooks to MAIN calculates luma from rgb, which would not work with YUV (obviously...). My variant scales both chroma planes directly without using/calculating luma.

Jules-A commented 6 months ago

> The old version of FSR that hooks to MAIN calculates luma from rgb, which would not work with YUV (obviously...). My variant scales both chroma planes directly without using/calculating luma.

Oh right, I just double-checked, and it wasn't any of agyild's versions. Still not sure where I got it from though.

EDIT: I tested your version of EASU chroma (not the file in your repository since that's luma despite the name) and it does seem very slightly better overall than the version I was using previously but still doesn't seem to beat Ravu, however it's over 3x faster.

NeilTohno commented 6 months ago

> glsl-shaders-append="~~/shaders/AMD_FSR_nxt.glsl"
> glsl-shaders-append="~~/shaders/AMD_FSR_rgb_nxt.glsl"

(^_−)☆ I got it from here, it's a variant: https://github.com/hooke007/MPV_lazy/tree/main/portable_config/shaders

deus0ww commented 6 months ago

> (^_−)☆ I got it from here, it's a variant: https://github.com/hooke007/MPV_lazy/tree/main/portable_config/shaders

From a brief look, the RGB version from there is based on an old version from agyild that calculates luma from RGB. Hooking that to chroma would not work as expected.

NeilTohno commented 6 months ago

> From a brief look, the RGB version from there is based on an old version from agyild that calculates luma from RGB. Hooking that to chroma would not work as expected.

glsl-shaders-append="~~/shaders/CfL_Prediction_FSR.glsl"
glsl-shaders-append="~~/shaders/AMD_FSR_EASU_luma_nxt.glsl"
glsl-shaders-append="~~/shaders/AMD_CAS_scaled.glsl"
glsl-shaders-append="~~/shaders/AMD_FSR_EASU_rgb_nxt.glsl"
glsl-shaders-append="~~/shaders/AMD_CAS_scaled_rgb.glsl"

Thanks, I will test this.

Jules-A commented 6 months ago

> From a brief look, the RGB version from there is based on an old version from agyild that calculates luma from RGB. Hooking that to chroma would not work as expected.

Actually, that looks to be the version I was using, and it is not based on agyild's version (at least the RGB versions aren't). It does work; however, it's very slightly worse than your version.

> Thanks, I will test this.

Running a tonne of shaders like that won't help and will most likely make the image worse, although I'm betting most won't even activate anyway once you get to the target resolution. The RGB shaders aren't just chroma scalers (they scale the combined image, luma included). The CAS scaled shaders aren't just CAS: they are weird and try to scale the image at the same time. Imo they are rather bad vs just using RCAS (if sharpening is needed).

NeilTohno commented 6 months ago

> glsl-shaders-append="~~/shaders/CfL_Prediction_FSR.glsl"
> glsl-shaders-append="~~/shaders/AMD_FSR_EASU_luma_nxt.glsl"
> glsl-shaders-append="~~/shaders/AMD_CAS_scaled.glsl"
> glsl-shaders-append="~~/shaders/AMD_FSR_EASU_rgb_nxt.glsl"
> glsl-shaders-append="~~/shaders/AMD_CAS_scaled_rgb.glsl"

Yeah, I just need a new FSR shader with a CfL_Reduction patch, my bad.

Jules-A commented 3 months ago

2 new Ravu variants that use a new downscale tactic involving Gaussian:

* 1.5x Ravu Chroma scaling before CFL: https://github.com/Jules-A/glsl-chroma-from-luma-prediction/blob/cflexperiments/CfLP_RavuCx1.5.glsl (requires 2x Luma scaling for Ravu, gets rid of most CFL errors)
* 2x Ravu Chroma scaling before CFL: https://github.com/Jules-A/glsl-chroma-from-luma-prediction/blob/cflexperiments/CfLP_RavuCx2.glsl (requires 2x Luma scaling though better suited for higher, no noticeable CFL errors but not as sharp)

If you get time I'd like to hear how it handles live-action content as I only tested 1 show where it easily surpassed CFL master.

deus0ww commented 3 months ago

Is the gaussian luma downscale the only difference from my CFL+RAVU? I'll try to test them when I have time.

Jules-A commented 3 months ago

> Is the gaussian luma downscale the only difference from my CFL+RAVU? I'll try to test them when I have time.

No. For starters, it's not using Ravu for the full upscale; using Ravu only for the partial initial upscale lets it avoid many or most of CFL's errors while still taking advantage of the sharpness of CFL. I also changed the CFL upscaler for the x1.5 variant while using the one in master for x2, I'm still using hermite with FSR on the downscale before Gaussian, and I adjusted values such as mix_coeff, corr_exponent and AR.

deus0ww commented 3 months ago

Sorry, I'm going to be away from my desktop for a few days and looking at the code on my phone is challenging... Quick questions:

* The luma-guided RAVU variants are not good ([see here](https://github.com/bjin/mpv-prescalers/commit/da5d3c58827e6985035e5f919b52732f9653bfaa)). Have you tried the non-luma-guided (R2X or R3X) variants?
* How's the performance?

Jules-A commented 3 months ago

> * The luma-guided RAVU variants are not good ([see here](https://github.com/bjin/mpv-prescalers/commit/da5d3c58827e6985035e5f919b52732f9653bfaa)). Have you tried the non-luma-guided (R2X or R3X) variants?
> * How's the performance?

For full scaling, of course, but the bilateral chroma scalers have issues that cause clipping or discolouring, mostly with reds, when scaling from Chroma directly. I tried EASU, but it removes too much detail compared to Ravu. I haven't tried guiding the Chroma from downscaled luma yet (like your shaders do), though I assume that would only improve performance. I tried the X variants earlier, and they were worse in all scenarios I tested, but I didn't test them on live-action content.

The 1.5x shader consumes 4,400 when run with ArtCNN16 and other settings compared to 6,886 with your R3 (non-X) variant.

deus0ww commented 3 months ago

* I'm using CFL+FSR by default because it's the fastest acceptable one. Anything else can become too slow with ArtCNN (luma), which has higher priority. Otherwise I would use CFL+RAVU R3X, which looks slightly better than CFL+FSR.
* For the luma-guided RAVU, I'm indeed using LUMA_LOWRES for performance (subjectively no difference from using LUMA).
* The problem with luma-guided RAVU variants is that they underperform in the same places where CfL also underperforms (areas with low or no luma/chroma correlation). In these areas, even FSR performs better.
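
Why low luma/chroma correlation hurts can be illustrated with a minimal sketch of a CfL-style predictor. This is an assumption about the general shape of such predictors (a local linear fit of chroma on luma, blended with a spatial estimate according to the correlation), not code taken from either shader. The parameter names `mix_coeff` and `corr_exponent` appear earlier in the thread, but the defaults here are placeholders.

```python
import math


def cfl_predict(luma_nbrs, chroma_nbrs, luma_center, chroma_spatial,
                mix_coeff=0.8, corr_exponent=2.0):
    """Blend a spatially filtered chroma value with a luma-based linear prediction."""
    n = len(luma_nbrs)
    luma_avg = sum(luma_nbrs) / n
    chroma_avg = sum(chroma_nbrs) / n
    luma_var = sum((l - luma_avg) ** 2 for l in luma_nbrs)
    chroma_var = sum((c - chroma_avg) ** 2 for c in chroma_nbrs)
    cov = sum((l - luma_avg) * (c - chroma_avg)
              for l, c in zip(luma_nbrs, chroma_nbrs))

    eps = 1e-6
    # Absolute correlation in [0, 1]; 0 when luma tells us nothing about chroma.
    corr = min(abs(cov) / max(math.sqrt(luma_var * chroma_var), eps), 1.0)
    # Least-squares line chroma = alpha * luma + beta over the neighbourhood.
    alpha = cov / max(luma_var, eps)
    beta = chroma_avg - alpha * luma_avg
    predicted = min(max(alpha * luma_center + beta, 0.0), 1.0)

    # The weaker the correlation, the more the spatial estimate dominates.
    t = (corr ** corr_exponent) * mix_coeff
    return (1.0 - t) * chroma_spatial + t * predicted
```

In a flat-luma neighbourhood the correlation is zero and the result falls back entirely on `chroma_spatial`, so the final quality there depends on whatever produced that spatial estimate, which is where a luma-guided scaler has nothing useful to work with either.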

Jules-A commented 3 months ago

> I'm using CFL+FSR by default because it's the fastest acceptable one. Anything else can become too slow with ArtCNN (luma), which has higher priority. Otherwise I would use CFL+RAVU R3X, which looks slightly better than CFL+FSR.
>
> For the luma-guided RAVU, I'm indeed using LUMA_LOWRES for performance (subjectively no difference from using LUMA).

Ah, I tested using LUMA_LOWRES which lowered it from 4400 to 2500 and actually reduced the amount of errors but significantly increased aliasing. Turned out I forgot I was upscaling CFL with Chroma that was 1.5x Luma so I needed to scale LUMA to 1.5x first to match. End result is this: https://github.com/Jules-A/glsl-chroma-from-luma-prediction/blob/9ce5a25082f00d093de03ba668b8a05183a04ee5/CfLP_RavuCx1.5LR.glsl

which is around ~2300 and imo seems slightly better but haven't fully tested. Your FSR version is around ~1750 for comparison.

Obviously it's pretty messy like that so I need to rework the downscaler.

deus0ww / mpv-conf

RavuxCFL variant testing #28