Blinue / Magpie

An all-purpose window upscaler for Windows 10/11.
GNU General Public License v3.0
9.66k stars 496 forks source link

[RFC] A port of generation scripts from the mpv-prescalers project #803

Closed hauuau closed 10 months ago

hauuau commented 10 months ago

This is a port of scripts by @bjin which are used to generate shaders in the mpv-prescalers repository.

I'd like some comments about viability of this approach.

I tried to keep things as close to the upstream as reasonably possible to make it easier to backport potential upstream changes in the future.

Performance doesn't seem to be different from already included versions but I haven't conducted much performance testing. It seems it could be possible to get some performance gains from playing with the --compute-shader-block-size option though but they're probably hardware specific.

I have tested generated shaders and they are functional. These include the new versions of RAVU-Zoom and RAVU-Lite with Anti-ringing built in. RGB versions are working too.

RAVU-Zoom generated by these scripts doesn't seem to suffer from the bug #554.

But I have unfortunately encountered something very similar to #516 during development and I'm utterly confused by it. If I do something like

#define HOOKED_size float2(GetInputSize())

then the bug manifests itself. But if I do something like

static const float2 HOOKED_size = float2(GetInputSize());

then the bug goes away for some reason. That seems like some kind of compiler bug to me but I haven't investigated further.

That bug might be somehow related to either of these lines in RAVU-Zoom:

vec2 pos = HOOKED_size * HOOKED_map(ivec2(gl_GlobalInvocationID));

That's the line affected by difference in definition of HOOKED_size.

Or it might be this one:

ivec2 rectr = ivec2(floor(HOOKED_size * HOOKED_map(group_end) - 0.5)) + %d;

As soon as I lower that 0.5 coefficient to something like 0.4 the bug goes away. Might it be a rounding bug? I wonder why the upstream doesn't seem to be affected by it.

It's my first experience dealing with both GLSL and HLSL so I bet there are some mistakes or pitfalls I have fallen into.

Blinue commented 10 months ago

Thank you for your excellent work! To be honest, reviewing this PR is beyond my ability, and I think it would be better to make it a separate project (i.e., a fork of mpv-prescalers), and then contribute the generated effect files to this repository. Please note that the ongoing #643 will change the shader syntax again.

516 is a bug that existed in the original version of RAVU, caused by insufficient sampling area of the original texture. It's very complex to implement effects that support arbitrary scaling factors with compute shaders.

Blinue commented 10 months ago

Considering that this PR is very complex, the review will take a long time, but I am busy implementing #643. If you insist on contributing to this repository, please be patient, and I will come back after 1.0 is released, including adapting to the new syntax and fixing bugs.

hauuau commented 10 months ago

It may indeed be much easier to review and maintain changes to generation scripts in a separate repository as a fork of mpv-prescalers. I have created it here and made necessary commits to bring it to the same state as this pull request. Generation scripts for these shaders are in the source-magpie branch. This is a comparison between upstream and that branch: diff

Generated shaders for Magpie are in the magpie branch. This is a comparison between generated mpv and magpie versions: diff

I'll mark this pull request as a draft as this feels more appropriate in this case.

Do you have any preferences in how I should format pull requests for generated effects (single commit, separate commits, separate pull requests)?

I'll take a look at #643 and will make necessary changes. I'm fine with deferring this to after that gets merged.

If you have any requests for other changes I should make to generation scripts to make generated effects more fit to inclusion, I'll happily consider implementing them.

I'll investigate the bug in the original RAVU-Zoom further and will try contacting upstream about it.

Blinue commented 10 months ago

Do you have any preferences in how I should format pull requests for generated effects (single commit, separate commits, separate pull requests)?

Please open a new pull request and copy the generated effect files to the RAVU and NNEDI3 folders.

I noticed that the generated effect files are all under LGPL. Is it possible to re-license them as GPL, just like the other effects in Magpie?

hauuau commented 10 months ago

I have switched license of generated shaders to GPL3+ as relicensing from LGPL3+ to GPL3+ is allowed according to the GPL Compatibility Matrix.

I'll submit separate pull requests for both v3 and v4 generated shaders.

Blinue commented 10 months ago

Awesome! Let's switch over to the new PRs, I think this one can be closed.

hauuau commented 10 months ago

Sure.

bjin commented 10 months ago

As soon as I lower that 0.5 coefficient to something like 0.4 the bug goes away. Might it be a rounding bug? I wonder why the upstream doesn't seem to be affected by it.

Could you verify that setting the coefficient to 0.49999 fixes the issue? If so, it can be confirmed as a float rounding issue.

AFAICT, it's very difficult to trigger such floating rounding issue. First, it has to be perfect odd integer scaling as a starter, such as 1x or 3x upscaling, to have texel centers somehow matches before and after ravu-zoom. And considering group size is set to 32x8, 3x or 5x upscaling won't hit the issue as well. So, overall, this bug likely will only be triggered if the scaling factor is exactly 1. And in mpv, I have a condition check to bypass ravu-zoom shader in such case //!WHEN HOOKED.w OUTPUT.w < HOOKED.h OUTPUT.h < *.

Second, the shader compiler has to do some kind of optimization so that the following two statements was handled differently for the same ivec2 ID. In mpv, all coordinates floats are configured to use high precision IIRC.

ivec2 rectr = ivec2(floor(HOOKED_size * HOOKED_map(ID) - 0.5)) + ...;

vec2 pos = HOOKED_size * HOOKED_map(ID); ivec2 ipos = ivec2(floor(pos)) - rectl

This is just a quick look so I might made some mistakes.

@hauuau Could you confirm that:

  1. Setting the number to 0.49999 fixes the issue?
  2. Magpie applies ravu-zoom even on 1x scaling factor?
  3. This issue can only be triggered with 1x scaling factor?
  4. I don't know HLSL well, but could you try turning some kind of optimization of with pragma like #pragma optionNV(fastmath off) or #pragma optionNV(fastprecision off) for Nvidia cards?
hauuau commented 10 months ago
  1. Setting the number to 0.49999 fixes the issue?

Setting coefficient in the rectr calculation to 0.49999 or 0.499999 seems to fix the issue. It comes back at 0.4999999.

  1. Magpie applies ravu-zoom even on 1x scaling factor?
  2. This issue can only be triggered with 1x scaling factor?

Magpie allows applying ravu-zoom and other arbitrary scaling shaders on 1x scaling but 1x doesn't seem to trigger this issue for me. For me it's usually triggered with 3x scaling (1280x720 → 3840x2160). The issue manifests itself either as random noise on a grid of block size or at least with 32x8 block size it's more often a horizontal 1px line of random noise 8px from the top of the screen.

  1. try turning some kind of optimization of with pragma like #pragma optionNV(fastmath off) or #pragma optionNV(fastprecision off) for Nvidia cards?

I'm not on NVidia hardware so optionNV pragmas do nothing for me and I have failed to find similar ones for AMD. I'm not sure if they work on HLSL or not. I'm very new to this stuff. And they are not mentioned in Microsoft's HLSL documentation.

HLSL has precise keyword to turn off optimizations during calculations of designated variables but it seems to have quite weird effect on this bug. If I declare rectr as precise then the bug goes away but if I try declaring both rectr and rectl as precise then it comes back. Maybe I need to try poking more variables with this keyword but I haven't tried further.

bjin commented 10 months ago

@hauuau I tried 3x upscaling with mpv ravu-zoom just now, and somehow I'm still unable to reproduce. Guess it highly depends on shader compiler then.

Anyway, thanks for your tests. Able to verify that it's a float rounding issue is great.

hauuau commented 10 months ago

@bjin Even on HLSL with Magpie it seems to depend a lot on how different variables in those calculations are declared (even without precise keyword). Sometimes it seems to go away if temporary variable is inserted somewhere in this calculation. I failed to reproduce it on mpv too.

Should lowering coefficient in the rectr calculation be considered a proper fix in this case?

bjin commented 10 months ago

Should lowering coefficient in the rectr calculation be considered a proper fix in this case?

Yes, rectr is only used to pre-fetch sampling area, so it's safe to be a bit larger, as long as it's smaller than the size of shared samples[]. And samples[] are already declared slightly larger even for 1x scaling ((H+6)*(W+6) instead of (H+5)*(W+5)).

hauuau commented 10 months ago

@bjin Thank you very much! This helped me a lot.

Blinue commented 10 months ago

I can reliably reproduce this error at odd integer scaling (1x, 3x, 5x, etc.), where the calculation result of floor(HOOKED_size * HOOKED_map(group_end) - 0.5) becomes unstable due to floating-point error. Changing 0.5 to 0.49 might be a better solution, as it overcomes the error effect and has almost no performance loss.

kasper93 commented 8 months ago

I was debugging some other issue, but found interesting insight how (gather) sampling location is computed. [1] In short, sampling location is converted to at least 16.8 fixed point and while in this case it is gather specific, the same apply to linear scaling [2]. There is also in [1] interlude about nearest filtering.

And sure with linear filtering it shouldn't cause a problem, but I found it interesting detail which can produce strange results, in specific cases. So I felt like sharing it also here, if anything as a curiosity. Maybe for GPU experts this is obvious knowledge, but for me it was interesting :)

[1] https://www.reedbeta.com/blog/texture-gathers-and-coordinate-precision/ [2] https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm#7.18.8%20Linear%20Sample%20Addressing

Texture Gathers and Coordinate Precision – Nathan Reed’s coding blog
Pixels and polygons and shaders, oh my!
Direct3D 11.3 Functional Specification