Closed hauuau closed 10 months ago
Thank you for your excellent work! To be honest, reviewing this PR is beyond my ability, and I think it would be better to make it a separate project (i.e., a fork of mpv-prescalers), and then contribute the generated effect files to this repository. Please note that the ongoing #643 will change the shader syntax again.
Considering that this PR is very complex, the review will take a long time, but I am busy implementing #643. If you insist on contributing to this repository, please be patient, and I will come back after 1.0 is released, including adapting to the new syntax and fixing bugs.
It may indeed be much easier to review and maintain changes to generation scripts in a separate repository as a fork of mpv-prescalers.
I have created it here and made necessary commits to bring it to the same state as this pull request.
Generation scripts for these shaders are in the source-magpie
branch. This is a comparison between upstream and that branch: diff
Generated shaders for Magpie are in the magpie
branch. This is a comparison between generated mpv and magpie versions: diff
I'll mark this pull request as a draft as this feels more appropriate in this case.
Do you have any preferences in how I should format pull requests for generated effects (single commit, separate commits, separate pull requests)?
I'll take a look at #643 and will make necessary changes. I'm fine with deferring this to after that gets merged.
If you have any requests for other changes I should make to generation scripts to make generated effects more fit to inclusion, I'll happily consider implementing them.
I'll investigate the bug in the original RAVU-Zoom further and will try contacting upstream about it.
Do you have any preferences in how I should format pull requests for generated effects (single commit, separate commits, separate pull requests)?
Please open a new pull request and copy the generated effect files to the RAVU and NNEDI3 folders.
I noticed that the generated effect files are all under LGPL. Is it possible to re-license them as GPL, just like the other effects in Magpie?
I have switched license of generated shaders to GPL3+ as relicensing from LGPL3+ to GPL3+ is allowed according to the GPL Compatibility Matrix.
I'll submit separate pull requests for both v3 and v4 generated shaders.
Awesome! Let's switch over to the new PRs, I think this one can be closed.
Sure.
As soon as I lower that 0.5 coefficient to something like 0.4 the bug goes away. Might it be a rounding bug? I wonder why the upstream doesn't seem to be affected by it.
Could you verify that setting the coefficient to 0.49999 fixes the issue? If so, it can be confirmed as a float rounding issue.
AFAICT, it's very difficult to trigger such floating rounding issue. First, it has to be perfect odd integer scaling as a starter, such as 1x or 3x upscaling, to have texel centers somehow matches before and after ravu-zoom. And considering group size is set to 32x8
, 3x or 5x upscaling won't hit the issue as well. So, overall, this bug likely will only be triggered if the scaling factor is exactly 1. And in mpv, I have a condition check to bypass ravu-zoom shader in such case //!WHEN HOOKED.w OUTPUT.w < HOOKED.h OUTPUT.h < *
.
Second, the shader compiler has to do some kind of optimization so that the following two statements was handled differently for the same ivec2 ID
. In mpv, all coordinates floats are configured to use high precision IIRC.
ivec2 rectr = ivec2(floor(HOOKED_size * HOOKED_map(ID) - 0.5)) + ...;
vec2 pos = HOOKED_size * HOOKED_map(ID); ivec2 ipos = ivec2(floor(pos)) - rectl
This is just a quick look so I might made some mistakes.
@hauuau Could you confirm that:
#pragma optionNV(fastmath off)
or #pragma optionNV(fastprecision off)
for Nvidia cards?
- Setting the number to 0.49999 fixes the issue?
Setting coefficient in the rectr
calculation to 0.49999 or 0.499999 seems to fix the issue. It comes back at 0.4999999.
- Magpie applies ravu-zoom even on 1x scaling factor?
- This issue can only be triggered with 1x scaling factor?
Magpie allows applying ravu-zoom and other arbitrary scaling shaders on 1x scaling but 1x doesn't seem to trigger this issue for me. For me it's usually triggered with 3x scaling (1280x720 → 3840x2160). The issue manifests itself either as random noise on a grid of block size or at least with 32x8 block size it's more often a horizontal 1px line of random noise 8px from the top of the screen.
- try turning some kind of optimization of with pragma like #pragma optionNV(fastmath off) or #pragma optionNV(fastprecision off) for Nvidia cards?
I'm not on NVidia hardware so optionNV
pragmas do nothing for me and I have failed to find similar ones for AMD. I'm not sure if they work on HLSL or not. I'm very new to this stuff. And they are not mentioned in Microsoft's HLSL documentation.
HLSL has precise
keyword to turn off optimizations during calculations of designated variables but it seems to have quite weird effect on this bug. If I declare rectr
as precise
then the bug goes away but if I try declaring both rectr
and rectl
as precise
then it comes back. Maybe I need to try poking more variables with this keyword but I haven't tried further.
@hauuau I tried 3x upscaling with mpv ravu-zoom just now, and somehow I'm still unable to reproduce. Guess it highly depends on shader compiler then.
Anyway, thanks for your tests. Able to verify that it's a float rounding issue is great.
@bjin
Even on HLSL with Magpie it seems to depend a lot on how different variables in those calculations are declared (even without precise
keyword). Sometimes it seems to go away if temporary variable is inserted somewhere in this calculation.
I failed to reproduce it on mpv too.
Should lowering coefficient in the rectr
calculation be considered a proper fix in this case?
Should lowering coefficient in the
rectr
calculation be considered a proper fix in this case?
Yes, rectr
is only used to pre-fetch sampling area, so it's safe to be a bit larger, as long as it's smaller than the size of shared samples[]
. And samples[]
are already declared slightly larger even for 1x scaling ((H+6)*(W+6)
instead of (H+5)*(W+5)
).
@bjin Thank you very much! This helped me a lot.
I can reliably reproduce this error at odd integer scaling (1x, 3x, 5x, etc.), where the calculation result of floor(HOOKED_size * HOOKED_map(group_end) - 0.5)
becomes unstable due to floating-point error. Changing 0.5 to 0.49 might be a better solution, as it overcomes the error effect and has almost no performance loss.
I was debugging some other issue, but found interesting insight how (gather) sampling location is computed. [1] In short, sampling location is converted to at least 16.8 fixed point and while in this case it is gather specific, the same apply to linear scaling [2]. There is also in [1] interlude about nearest filtering.
And sure with linear filtering it shouldn't cause a problem, but I found it interesting detail which can produce strange results, in specific cases. So I felt like sharing it also here, if anything as a curiosity. Maybe for GPU experts this is obvious knowledge, but for me it was interesting :)
[1] https://www.reedbeta.com/blog/texture-gathers-and-coordinate-precision/ [2] https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm#7.18.8%20Linear%20Sample%20Addressing
Pixels and polygons and shaders, oh my!
This is a port of scripts by @bjin which are used to generate shaders in the mpv-prescalers repository.
I'd like some comments about viability of this approach.
I tried to keep things as close to the upstream as reasonably possible to make it easier to backport potential upstream changes in the future.
Performance doesn't seem to be different from already included versions but I haven't conducted much performance testing. It seems it could be possible to get some performance gains from playing with the
--compute-shader-block-size
option though but they're probably hardware specific.I have tested generated shaders and they are functional. These include the new versions of RAVU-Zoom and RAVU-Lite with Anti-ringing built in. RGB versions are working too.
RAVU-Zoom generated by these scripts doesn't seem to suffer from the bug #554.
But I have unfortunately encountered something very similar to #516 during development and I'm utterly confused by it. If I do something like
then the bug manifests itself. But if I do something like
then the bug goes away for some reason. That seems like some kind of compiler bug to me but I haven't investigated further.
That bug might be somehow related to either of these lines in RAVU-Zoom:
That's the line affected by difference in definition of HOOKED_size.
Or it might be this one:
As soon as I lower that 0.5 coefficient to something like 0.4 the bug goes away. Might it be a rounding bug? I wonder why the upstream doesn't seem to be affected by it.
It's my first experience dealing with both GLSL and HLSL so I bet there are some mistakes or pitfalls I have fallen into.