bloc97 / Anime4K

A High-Quality Real Time Upscaler for Anime Video
https://bloc97.github.io/Anime4K/
MIT License
18k stars 1.34k forks source link

Shader seems to not work #118

Open Ranoiaetep opened 3 years ago

Ranoiaetep commented 3 years ago

To Reproduce Currently the config file is set as:

glsl-shaders="~~/shaders/Anime4K_Denoise_Bilateral_Mode.glsl:~~/shaders/Anime4K_Deblur_DoG.glsl:~~/shaders/Anime4K_DarkLines_HQ.glsl:~~/shaders/Anime4K_ThinLines_HQ.glsl:~~/shaders/Anime4K_Upscale_CNN_M_x2_Deblur.glsl"

Screenshots Screen Shot 2021-04-13 at 3 03 14 PM

Desktop (please complete the following information):

Is there anything I need to manually install beside mpv? I simply followed the Installation Instruction on the release page.


Tried to apply only a single shader, seems to result in similar problem.


Just checked, the v2.1a_L works totally fine for me, but v2.1a doesn't work.

bloc97 commented 3 years ago

This is weird, v2.1a_L and v2.1a are identical except the L version has a couple more CNN layers... Does enabling each of the five shaders individually cause problems? Have you also tried other mpv shaders? We have to narrow down the problem as I can see that the shaders are running in the screenshot, but something is causing illegal/wrong texture accesses, which causes all of the artifacts you see. Finally, are you using the v3.1 release version or the most current shaders in the repo?

Ranoiaetep commented 3 years ago

Thank you for quick reply.

Does enabling each of the five shaders individually cause problems?

Good suspect, seems like it was only the Anime4K_Denoise_Bilateral_Mode causing the problem here. The rest of them seems to work fine.


Finally, are you using the v3.1 release version or the most current shaders in the repo?

Yes, it was downloaded from your releases section.

v2.1a_L and v2.1a are identical except the L version has a couple more CNN layers...

Here's a screenshot for them:

Screen Shot 2021-04-13 at 4 26 03 PM Screen Shot 2021-04-13 at 4 25 38 PM

Ranoiaetep commented 3 years ago

Just checked, Denoise_Bilateral_Median and Denoise_Bilateral_Mean works fine as well.

bloc97 commented 3 years ago

This is even more strange, Bilateral_Median is more complex than Bilateral_Mode, except that a part of the code is disabled by default. What happens if you set #define HISTOGRAM_REGULARIZATION 0.0 in Bilateral_Median to something like #define HISTOGRAM_REGULARIZATION 0.3? Does the bug appear? If so I think the Radeon Pro 555 mobile GPU might have some kind of driver bug.

Ranoiaetep commented 3 years ago

#define HISTOGRAM_REGULARIZATION 0.3? Does the bug appear?

As you suspected, changing it to anything larger than 0.0 creates the bug.

bloc97 commented 3 years ago

What I think might be happening is that this part of the code causes the GPU wavefronts to go out of sync. This should not happen with modern GPUs as they all can handle this type of "branching" very well if properly unrolled during compilation.

for (int i=0; i<KERNELLEN; i++) {
    ...
    for (int j=(i+1); j<KERNELLEN; j++) {
        ...
    }
}

Here you see that int j = (i+1) can confuse the GPU threads if not properly compiled and unrolled, as the length of a loop in a GPU should not be variable.

Or it is simply because the regularization part uses 9 more registers (float histogram_wn[KERNELLEN];), which a lower-end GPU might not have enough. But in that case the GPU should slow down and not completely give up like in your screenshot.

As I do not have access to your GPU I'm afraid I cannot debug this unless mpv supports a full GPU profiler dump. (Which I'm not aware of.)

bloc97 commented 3 years ago

I have quickly modified the bilateral mode denoiser so it no longer uses the extra registers and also removed the nested for loop. This of course is bad for readability and theoretically performs worse, but might work for you. Feel free to try it when you see fit. Anime4K_Denoise_Bilateral_Mode_Modified.txt

Ranoiaetep commented 3 years ago

Thank you for your insights. I don't possess the hardwares to test them right now, I will test them tomorrow.

Something I did notice earlier, not sure if helpful:

Jules-A commented 3 years ago

Probably worth noting old AMD driver versions on Windows used to get a corrupt image (sort of similar) with this shader about half a year ago in DX but not in OGL/Vulkan.

bloc97 commented 3 years ago

That is what I thought too, as I saw people having this issue with older GPUs, but here the user has macOS, which should be using either OGL and Vulkan.

Jules-A commented 3 years ago

That is what I thought too, as I saw people having this issue with older GPUs, but here the user has macOS, which should be using either OGL and Vulkan.

I'm really kind of clueless when it comes to MACs but I thought OGL was deprecated and Vulkan can't be used directly but instead translated to Metal via MoltenVK? No idea what MPV does though.

Ranoiaetep commented 3 years ago

I have quickly modified the bilateral mode denoiser so it no longer uses the extra registers and also removed the nested for loop. This of course is bad for readability and theoretically performs worse, but might work for you. Feel free to try it when you see fit. Anime4K_Denoise_Bilateral_Mode_Modified.txt

Just tried your modified version, still doesn't seem to work.

However, I just tried manually write out the array as 9 separate variables, and expand the for loop manually, that seems to solve it. See as attached. Anime4K_Denoise_Bilateral_Mode.txt

hooke007 commented 3 years ago

I'm really kind of clueless when it comes to MACs but I thought OGL was deprecated and Vulkan can't be used directly but instead translated to Metal via MoltenVK? No idea what MPV does though.

https://github.com/mpv-player/mpv/pull/7482 For mac vk is still WIP, OGL is the default backend.