Open Ranoiaetep opened 3 years ago
This is weird, v2.1a_L and v2.1a are identical except the L version has a couple more CNN layers... Does enabling each of the five shaders individually cause problems? Have you also tried other mpv shaders? We have to narrow down the problem as I can see that the shaders are running in the screenshot, but something is causing illegal/wrong texture accesses, which causes all of the artifacts you see. Finally, are you using the v3.1 release version or the most current shaders in the repo?
Thank you for quick reply.
Does enabling each of the five shaders individually cause problems?
Good suspect, seems like it was only the Anime4K_Denoise_Bilateral_Mode
causing the problem here. The rest of them seems to work fine.
Finally, are you using the v3.1 release version or the most current shaders in the repo?
Yes, it was downloaded from your releases section.
v2.1a_L and v2.1a are identical except the L version has a couple more CNN layers...
Here's a screenshot for them:
Just checked, Denoise_Bilateral_Median
and Denoise_Bilateral_Mean
works fine as well.
This is even more strange, Bilateral_Median is more complex than Bilateral_Mode, except that a part of the code is disabled by default. What happens if you set #define HISTOGRAM_REGULARIZATION 0.0
in Bilateral_Median to something like #define HISTOGRAM_REGULARIZATION 0.3
? Does the bug appear? If so I think the Radeon Pro 555 mobile GPU might have some kind of driver bug.
#define HISTOGRAM_REGULARIZATION 0.3
? Does the bug appear?
As you suspected, changing it to anything larger than 0.0 creates the bug.
What I think might be happening is that this part of the code causes the GPU wavefronts to go out of sync. This should not happen with modern GPUs as they all can handle this type of "branching" very well if properly unrolled during compilation.
for (int i=0; i<KERNELLEN; i++) {
...
for (int j=(i+1); j<KERNELLEN; j++) {
...
}
}
Here you see that int j = (i+1)
can confuse the GPU threads if not properly compiled and unrolled, as the length of a loop in a GPU should not be variable.
Or it is simply because the regularization part uses 9 more registers (float histogram_wn[KERNELLEN];
), which a lower-end GPU might not have enough. But in that case the GPU should slow down and not completely give up like in your screenshot.
As I do not have access to your GPU I'm afraid I cannot debug this unless mpv supports a full GPU profiler dump. (Which I'm not aware of.)
I have quickly modified the bilateral mode denoiser so it no longer uses the extra registers and also removed the nested for loop. This of course is bad for readability and theoretically performs worse, but might work for you. Feel free to try it when you see fit. Anime4K_Denoise_Bilateral_Mode_Modified.txt
Thank you for your insights. I don't possess the hardwares to test them right now, I will test them tomorrow.
Something I did notice earlier, not sure if helpful:
getMode
function simply return v[0]
, it will display a fine image (probably not averaged yet)getMode
function return anything between v[1]
to v[8]
, it will result in similar problems.Probably worth noting old AMD driver versions on Windows used to get a corrupt image (sort of similar) with this shader about half a year ago in DX but not in OGL/Vulkan.
That is what I thought too, as I saw people having this issue with older GPUs, but here the user has macOS, which should be using either OGL and Vulkan.
That is what I thought too, as I saw people having this issue with older GPUs, but here the user has macOS, which should be using either OGL and Vulkan.
I'm really kind of clueless when it comes to MACs but I thought OGL was deprecated and Vulkan can't be used directly but instead translated to Metal via MoltenVK? No idea what MPV does though.
I have quickly modified the bilateral mode denoiser so it no longer uses the extra registers and also removed the nested for loop. This of course is bad for readability and theoretically performs worse, but might work for you. Feel free to try it when you see fit. Anime4K_Denoise_Bilateral_Mode_Modified.txt
Just tried your modified version, still doesn't seem to work.
However, I just tried manually write out the array as 9 separate variables, and expand the for loop manually, that seems to solve it. See as attached. Anime4K_Denoise_Bilateral_Mode.txt
I'm really kind of clueless when it comes to MACs but I thought OGL was deprecated and Vulkan can't be used directly but instead translated to Metal via MoltenVK? No idea what MPV does though.
https://github.com/mpv-player/mpv/pull/7482 For mac vk is still WIP, OGL is the default backend.
To Reproduce Currently the config file is set as:
Screenshots
Desktop (please complete the following information):
Is there anything I need to manually install beside mpv? I simply followed the
Installation Instruction
on the release page.Tried to apply only a single shader, seems to result in similar problem.
Just checked, the v2.1a_L works totally fine for me, but v2.1a doesn't work.