Open bjin opened 7 years ago
Hi bjin,
Thanks for reaching out. The RAVU algorithm looks interesting. I'd definitely be interested in trying to get it to work with MPDN. The results on that comparison picture looked quite good.
By the way, you said you made a chroma version that looked at the interpolated luma channel. Have you tried one that looks at the full resolution luma and uses that to scale the chroma? (you'll probably need 4(?) different sets of weights to handle all cases).
I'm also intrigued by the SPIR-V cross compiling, would it in theory work for all MPV user shaders? Does it work both ways?
Unfortunately I'm in the middle of some changes to the extension framework at the moment, so it could take a while before I get round to actually implementing RAVU.
Cheers, Shiandow
> Have you tried one that looks at the full resolution luma and uses that to scale the chroma? (you'll probably need 4(?) different sets of weights to handle all cases).
The quality of luma information (for `ravu-chroma`) is not that important for the current model. It only matters for calculating the discrete key `(angle, strength, coherence)`.
We could modify the model and make the chroma channel depend directly on the luma channel, but I don't think a simple linear model would work here. The semantics of luma and chroma are quite different. However, a deep NN model could probably handle this well.
> I'm also intrigued by the SPIR-V cross compiling, would it in theory work for all MPV user shaders?
The `GLSL->SPIRV` part is actually quite essential. It's the reference implementation of the SPIR-V compiler. The `SPIRV->HLSL` part, on the other hand, is experimental. In theory, though, I think it should be enough to cover the functionality mpv uses (targeting d3d11 or d3d12), including UBO/SSBO and compute shaders.
> Does it work both ways?
I don't know. But glslang supports HLSL as well. So, could be.
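For anyone who wants to try the `GLSL->SPIRV->HLSL` route by hand, here is a minimal sketch that drives the two command-line tools, `glslangValidator` and `spirv-cross`. It assumes both are installed and on `PATH`, and that the input is a complete, standalone GLSL shader (an mpv user shader is only a fragment with `//!HOOK` directives, so it would first have to be wrapped into a full shader). The helper names and the intermediate `.spv` file name are made up for illustration; this is not how mpv itself performs the conversion.

```python
import shutil
import subprocess


def glsl_to_spirv_cmd(glsl_path, spv_path, stage="frag"):
    """Command line for glslangValidator: GLSL -> SPIR-V (Vulkan semantics)."""
    return ["glslangValidator", "-V", "-S", stage, "-o", spv_path, glsl_path]


def spirv_to_hlsl_cmd(spv_path, hlsl_path, shader_model=50):
    """Command line for spirv-cross: SPIR-V -> HLSL (shader model 5.0 by default)."""
    return ["spirv-cross", "--hlsl", "--shader-model", str(shader_model),
            spv_path, "--output", hlsl_path]


def cross_compile(glsl_path, hlsl_path, stage="frag"):
    """Run both steps in order; raises if a tool is missing or exits with an error."""
    spv_path = glsl_path + ".spv"  # hypothetical intermediate file name
    for cmd in (glsl_to_spirv_cmd(glsl_path, spv_path, stage),
                spirv_to_hlsl_cmd(spv_path, hlsl_path)):
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH")
        subprocess.run(cmd, check=True)
```

Calling `cross_compile("pass1.frag", "pass1.hlsl")` would then write the HLSL next to the source, provided the shader compiles under Vulkan semantics (explicit `#version`, bindings and locations).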
> The quality of luma information (for `ravu-chroma`) is not that important for the current model. It only matters for calculating the discrete key `(angle, strength, coherence)`.
In my own experiments with luma-guided chroma scaling, I encountered a few cases where some features got lost when the luma was downsampled. This tended to cause some problems.
Anyway, it looks like I should look into SPIR-V some time, although for now it's probably best to try to port RAVU first and see if that process can be streamlined a bit.
I recently implemented a prescaler for mpv named RAVU, based on RAISR (Rapid and Accurate Image Super Resolution). After several iterations of improvement and performance tuning, I consider the current code feature complete and somewhat stable (no major changes to the shader are planned, probably only improvements to the model weights). I also recently got a report that RAVU works fine with the current work-in-progress native Direct3D 11 renderer of mpv (`HLSL` cross compiled by `GLSL->SPIRV->HLSL`). So I now believe it's a viable option to have RAVU ported as a renderscript for MPDN.

Currently, RAVU is tuned for anime only. The linear regression method it uses is just too simple to fit both anime-style pictures and live-action photos. I also tried to train RAVU on real photos, but the result (validated with an independent selection of real photos) is not impressive, just slightly better than EWA scalers visually. You can find a comparison of NNEDI3 and RAVU on anime-style pictures here. However, performance-wise, RAVU with `radius=3` is about 5 times faster than NNEDI3 with `neurons=32`; see details here.

I will explain the basics of how RAVU works, and if anyone is interested, I can explain more details.
1. Sample the `n * n` pixels in the neighborhood (`n=2*r` for ravu, and `n=2*r-1` for ravu-lite).
2. Calculate the gradient and derive the discrete key `(angle, strength, coherence)`.
3. Use the key to look up the weights (a `vec4/float4` array) to obtain a convolution kernel.
4. Apply the kernel to the `n * n` pixels and get the result.

However, there are several variants of RAVU to fit different scenarios. I think the `luma` and `rgb` variants of RAVU, and RAVU-lite, are the most interesting.

- `-yuv`: uses the first channel (the `luma` channel) to calculate the gradient, and upscales all three planes.
- `-rgb`: calculates the luma channel from the `rgb` channels to calculate the gradient, and upscales all three planes (most universal, since all video is converted to RGB in mpv before upscaling).
- `-chroma`: samples the luma channel separately from the source plane (with bilinear texture sampling), and upscales the chroma planes.

There is also a `gather` version utilizing `textureGatherOffset` (`GatherRed` in HLSL) and a `compute` version utilizing a compute shader (`DirectCompute`) for further performance improvement.

I also cross compiled some sample shaders into HLSL for reference purposes. The (unrolled) coordinates are currently generated by a Python script.

- `ravu-lite-r3`: pass1 (GLSL HLSL) combine_pass (GLSL HLSL)
- `ravu-r3`: pass1 (GLSL HLSL) pass2 (GLSL HLSL) pass3 (GLSL HLSL) combine_pass (GLSL HLSL)

EDIT: I don't have the environment or knowledge to develop HLSL shaders and C# scripts for MPDN, but I'm happy to help make porting easier.
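To make the basic flow described above easier to follow when porting (take an `n * n` neighborhood, derive the `(angle, strength, coherence)` key from its gradients, look up a kernel, convolve), here is a rough CPU-side sketch in plain Python. Several things in it are assumptions rather than RAVU's actual code: the key is derived from the eigen-decomposition of a 2x2 gradient structure tensor in the RAISR style, the bin count and thresholds are made-up placeholders, the kernels are box filters instead of trained weights, and a single output value is produced per patch, whereas the real shaders work on textures and `vec4/float4` weights.

```python
import math


def patch_size(radius, lite=False):
    """n = 2*r for ravu, n = 2*r - 1 for ravu-lite."""
    return 2 * radius - 1 if lite else 2 * radius


def bucket(value, thresholds):
    """Index of the quantization bin that `value` falls into."""
    return sum(1 for t in thresholds if value >= t)


def structure_key(patch, angle_bins=24,
                  strength_thresholds=(0.05, 0.3),    # hypothetical values
                  coherence_thresholds=(0.25, 0.5)):  # hypothetical values
    """Quantize the gradient statistics of an n x n patch into a discrete key."""
    n = len(patch)
    a = b = d = 0.0  # entries of the 2x2 structure tensor [[a, b], [b, d]]
    for y in range(1, n - 1):
        for x in range(1, n - 1):
            gx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0  # central differences
            gy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0
            a += gx * gx
            b += gx * gy
            d += gy * gy
    # Closed-form eigenvalues of the symmetric 2x2 tensor (clamped at zero).
    root = math.sqrt((a - d) ** 2 + 4.0 * b * b)
    s1 = math.sqrt(max((a + d + root) / 2.0, 0.0))
    s2 = math.sqrt(max((a + d - root) / 2.0, 0.0))
    # Orientation of the dominant gradient direction, folded into [0, pi).
    theta = (0.5 * math.atan2(2.0 * b, a - d)) % math.pi
    angle = int(theta / math.pi * angle_bins) % angle_bins
    strength = bucket(s1, strength_thresholds)
    coherence = bucket((s1 - s2) / (s1 + s2) if s1 + s2 > 0.0 else 0.0,
                       coherence_thresholds)
    return angle, strength, coherence


def predict(patch, kernels):
    """Look up the kernel for the patch's key and convolve it with the patch."""
    kernel = kernels[structure_key(patch)]
    return sum(w * p for wrow, prow in zip(kernel, patch)
               for w, p in zip(wrow, prow))


# Demo with placeholder weights: every key maps to a plain box filter.
n = patch_size(2)
box = [[1.0 / (n * n)] * n for _ in range(n)]
kernels = {(a, s, c): box for a in range(24) for s in range(3) for c in range(3)}
vertical_edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(n)]
print(structure_key(vertical_edge), predict(vertical_edge, kernels))
```

In a real port, the dictionary lookup would be replaced by indexing into the trained weight array, and the loops would be unrolled into texture fetches, which is what the generated coordinates in the sample shaders are for.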