Jaded-Encoding-Thaumaturgy / vs-denoise

VapourSynth denoising, regression, and motion compensation functions
MIT License

Investigate under what circumstances FFT3D can be replaced with DFTTest #71

Closed NSQY closed 11 months ago

NSQY commented 1 year ago

FFT3D is prone to all kinds of artifacting and is almost certainly slower than https://github.com/AmusementClub/vs-dfttest2

FFT3D: https://slow.pics/c/zgSUR8pL DFTTest: https://slow.pics/c/8UjslLye

Setsugennoao commented 1 year ago

For what functions, exactly?

NSQY commented 1 year ago

> For what functions, exactly?

Many functions that use MVTools will use some kind of prefilter before calculating motion vectors, FFT3D was a popular choice. Some of the advanced features may not have been used and/or could be replaced with other filters.

adworacz commented 1 year ago

Another thing I'll note is that I find modern versions of dfttest (like dfttest2) are pretty fast, often 2x faster than FFT3D. Faster speeds with higher quality is a win-win.

But the sigma values for fft3dfilter vs dfttest are quite different for similar levels of smoothing. For instance, sigma 8-9 in fft3dfilter is roughly 100 in dfttest. There may be a more accurate formula for converting between them, but it's certainly not 1:1.

Another thing worth noting: bit depth, sigma, and the specific fft3dfilter implementation all interact.

As part of my work on the TemporalDegrain2 port, I noticed that neo_fft3d seems to scale sigma based on bit depth, meaning that sigma=10 looks the same for 8-bit and 16-bit content. However, the "old" fft3dfilter (not neo) does not scale sigma with bit depth, so it needs to be done manually: `sigma * (1 << (bit_depth - 8))`. So for sigma=10 to have the same effect on 16-bit content, it works out to 10 * 256.
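That scaling can be sketched as a small helper (`scale_sigma` is a hypothetical name for illustration, not part of any fft3dfilter API):

```python
def scale_sigma(sigma: float, bit_depth: int) -> float:
    """Scale an 8-bit-referenced sigma for higher bit depths, as the
    old fft3dfilter requires (neo_fft3d appears to do this internally)."""
    # 1 << (bit_depth - 8) is the ratio between the value ranges,
    # e.g. 256 when going from 8-bit to 16-bit content.
    return sigma * (1 << (bit_depth - 8))
```

For example, `scale_sigma(10, 16)` gives 2560, matching the 10 * 256 figure above.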

I'll be compensating for this in my TemporalDegrain2 port (which is complete, but just needs some small refactoring), but figured I'd mention it here as well.

NSQY commented 1 year ago

> But, the sigma values for fft3dfilter vs dfttest are quite different for similar levels of smoothing. For instance, sigma 8-9 in fft3dfilter is ~100 in dfttest. There might be a more accurate math for calculating it, but it's certainly not 1:1.

This is unfortunately quite common between various noise reduction filters (though, really, only a select few filters are commonly being used). I suppose in theory we could measure some samples with PSNR or something and get a rough approximation of the offset between sigmas.
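Measuring that offset could start from a plain PSNR helper like this (a pure-Python sketch; a real comparison would run it over denoised frame samples at various sigmas):

```python
import math

def psnr(a: list[float], b: list[float], peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio between two equal-length sample lists."""
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(peak ** 2 / mse)
```

One could then sweep dfttest's sigma until the PSNR against a reference fft3dfilter output is maximized, giving a rough mapping between the two sigma scales.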

Generally I find DFTTest favorable overall: it has a good speed:quality ratio and few downsides compared to the other options.

- NLMeans is slower and will ghost easily at high strengths (this can be offset with mvtools compensation, but that's just more CPU cycles).
- FFT3D is slower, as you have stated, and, aside from perhaps some advanced usage, offers few benefits.
- BM3D has better quality than all of the aforementioned, but at a significant compute cost; ridiculous for prefiltering IMO.

NSQY commented 1 year ago

@adworacz Do you happen to have any insight in regards to the origin of the sigma values being used for fft3dfilter? https://github.com/Irrational-Encoding-Wizardry/vs-denoise/blob/master/vsdenoise/limit.py#L98-L103

I did some digging, the earliest usage of these values are from a user named g-force: https://forum.doom9.org/showthread.php?p=1123324#post1123324

By 2009, these values were incorporated into the AVS function: https://pastebin.com/raw/0n7Xg4yK

You can track down some of their other scripts on the originaltrilogy forum: https://originaltrilogy.com/topic/GOUT-image-stabilization-Released/id/9038/page/1#309478

In older versions, such as V4 and V3.17, generic values of `sigma2=sigma*.75, sigma3=sigma*.5, sigma4=sigma*.25` were used. At some point these were changed to the values we know today. If I am not mistaken, this was merely an arbitrary decision to use the halfway point between the two.

```python
>>> [n / 4 for n in [1, 2, 3]]
[0.25, 0.5, 0.75]

>>> [n / 8 for n in [2, 3, 5]]
[0.25, 0.375, 0.625]
```
adworacz commented 1 year ago

I have limited insight. What I do know is that the sigma2/sigma3/sigma4 parameters correspond to progressively lower frequencies. Since video noise is usually high frequency, it makes sense to apply the highest sigma at the highest frequencies and scale it down for lower ones.

This is the same approach we take with dfttest with the slocation parameter, although that parameter is a lot more powerful.
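The frequency-dependent sigma idea can be sketched as a piecewise-linear curve over normalized frequency, similar in spirit to dfttest's flat `[location, sigma, ...]` slocation list (a pure-Python illustration with a hypothetical `sigma_at` helper, not dfttest's actual interpolation code):

```python
def sigma_at(freq: float, slocation: list[float]) -> float:
    """Linearly interpolate sigma at a normalized frequency in [0, 1],
    given a flat [loc0, sigma0, loc1, sigma1, ...] list."""
    pts = list(zip(slocation[0::2], slocation[1::2]))
    for (f0, s0), (f1, s1) in zip(pts, pts[1:]):
        if f0 <= freq <= f1:
            t = (freq - f0) / (f1 - f0)
            return s0 + t * (s1 - s0)
    return pts[-1][1]  # past the last point: hold the final sigma

# Low sigma at low frequencies, ramping up toward high frequencies,
# where most video noise lives:
curve = [0.0, 4.0, 0.5, 16.0, 1.0, 64.0]
```

Here `sigma_at(0.25, curve)` lands halfway between 4 and 16, i.e. 10, while the full-strength 64 only applies at the very highest frequencies.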

Setsugennoao commented 11 months ago

Stale