Jaded-Encoding-Thaumaturgy / vs-aa

VapourSynth anti aliasing and scaling functions
MIT License

Eedi3: Set opt=3 by default if `mclip` used #21

LightArrowsEXE closed this 1 month ago

LightArrowsEXE commented 1 month ago

Based on testing from the Discord, opt=3 seems to give a reliable speed boost over opt=0 when mclip is used, regardless of CPU. Furthermore, avx512 seems to ALWAYS perform worse than avx. The latter is not part of this PR, but is worth keeping in mind for future PRs.
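As a rough illustration of the behaviour the title describes, the defaulting rule could look something like the sketch below. `resolve_opt` is a hypothetical helper written for this description, not the actual vs-aa code; the only things taken from the testing are that opt=0 means auto and opt=3 means sse4.1.

```python
from typing import Optional


def resolve_opt(opt: Optional[int] = None, mclip: Optional[object] = None) -> int:
    """Pick the Eedi3 `opt` value (hypothetical helper, not vs-aa's real code).

    An explicit `opt` from the caller always wins. Otherwise fall back to
    3 (sse4.1) when a mask clip is supplied, and to 0 (auto) when it is not.
    """
    if opt is not None:
        # Respect whatever the user asked for, including an explicit 0.
        return opt

    return 3 if mclip is not None else 0
```

Keeping `None` as the "not set" value means a user can still force opt=0 together with mclip if auto-detection happens to be faster on their machine.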

Zander (7950x):

opt=0 - encoded 200 frames in 0:00:35.65 (5.61 fps) (auto)
opt=1 - encoded 200 frames in 0:00:40.24 (4.97 fps) (c)
opt=2 - encoded 200 frames in 0:00:27.95 (7.15 fps) (sse2)
opt=3 - encoded 200 frames in 0:00:26.53 (7.54 fps) (sse4.1)
opt=4 - encoded 200 frames in 0:00:31.01 (6.45 fps) (avx)
opt=5 - encoded 200 frames in 0:00:35.67 (5.61 fps) (avx512)
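To make the gap easier to read, here is a small sketch that turns the fps figures above into speedups relative to opt=0. The numbers are copied from Zander's run; the names `fps_by_opt` and `speedup_over_auto` are made up for this example.

```python
# fps per opt value, copied from the 7950x run above.
fps_by_opt = {0: 5.61, 1: 4.97, 2: 7.15, 3: 7.54, 4: 6.45, 5: 5.61}


def speedup_over_auto(fps: dict[int, float]) -> dict[int, float]:
    """Return each opt's fps divided by the opt=0 (auto) fps."""
    baseline = fps[0]
    return {opt: value / baseline for opt, value in fps.items()}


speedups = speedup_over_auto(fps_by_opt)
fastest_opt = max(fps_by_opt, key=fps_by_opt.get)

for opt, ratio in sorted(speedups.items()):
    print(f"opt={opt}: {ratio:.2f}x")
print(f"fastest: opt={fastest_opt}")
```

On these figures opt=3 is the fastest setting, at roughly 1.34x the opt=0 throughput.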

sinc:

Format     Function                  Parameters                     Threads  Avg Time (s)    Efficiency   FPS        CPU Usage (%)
----------------------------------------------------------------------------------------------------------------------------------
GRAY16     basedaa                   opt=0                          1        32.2918         1.0000       6.19       25.80        
GRAY16     basedaa                   opt=1                          1        30.9098         1.0000       6.47       19.50        
GRAY16     basedaa                   opt=2                          1        27.6194         1.0000       7.24       15.70        
GRAY16     basedaa                   opt=3                          1        26.9599         1.0000       7.42       15.20        
GRAY16     basedaa                   opt=4                          1        26.8888         1.0000       7.44       22.70        
GRAY16     basedaa                   opt=5                          1        30.3770         1.0000       6.58       18.20        
-----------------------------------------------------------------------------------------------------------------------------------

noiy (7700x - 32GB - Linux 6.10.7):

Raw Eedi3.aa (same settings as based_aa) 1080p vspreview Benchmark:
opt=5 - 200 frames in 00:09.428 21,2116 fps, 500 frames in 00:23.313 21,4466 fps
opt=4 - 200 frames in 00:08.598 23,2611 fps, 500 frames in 00:21.762 22,9749 fps
opt=3 - 200 frames in 00:13.884 14,4043 fps, 500 frames in 00:34.434 14,5204 fps
opt=2 - 200 frames in 00:13.866 14,4231 fps, 500 frames in 00:34.359 14,5368 fps
based_aa(jpbd16, supersampler=Gaussian) vspreview Benchmark:
opt=5 - 500 frames in 01:12.889 6,8597 fps
opt=4 - 500 frames in 01:06.161 7,5573 fps
opt=3 - 500 frames in 00:47.940 10,4296 fps, 750 frames in 01:16.386 9,8185 fps
opt=2 - 500 frames in 00:47.701 10,4819 fps, 750 frames in 01:15.825 9,8911 fps

smol (i9-13900k (-100mv undervolt, ICCMax=307A PL1=165 PL2=253, new bios)):

250 frames

opt=1/c | FPS: 9.9963
opt=2/sse2 | FPS: 10.1194
opt=3/sse4.1 | FPS: 10.1827
opt=4/avx | FPS: 8.2213 <- picked by opt=0
opt=5/avx512 | unavailable on alder lake

basic: (first is opt=3, second is opt=0) [image]