The vmaf v2.1.0 in ffmpeg is not faster than v1.3.14

Frankziyi commented 3 years ago

I used vmaf with ffmpeg, setting n_threads=4 for both v2.1.0 and v1.3.14. Sadly, v2.1.0 is not faster than v1.3.14. I also checked the standalone vmaf command-line tool and found no problem there: it was faster than v2.0.0, so I think libvmaf.a should be OK. Has anyone else had this problem?

li-zhi commented 3 years ago

Is this reproducible? Could you share your command lines for both versions?

EDIT: could you share your system as well?

kylophone commented 3 years ago

Perhaps a repeat of https://github.com/Netflix/vmaf/issues/771? If you are on x86, have you compiled with AVX2/AVX-512?

Frankziyi commented 3 years ago

> Is this reproducible? Could you share your command lines for both versions?
>
> EDIT: could you share your system as well?

System: x86_64 Linux.

Command line:

ffmpeg -y -r 30 -i input.mp4 -i input.mp4 -an -vsync 0 -pix_fmt yuv420p -filter_complex "[0:v]setpts=0[s0];[1:v]setpts=0[d0];[s0][d0]libvmaf=model_path=./vmaf_v0.6.1.json:log_path=yy.log:n_threads=4:n_subsample=4:shortest=1" -f null -

Both versions use this command line; the only differences are the ffmpeg and vmaf model versions.

kylophone commented 3 years ago

Does your build have AVX2 and/or AVX-512 enabled?

Frankziyi commented 3 years ago

I compared v1.3.14 with v2.1.0 using equivalent command lines (the v2.1.0 vmaf tool first, then the v1.3.14 vmafossexec):

time vmaf --reference out2.yuv --distorted out2.yuv -p 420 -w 480 -h 848 --threads 4 --subsample 4 -b 8 --model version=vmaf_v0.6.1
VMAF version 2e1b24d
1800 frames ⡃⠀ 96.70 FPS
vmaf_v0.6.1: 99.986547

real    0m4.445s
user    0m17.888s
sys     0m0.784s

time wrapper/vmafossexec yuv420p 480 848 ../out2.yuv ../out2.yuv model/vmaf_v0.6.1.pkl --thread 4 --subsample 4 
Start calculating VMAF score...
Exec FPS: 332.059510
VMAF score = 99.992483

real    0m5.431s
user    1m41.660s
sys     0m1.588s

I am really confused :(

CrypticSignal commented 3 years ago

> Does your build have AVX2 and/or AVX-512 enabled?

If you are using an FFmpeg binary that was built with the libvmaf filter enabled, e.g. from here, is it possible to check whether AVX2 and/or AVX-512 is enabled?

Frankziyi commented 3 years ago

> > Does your build have AVX2 and/or AVX-512 enabled?
>
> If you are using an FFmpeg binary that was built with the libvmaf filter enabled, e.g. from here, is it possible to check whether AVX2 and/or AVX-512 is enabled?

Thanks for your reply. Is this for Windows? Unfortunately, I work on Linux. I also tested v2.0.0 again:

time vmaf --reference ../vmaf/out2.yuv --distorted ../vmaf/out2.yuv -p 420 -w 480 -h 848 --threads 4 --subsample 4 -b 8 --model version=vmaf_v0.6.1 
VMAF version 9db0c56
1800 frames ⡃⠀ 61.39 FPS
vmaf_v0.6.1: 99.986547

real    0m7.294s
user    0m28.704s
sys     0m0.732s

v2.1.0 is about 1.6x faster than v2.0.0 (96.70 vs. 61.39 FPS), but v1.3.14 is still faster than v2.0.0. Is there something wrong?

CrypticSignal commented 3 years ago

@Frankziyi Just to clarify, my message was not directed towards you. Rather I was asking @kylophone if I can check whether AVX2 and/or AVX-512 is enabled, as I downloaded an FFmpeg binary from the page that I linked.

Frankziyi commented 3 years ago

I got the results above when built with enable_asm=true and enable_avx512=false. Tests on both a Xeon machine (Ubuntu 16.04) and a Core machine (Ubuntu 18.04) led to the same conclusion: v1.3.14 runs at nearly the same speed as v2.0.0.

kylophone commented 3 years ago

> I got the results above when built with enable_asm=true and enable_avx512=false. Tests on both a Xeon machine (Ubuntu 16.04) and a Core machine (Ubuntu 18.04) led to the same conclusion: v1.3.14 runs at nearly the same speed as v2.0.0.

Can you please post your command-lines?

kylophone commented 3 years ago

> > I got the results above when built with enable_asm=true and enable_avx512=false. Tests on both a Xeon machine (Ubuntu 16.04) and a Core machine (Ubuntu 18.04) led to the same conclusion: v1.3.14 runs at nearly the same speed as v2.0.0.
>
> Can you please post your command-lines?

Oh, I see them here: https://github.com/Netflix/vmaf/issues/800#issuecomment-758465432. Your numbers are pasted below.

time vmaf --reference out2.yuv --distorted out2.yuv -p 420 -w 480 -h 848 --threads 4 --subsample 4 -b 8 --model version=vmaf_v0.6.1
VMAF version 2e1b24d
1800 frames ⡃⠀ 96.70 FPS
vmaf_v0.6.1: 99.986547

real    0m4.445s
user    0m17.888s
sys     0m0.784s

time wrapper/vmafossexec yuv420p 480 848 ../out2.yuv ../out2.yuv model/vmaf_v0.6.1.pkl --thread 4 --subsample 4 
Start calculating VMAF score...
Exec FPS: 332.059510
VMAF score = 99.992483

real    0m5.431s
user    1m41.660s
sys     0m1.588s

4.4 / 5.4 = 81% real time
17.8 / 101.6 = 17.5% CPU time
0.78 / 1.58 = 49.3% sys time

These are sizable reductions. I've never actually run benchmarks with subsample on, and benchmarking has so far always been done single threaded. When we were doing this work, we went after a reduction of CPU time, not latency. Threading is pretty good in this library, but there are things that could probably be done to improve latency during a multi-threaded run, especially during runs using many threads (i.e. >4). Your sample looks to be pretty low resolution and not too long (4 or 5 seconds total processing). My guess is gains will be a lot more measurable if libvmaf gets more pixels (resolution and/or length).
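The ratios above can be reproduced from the raw `time` output. The following is a minimal Python sketch, not part of libvmaf; `parse_time` and `ratios` are hypothetical helper names, and the figures are copied from the two runs quoted in this comment.

```python
import re

def parse_time(value: str) -> float:
    """Convert a bash `time` field such as '1m41.660s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognized time format: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

def ratios(new: dict, old: dict) -> dict:
    """Return new/old as a percentage for each timing field."""
    return {key: 100.0 * parse_time(new[key]) / parse_time(old[key]) for key in new}

# Timings copied from the runs above (vmaf v2.1.0 vs. vmafossexec v1.3.14).
v2_1_0 = {"real": "0m4.445s", "user": "0m17.888s", "sys": "0m0.784s"}
v1_3_14 = {"real": "0m5.431s", "user": "1m41.660s", "sys": "0m1.588s"}

result = ratios(v2_1_0, v1_3_14)
for field, pct in result.items():
    print(f"{field}: {pct:.1f}% of the v1.3.14 time")
```

Using the unrounded inputs gives percentages within a fraction of a point of the rounded figures above. The large gap between the `user` and `real` ratios is what you would expect when the work went into CPU time rather than latency.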

wabiloo commented 2 years ago

Disappointingly, I am seeing similar results for ffmpeg+libvmaf with version 1.3.14 vs 2.3.0. If anything, 2.3.0 is slower. This is on AWS EC2 instances of type M which have AVX-512 chipsets.

I've read this thread and am trying to determine how to compile ffmpeg / libvmaf for AVX-512, but it is not entirely clear to me how to do it. Can someone explain how to go about it?

Is it just a question of adding -Denable_avx512=true to the meson build --buildtype release line in the compile instructions at https://github.com/Netflix/vmaf/tree/master/libvmaf?

kylophone commented 2 years ago

> Disappointingly, I am seeing similar results for ffmpeg+libvmaf with version 1.3.14 vs 2.3.0. If anything, 2.3.0 is slower. This is on AWS EC2 instances of type M which have AVX-512 chipsets.

See https://github.com/Netflix/vmaf/issues/800#issuecomment-771844884. You may be comparing single-threaded execution to multithreaded execution. You'll need to match the number of threads you are using: libvmaf v2.* defaults to a single thread, while the older libvmaf defaulted to using every available thread.

> I've read this thread and am trying to determine how to compile ffmpeg / libvmaf for AVX-512, but it is not entirely clear to me how to do it. Can someone explain how to go about it?
>
> Is it just a question of adding -Denable_avx512=true to the meson build --buildtype release line in the compile instructions at https://github.com/Netflix/vmaf/tree/master/libvmaf?

Yes, that will enable the AVX-512 optimizations.

wabiloo commented 2 years ago

> See https://github.com/Netflix/vmaf/issues/800#issuecomment-771844884. You may be comparing single-threaded execution to multithreaded execution. You'll need to match the number of threads you are using: libvmaf v2.* defaults to a single thread, while the older libvmaf defaulted to using every available thread.

I believe I'm comparing like for like. With libvmaf 2.*, I'm setting n_threads to 40 (the value returned by Python's os.cpu_count()).

> > I've read this thread and am trying to determine how to compile ffmpeg / libvmaf for AVX-512, but it is not entirely clear to me how to do it. Can someone explain how to go about it? Is it just a question of adding -Denable_avx512=true to the meson build --buildtype release line in the compile instructions at https://github.com/Netflix/vmaf/tree/master/libvmaf?
>
> Yes, that will enable the AVX-512 optimizations.

What would happen if I was compiling with enable_avx512 but running on a different instance type that doesn't have that chipset?
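Since the default thread count differs between libvmaf versions, one way to keep a comparison like for like is to pass the thread count explicitly in every run. The following is a minimal Python sketch under stated assumptions: the filter options (model_path, log_path, n_threads, n_subsample, shortest) are the ones from the ffmpeg command posted earlier in this thread and may differ in other ffmpeg/libvmaf versions, and `build_libvmaf_filter` and `build_ffmpeg_args` are hypothetical helper names.

```python
import os
import shlex

def build_libvmaf_filter(model_path, log_path, n_threads=None, n_subsample=1):
    """Return a -filter_complex graph with an explicit libvmaf thread count."""
    if n_threads is None:
        # os.cpu_count() can return None, so fall back to a single thread.
        n_threads = os.cpu_count() or 1
    options = {
        "model_path": model_path,
        "log_path": log_path,
        "n_threads": n_threads,
        "n_subsample": n_subsample,
        "shortest": 1,
    }
    option_str = ":".join(f"{key}={value}" for key, value in options.items())
    return "[0:v]setpts=0[s0];[1:v]setpts=0[d0];[s0][d0]libvmaf=" + option_str

def build_ffmpeg_args(first_input, second_input, filter_graph):
    """Assemble the argument list used in this thread (it is not executed here)."""
    return [
        "ffmpeg", "-y", "-r", "30",
        "-i", first_input, "-i", second_input,
        "-an", "-vsync", "0", "-pix_fmt", "yuv420p",
        "-filter_complex", filter_graph,
        "-f", "null", "-",
    ]

graph = build_libvmaf_filter("./vmaf_v0.6.1.json", "yy.log", n_threads=4, n_subsample=4)
args = build_ffmpeg_args("input.mp4", "input.mp4", graph)
print(shlex.join(args))
```

Passing the list to `subprocess.run(args)` avoids shell quoting issues around the filter graph. Using the same explicit `n_threads` value for both library versions removes the default as a variable.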

kylophone commented 2 years ago

> What would happen if I was compiling with enable_avx512 but running on a different instance type that doesn't have that chipset?

There is runtime detection, so you will fall back to either AVX2 or C.
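To see which fallback a given Linux machine is likely to hit, you can check the instruction sets the CPU advertises. The following is a rough Python sketch that reads the `flags` line of /proc/cpuinfo. It only reports what the CPU exposes, not what a particular libvmaf build has enabled or which code path its runtime detection selects. Treating `avx512f` as the indicator for AVX-512 is an assumption, and `cpu_flags` and `simd_support` are hypothetical helper names.

```python
from pathlib import Path

def cpu_flags(cpuinfo_text: str) -> set:
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()

def simd_support(flags: set) -> dict:
    """Report whether AVX2 and AVX-512 Foundation are advertised by the CPU."""
    # Assumption: 'avx512f' is used as a proxy for AVX-512 support in general.
    return {"avx2": "avx2" in flags, "avx512": "avx512f" in flags}

if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():  # Linux only
        print(simd_support(cpu_flags(cpuinfo.read_text())))
    else:
        print("/proc/cpuinfo not found; this sketch only works on Linux")
```

If `avx512` is reported as false on an instance, a build with enable_avx512=true would be expected to take the AVX2 or C path there, per the reply above.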

wabiloo commented 2 years ago

Great. In that case, can I ask why it's not the default in the meson config?

nilfm commented 11 months ago

Hi @wabiloo, sorry for the extremely delayed reply - we recently merged #1206 enabling AVX-512 by default. We had some concerns in the past about results not exactly matching between the different codepaths but I believe it's all resolved now after #1200.

Closing this thread since I believe this was the only open question, but please feel free to open a new issue with further questions or if there are problems with the default change.
