Closed Frankziyi closed 11 months ago
Is this reproducible? Could you share your command lines for both versions?
EDIT: could you share your system as well?
Perhaps a repeat of https://github.com/Netflix/vmaf/issues/771? If you are on x86, have you compiled with AVX2/AVX-512?
System: x86_64 Linux.
Command line:
ffmpeg -y -r 30 -i input.mp4 -i input.mp4 -an -vsync 0 -pix_fmt yuv420p -filter_complex "[0:v]setpts=0[s0];[1:v]setpts=0[d0];[s0][d0]libvmaf="model_path=./vmaf_v0.6.1.json:log_path=yy.log:n_threads=4:n_subsample=4:shortest=1"" -f null -
Both versions use this command line; the only differences are the ffmpeg and vmaf model versions.
Does your build have AVX2 and/or AVX-512 enabled?
I compared v1.3.14 with v2.1.0 for the same input command lines:
time vmaf --reference out2.yuv --distorted out2.yuv -p 420 -w 480 -h 848 --threads 4 --subsample 4 -b 8 --model version=vmaf_v0.6.1
VMAF version 2e1b24d
1800 frames ⡃⠀ 96.70 FPS
vmaf_v0.6.1: 99.986547
real 0m4.445s
user 0m17.888s
sys 0m0.784s
time wrapper/vmafossexec yuv420p 480 848 ../out2.yuv ../out2.yuv model/vmaf_v0.6.1.pkl --thread 4 --subsample 4
Start calculating VMAF score...
Exec FPS: 332.059510
VMAF score = 99.992483
real 0m5.431s
user 1m41.660s
sys 0m1.588s
I am really confused :(
If you are using an FFmpeg binary that was built with the libvmaf filter enabled, e.g. from here, is it possible to check whether AVX2 and/or AVX-512 is enabled?
Thanks for your reply. Is this for Windows? Sadly, I work on Linux. I also tested v2.0.0:
time vmaf --reference ../vmaf/out2.yuv --distorted ../vmaf/out2.yuv -p 420 -w 480 -h 848 --threads 4 --subsample 4 -b 8 --model version=vmaf_v0.6.1
VMAF version 9db0c56
1800 frames ⡃⠀ 61.39 FPS
vmaf_v0.6.1: 99.986547
real 0m7.294s
user 0m28.704s
sys 0m0.732s
v2.1.0 is roughly 1.6x faster than v2.0.0 (4.4 s vs 7.3 s real time), but v1.3.14 (5.4 s) is still faster than v2.0.0. Is there something wrong?
@Frankziyi Just to clarify, my message was not directed towards you. Rather I was asking @kylophone if I can check whether AVX2 and/or AVX-512 is enabled, as I downloaded an FFmpeg binary from the page that I linked.
I got the results above with a build configured with enable_asm=true and enable_avx512=false. Tests on both a Xeon machine (Ubuntu 16.04) and a Core machine (Ubuntu 18.04) reached the same conclusion: v1.3.14 runs at nearly the same speed as v2.0.0.
Can you please post your command-lines?
Oh, I see them here: https://github.com/Netflix/vmaf/issues/800#issuecomment-758465432. Your numbers are pasted below.
time vmaf --reference out2.yuv --distorted out2.yuv -p 420 -w 480 -h 848 --threads 4 --subsample 4 -b 8 --model version=vmaf_v0.6.1
VMAF version 2e1b24d
1800 frames ⡃⠀ 96.70 FPS
vmaf_v0.6.1: 99.986547
real 0m4.445s
user 0m17.888s
sys 0m0.784s
time wrapper/vmafossexec yuv420p 480 848 ../out2.yuv ../out2.yuv model/vmaf_v0.6.1.pkl --thread 4 --subsample 4
Start calculating VMAF score...
Exec FPS: 332.059510
VMAF score = 99.992483
real 0m5.431s
user 1m41.660s
sys 0m1.588s
4.4 / 5.4 = 81% real time
17.8 / 101.6 = 17.5% cpu time
0.78 / 1.58 = 49.3% sys time
These are sizable reductions. I've never actually run benchmarks with subsample on, and benchmarking has so far always been done single threaded. When we were doing this work, we went after a reduction of CPU time, not latency. Threading is pretty good in this library, but there are things that could probably be done to improve latency during a multi-threaded run, especially during runs using many threads (i.e. >4). Your sample looks to be pretty low resolution and not too long (4 or 5 seconds total processing). My guess is gains will be a lot more measurable if libvmaf gets more pixels (resolution and/or length).
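The ratios in the comment above can be recomputed from the unrounded `time` figures. This is only a sketch: `ratio_pct` is a hypothetical helper, and the results differ slightly from the rounded percentages quoted because the full-precision timings are used (1m41.660s is taken as 101.660 s).

```shell
#!/bin/sh
# Hypothetical helper: print new/old as a percentage with one decimal place.
ratio_pct() {
    awk -v new="$1" -v old="$2" 'BEGIN { printf "%.1f%%\n", 100 * new / old }'
}

# v2.1.0 (vmaf) timings divided by v1.3.14 (vmafossexec) timings, in seconds.
echo "real: $(ratio_pct 4.445 5.431)"      # real: 81.8%
echo "user: $(ratio_pct 17.888 101.660)"   # user: 17.6%
echo "sys:  $(ratio_pct 0.784 1.588)"      # sys:  49.4%
```

The user (CPU) time ratio is the one that reflects the optimization work described above; the real-time ratio also depends on how well the four threads are kept busy.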
Disappointingly, I am seeing similar results for ffmpeg+libvmaf with version 1.3.14 vs 2.3.0. If anything, 2.3.0 is slower. This is on AWS EC2 instances of type M which have AVX-512 chipsets.
I've read this thread and am trying to determine how to compile ffmpeg / libvmaf for AVX-512, but it is not entirely clear to me how to do it. Can someone indicate how to go about it?
Is it just a question of adding -Denable_avx512=true to the "meson build --buildtype release" line in the compile instructions at https://github.com/Netflix/vmaf/tree/master/libvmaf?
See https://github.com/Netflix/vmaf/issues/800#issuecomment-771844884. You may be comparing single-threaded execution to multithreaded execution. You'll need to match the number of threads you are using: libvmaf v2.* defaults to a single thread, while the older libvmaf defaulted to using every available thread.
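One way to rule out a threading mismatch is to pass the same explicit thread count to both builds, so that neither relies on its default. A minimal sketch, reusing only the filter options from the ffmpeg command posted earlier in this thread; `vmaf_filter` is a hypothetical helper, and the model file names are the ones mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: build the libvmaf filter string with an explicit
# thread count, so the comparison does not depend on either version's default.
vmaf_filter() {
    threads="$1"
    model="$2"
    printf 'libvmaf=model_path=%s:n_threads=%s:n_subsample=4:shortest=1\n' \
        "$model" "$threads"
}

# Same thread count for both runs; only the model file differs.
vmaf_filter 4 ./vmaf_v0.6.1.json   # for the libvmaf v2.x build
vmaf_filter 4 ./vmaf_v0.6.1.pkl    # for the libvmaf v1.3.14 build
```

The printed string would take the place of the libvmaf part of the -filter_complex argument in the original command.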
Yes, that will enable the AVX-512 optimizations.
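Put together, the build step discussed above would look roughly like this. It assumes the commands are run from the libvmaf directory of the vmaf source tree and that the ninja step matches the linked compile instructions.

```shell
# Configure a release build with the AVX-512 code paths compiled in.
meson build --buildtype release -Denable_avx512=true
# Compile.
ninja -vC build
```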
I believe I'm comparing like for like. With libvmaf 2.*, I'm setting n_threads to 40 (the value returned by Python's os.cpu_count()).
What would happen if I was compiling with enable_avx512 but running on a different instance type that doesn't have that chipset?
There is runtime detection, so you will fall back to either AVX2 or C.
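To see which of these paths a given Linux machine could take, the CPU flags can be inspected directly. This is an illustration only, not libvmaf's own detection code: `simd_level` is a hypothetical helper, and the `avx512f` flag is used as a rough proxy, since the library may require additional AVX-512 subsets.

```shell
#!/bin/sh
# Hypothetical helper: report the widest of AVX-512 / AVX2 advertised in a
# cpuinfo-style file (defaults to /proc/cpuinfo), or "c" if neither is found.
simd_level() {
    cpuinfo="${1:-/proc/cpuinfo}"
    if grep -qw 'avx512f' "$cpuinfo" 2>/dev/null; then
        echo "avx512"
    elif grep -qw 'avx2' "$cpuinfo" 2>/dev/null; then
        echo "avx2"
    else
        echo "c"
    fi
}

simd_level
```

Running this on each instance type shows whether an AVX-512 build would actually use its AVX-512 code there or fall back.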
Great. In that case, can I ask why it's not the default in the meson config?
Hi @wabiloo, sorry for the extremely delayed reply - we recently merged #1206 enabling AVX-512 by default. We had some concerns in the past about results not exactly matching between the different codepaths but I believe it's all resolved now after #1200.
Closing this thread since I believe this was the only open question, but please feel free to open a new issue with further questions or if there are problems with the default change.
I used vmaf through ffmpeg, setting n_threads=4 for both v2.1.0 and v1.3.14. Sadly, v2.1.0 is not faster than v1.3.14. I also checked with the standalone vmaf command and found no problem there: it was faster than v2.0.0, so I think libvmaf.a should be OK. Has anyone else had this problem?