Netflix / vmaf

Perceptual video quality assessment based on multi-method fusion.

Strange issue observed calculating VMAF metric values using libvmaf in CUDA mode #1180

Open avstrakhov opened 1 year ago

avstrakhov commented 1 year ago

I'm developing a software library and have been using libvmaf successfully to calculate the VMAF metric. Unlike the vmaf tool, my code requests the VMAF score for frame N after submitting the data for frame N+1, by calling vmaf_score_at_index(). This works fine. Sample output in CPU mode:

$ ~/yama/build/calc_metrics -n 10 -m vmaf -w 832 -h 480 ~/0/PartyScene_832x480_50.yuv ~/0/PartyScene_832x480.x264.VS.900.264.yuv
VMAF;VMAF_avg;
91.549303;91.549303;
91.563187;91.556245;
91.568763;91.560418;
91.072508;91.438440;
91.768058;91.504364;
90.827978;91.391633;
93.576076;91.703696;
89.840477;91.470794;
90.957639;91.413777;
90.588053;91.331204;

I recently added code to enable a great new libvmaf feature: CUDA GPU offloading. Using the vmaf tool sources as an example, I added only a call to vmaf_cuda_init() and ran into a strange issue:

$ ~/yama/build/calc_metrics -n 10 --vmaf-cuda -m vmaf -w 832 -h 480 ~/0/PartyScene_832x480_50.yuv ~/0/PartyScene_832x480.x264.VS.900.264.yuv
VMAF;VMAF_avg;
extract_fex_cuda() finished for frame 0 // it's output of printf added immediately before return from extract_fex_cuda()
extract_fex_cuda() finished for frame 1
91.549303;91.549303;
extract_fex_cuda() finished for frame 2
libvmaf ERROR vmaf_predict_score_at_index(): no feature 'VMAF_integer_feature_motion2_score' at index 1

As a test, I added a 1 ms delay before the call to vmaf_score_at_index() in my code, and all results were calculated correctly, without errors.

$ ~/yama/build/calc_metrics -n 10 --vmaf-cuda -c ccode -m vmaf -w 832 -h 480 ~/0/PartyScene_832x480_50.yuv ~/0/PartyScene_832x480.x264.VS.900.264.yuv
VMAF;VMAF_avg;
extract_fex_cuda finished for frame 0
extract_fex_cuda finished for frame 1
91.549303;91.549303;
extract_fex_cuda finished for frame 2
91.563187;91.556245;
extract_fex_cuda finished for frame 3
91.568763;91.560418;
extract_fex_cuda finished for frame 4
91.072508;91.438440;
extract_fex_cuda finished for frame 5
91.768058;91.504364;
extract_fex_cuda finished for frame 6
90.827978;91.391633;
extract_fex_cuda finished for frame 7
93.576076;91.703696;
extract_fex_cuda finished for frame 8
89.840477;91.470794;
extract_fex_cuda finished for frame 9
90.957639;91.413777;
90.588053;91.331204;
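A fixed 1 ms delay can be replaced by a bounded retry loop, so the caller waits only as long as the score is actually unavailable. The following is a minimal sketch, not libvmaf code: the `score_fn` callback type, `score_with_retry()`, and the `stub_score()` stand-in are hypothetical names introduced for illustration. In a real program the callback would wrap vmaf_score_at_index(). It assumes the failure is transient, which the delay experiment above suggests but does not prove.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: returns 0 and fills *score on success,
 * non-zero while the score for `index` is not available yet.
 * A real callback would wrap vmaf_score_at_index(). */
typedef int (*score_fn)(void *ctx, unsigned index, double *score);

/* Call `fn` until it succeeds or `max_attempts` calls have been made.
 * Sleeps `delay_us` microseconds between attempts, doubling each time.
 * Returns 0 on success, otherwise the last error code from `fn`. */
static int score_with_retry(score_fn fn, void *ctx, unsigned index,
                            double *score, unsigned max_attempts,
                            long delay_us)
{
    assert(fn != NULL && score != NULL && max_attempts > 0);
    int err = -1;
    for (unsigned attempt = 0; attempt < max_attempts; attempt++) {
        err = fn(ctx, index, score);
        if (err == 0)
            return 0;
        if (attempt + 1 < max_attempts) {
            struct timespec ts;
            ts.tv_sec = delay_us / 1000000L;
            ts.tv_nsec = (delay_us % 1000000L) * 1000L;
            nanosleep(&ts, NULL);
            delay_us *= 2; /* exponential backoff */
        }
    }
    return err;
}

/* Stand-in for the real scorer (assumption, for demonstration only):
 * fails `failures_left` times, then succeeds with a dummy score. */
struct stub_state {
    unsigned failures_left;
    unsigned calls;
};

static int stub_score(void *ctx, unsigned index, double *score)
{
    struct stub_state *s = ctx;
    s->calls++;
    if (s->failures_left > 0) {
        s->failures_left--;
        return -1;
    }
    *score = 90.0 + index;
    return 0;
}
```

A call such as `score_with_retry(stub_score, &state, 1, &score, 5, 100)` makes at most five attempts, starting with a 100 µs pause. This only hides the race; it does not remove it, and each failed attempt against the real library would still log an error like the one shown above.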

How can I get the calculated VMAF values as soon as possible and fix this issue in CUDA mode? Please advise.

My configuration:
Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
NVIDIA GeForce GT 1030: 1 x 3 @ 1518Mhz (OpenCL 3.0 CUDA)
x86-64 Ubuntu 22.04.1 Linux
CUTA toolkit 12.1
vmaf:
commit bf80018c3f21151b7c90324da95ef2de80f49297 (HEAD -> master, origin/master, origin/HEAD)
Author: Maximilian Müller maximilianm@nvidia.com
Date: Tue Mar 14 19:10:20 2023 +0100

reenable CPU mutlithreading in combination with CUDA
avstrakhov commented 1 year ago

Sorry, CUDA Toolkit 12.1

kylophone commented 1 year ago

The call to vmaf_score_at_index will fail if the feature scores have not been written yet; this is a synchronization issue. It can happen whenever the feature extractors run in a different CPU thread or, now, on a GPU. I guess we have two ways to solve this from the libvmaf API:

avstrakhov commented 1 year ago

Thank you very much for the quick reply! In our case, it seems the metric value for frame 1 had already been calculated by the CUDA kernels and written by the time the error was reported:

extract_fex_cuda() finished for frame 0
extract_fex_cuda() finished for frame 1 // it's output of printf added immediately before return from extract_fex_cuda()
91.549303;91.549303;
extract_fex_cuda() finished for frame 2
libvmaf ERROR vmaf_predict_score_at_index(): no feature 'VMAF_integer_feature_motion2_score' at index 1
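
The synchronization issue discussed in this thread can be modelled outside libvmaf. The sketch below is a toy model in plain C with pthreads, not libvmaf's implementation: `score_store`, `store_try_get()`, `store_get_blocking()`, `store_put()`, and `producer()` are all hypothetical names. It contrasts a non-blocking read, which fails while a score is unwritten (the behaviour seen in the error above), with a blocking read that waits on a condition variable until the producer has written the score.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define N_FRAMES 4

/* Toy score store: a stand-in for a feature collector, not libvmaf's. */
struct score_store {
    pthread_mutex_t lock;
    pthread_cond_t ready;
    bool written[N_FRAMES];
    double value[N_FRAMES];
};

static void store_init(struct score_store *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->ready, NULL);
    for (unsigned i = 0; i < N_FRAMES; i++)
        s->written[i] = false;
}

/* Non-blocking read: returns -1 if the score has not been written yet. */
static int store_try_get(struct score_store *s, unsigned i, double *out)
{
    assert(i < N_FRAMES);
    int err = -1;
    pthread_mutex_lock(&s->lock);
    if (s->written[i]) {
        *out = s->value[i];
        err = 0;
    }
    pthread_mutex_unlock(&s->lock);
    return err;
}

/* Blocking read: waits until the producer has written score `i`. */
static void store_get_blocking(struct score_store *s, unsigned i, double *out)
{
    assert(i < N_FRAMES);
    pthread_mutex_lock(&s->lock);
    while (!s->written[i]) /* loop guards against spurious wakeups */
        pthread_cond_wait(&s->ready, &s->lock);
    *out = s->value[i];
    pthread_mutex_unlock(&s->lock);
}

/* Write score `i` and wake every waiting reader. */
static void store_put(struct score_store *s, unsigned i, double v)
{
    assert(i < N_FRAMES);
    pthread_mutex_lock(&s->lock);
    s->value[i] = v;
    s->written[i] = true;
    pthread_cond_broadcast(&s->ready);
    pthread_mutex_unlock(&s->lock);
}

/* Producer thread: stands in for an asynchronous feature extractor. */
static void *producer(void *arg)
{
    struct score_store *s = arg;
    for (unsigned i = 0; i < N_FRAMES; i++)
        store_put(s, i, 90.0 + i);
    return NULL;
}
```

With this model, a reader that calls `store_try_get()` before the producer has run gets an error, while `store_get_blocking()` returns only once the requested score exists, at the cost of stalling the calling thread.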
avstrakhov commented 1 year ago

Any solution will be greatly appreciated.