Open olidietzel opened 9 years ago
Hi,
This project supports only Mali-T6xx GPUs (OpenCL). All performance optimization, validation, etc. was done only for Mali GPUs. It would work functionally on any OpenCL platform with an integrated GPU, such as Intel, though performance is not guaranteed on those platforms. It would not work on OpenCL platforms based on discrete cards such as Nvidia or AMD graphics cards. Please note that this project has become obsolete, because WebM libvpx improved its quality significantly by changing its algorithms. We are now working on a new OpenCL project, libvpx-1, based on the quality of the latest WebM libvpx. libvpx-1 is not yet complete; it is a work in progress. You can track that project for the latest updates.
Hi, @ittiamvpx.
You have issues at libvpx-1 closed, so I hope you don't mind me asking here.
Could you please tell me the current state of the libvpx-1 project? Is it possible to build and run that encoder on a machine with a discrete GPU (e.g. Nvidia)? Are you going to support discrete cards in the future, or do you have only a specific list of cards to support (like in this project)?
Thanks!
Hi Kagami,
The GPU acceleration of the VP9 encoder in the libvpx-1 repository is targeted at real-time encoding presets only, and in particular at specific cpu speeds. The workspace is under development, but the package as is was tested on integrated GPUs (Mali and Intel HD Graphics) for quality and performance and is stable. We did not test on discrete graphics cards, but we believe that we did not do anything in particular that limits its usage to integrated GPUs. As of now we do not have a road map towards support for discrete cards.
Thanks Ram.
Hi, @ram-mohan.
Thanks for the answer.
We did not test on discrete graphic cards but we believe that we did not do anything in particular that limits its usage only for Integrated GPU
I built the most recent commit of the libvpx-1 repo (https://github.com/ittiamvpx/libvpx-1/commit/14a8f3edf9c46a9847f050db53b5c99e1d9ce918) and it segfaults right after starting with the --gpu option enabled (without it everything works):
./configure --enable-opencl --opencl-lib=/opt/cuda/lib64/libOpenCL.so --disable-unit-tests --disable-vp8 --enable-debug
make -j8
./vpxenc park_joy_420_720p50.y4m --gpu --codec=vp9 -o test.webm
Trace:
Program received signal SIGSEGV, Segmentation fault.
end (worker=0x0) at vpx_util/vpx_thread.c:148
148 if (worker->impl_ != NULL) {
(gdb) bt
#0 end (worker=0x0) at vpx_util/vpx_thread.c:148
#1 0x0000000000485f87 in vp9_remove_compressor (cpi=0x7ffff6ad2020) at vp9/encoder/vp9_encoder.c:2121
#2 0x0000000000486480 in vp9_create_compressor (oxcf=oxcf@entry=0x835ca8, pool=0x83e810)
at vp9/encoder/vp9_encoder.c:1674
#3 0x00000000004770f7 in encoder_init (ctx=<optimized out>, data=<optimized out>) at vp9/vp9_cx_iface.c:812
#4 0x0000000000473e60 in vpx_codec_enc_init_ver (ctx=ctx@entry=0x824040, iface=<optimized out>,
cfg=cfg@entry=0x823c70, flags=<optimized out>, ver=ver@entry=11) at vpx/src/vpx_encoder.c:54
#5 0x0000000000403c69 in initialize_encoder (global=0x7fffffffdd60, stream=0x823c60) at vpxenc.c:1526
#6 main (argc=<optimized out>, argv_=<optimized out>) at vpxenc.c:2076
I have an Nvidia GTX 970 with proprietary drivers. I also built a version without multithreading, and in that case it segfaults inside vp9_aq_cyclicrefresh.c:
Program received signal SIGSEGV, Segmentation fault.
vp9_cyclic_refresh_free (cr=0x0) at vp9/encoder/vp9_aq_cyclicrefresh.c:47
47 vpx_free(cr->map);
(gdb) bt
#0 vp9_cyclic_refresh_free (cr=0x0) at vp9/encoder/vp9_aq_cyclicrefresh.c:47
#1 0x0000000000485c81 in dealloc_compressor_data (cpi=0x7ffff6ad2020) at vp9/encoder/vp9_encoder.c:372
#2 vp9_remove_compressor (cpi=0x7ffff6ad2020) at vp9/encoder/vp9_encoder.c:2131
#3 0x0000000000485fc0 in vp9_create_compressor (oxcf=oxcf@entry=0x833ca8, pool=0x83c810)
at vp9/encoder/vp9_encoder.c:1674
#4 0x0000000000476c40 in encoder_init (ctx=<optimized out>, data=<optimized out>) at vp9/vp9_cx_iface.c:812
#5 0x00000000004739c0 in vpx_codec_enc_init_ver (ctx=ctx@entry=0x822040, iface=<optimized out>,
cfg=cfg@entry=0x821c70, flags=<optimized out>, ver=ver@entry=11) at vpx/src/vpx_encoder.c:54
#6 0x00000000004037c9 in initialize_encoder (global=0x7fffffffdd60, stream=0x821c60) at vpxenc.c:1526
#7 main (argc=<optimized out>, argv_=<optimized out>) at vpxenc.c:2076
As of now we do not have any a road map towards support for discrete cards
Ok, I understand. I can provide additional debug info about my configuration/build if needed, though.
Regards.
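Both backtraces above fault in teardown code that is reached from vp9_create_compressor() and handed a NULL pointer (worker=0x0 in one case, cr=0x0 in the other). The following is only a minimal sketch of a NULL-tolerant free helper, not code from libvpx-1; the type and function names are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a sub-object that init may never have allocated. */
typedef struct {
  unsigned char *map;
} CyclicRefresh;

/* Free helper that tolerates NULL, the way free() itself does. */
static void cyclic_refresh_free(CyclicRefresh *cr) {
  if (cr == NULL) return; /* nothing was allocated before init failed */
  free(cr->map);
  free(cr);
}

/* Returns 0 on a normal run, -1 when teardown ran after a failed init. */
static int create_then_teardown(int fail_early) {
  CyclicRefresh *cr = NULL;
  if (fail_early) {
    /* Init bailed out before allocating, as with cr=0x0 in the trace. */
    assert(cr == NULL);
  } else {
    cr = calloc(1, sizeof(*cr));
    if (cr != NULL) cr->map = calloc(16, 1);
  }
  cyclic_refresh_free(cr); /* safe in both cases */
  return fail_early ? -1 : 0;
}
```

With a guard like this, a failed initialisation would surface as an error return instead of a second crash inside the cleanup path.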
Looking at the failure, it seems that the application you are running is unable to open the kernel files for compilation. In the file "vp9_eopencl.c" there is a macro called PREFIX_PATH. This path is used to locate the OpenCL kernel files. Try modifying this relative path so that the *.cl files can be opened, and check whether the kernel build calls made in vp9_eopencl_init() succeed.
We recommend the following configuration for encoding:
./vpxenc --target-bitrate=1000 --ivf --rt --cpu-used=-6 --end-usage=cbr --undershoot-pct=50 --overshoot-pct=50 --buf-sz=1000 --buf-initial-sz=500 --buf-optimal-sz=600 --max-intra-rate=300 --limit=1000 --profile=0 --lag-in-frames=0 --min-q=2 --max-q=52 --passes=1 --kf-max-dist=99999 --kf-min-dist=0 --drop-frame=0 --static-thresh=0 --sharpness=0 --error-resilient=1 --codec=vp9 --gf-cbr-boost=200 --frame-parallel=0 --aq-mode=3 /home/testclips/gipsrestat720p.y4m --threads=1 -o out.ivf
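One way to check the PREFIX_PATH suggestion before rebuilding is to test whether a kernel file can actually be opened from the directory vpxenc is started in. This is a rough sketch under assumptions: the helper name is made up for illustration, and the caller has to supply the real name of a *.cl file from the repository.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if prefix + file_name can be opened for reading, else 0.
 * Both arguments are supplied by the caller; nothing here is libvpx-1 API. */
static int kernel_source_readable(const char *prefix, const char *file_name) {
  char path[1024];
  FILE *fp;
  int n;

  assert(prefix != NULL && file_name != NULL);
  n = snprintf(path, sizeof(path), "%s%s", prefix, file_name);
  if (n < 0 || (size_t)n >= sizeof(path)) return 0; /* path too long */

  fp = fopen(path, "rb");
  if (fp == NULL) {
    fprintf(stderr, "cannot open kernel source: %s\n", path);
    return 0;
  }
  fclose(fp);
  return 1;
}
```

Because the prefix is relative, the result depends on the current working directory, so the check is only meaningful when run from the same place as the encoder.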
Thanks for your help! With that change:
diff --git a/vp9/encoder/opencl/vp9_eopencl.c b/vp9/encoder/opencl/vp9_eopencl.c
index 8e3fabf..f560155 100644
--- a/vp9/encoder/opencl/vp9_eopencl.c
+++ b/vp9/encoder/opencl/vp9_eopencl.c
@@ -17,7 +17,7 @@
#if ARCH_ARM
#define PREFIX_PATH "./"
#else
-#define PREFIX_PATH "../../vp9/encoder/opencl/"
+#define PREFIX_PATH "./vp9/encoder/opencl/"
#endif
static const int pixel_rows_per_workitem_log2_pro_me = 4;
I was able to successfully encode 1 frame of video with the --gpu option. Videos with more than 1 frame fail with a different error:
(gdb) run
Starting program: vpxenc --gpu --target-bitrate=1000 --ivf --rt --cpu-used=-6 --end-usage=cbr --undershoot-pct=50 --overshoot-pct=50 --buf-sz=1000 --buf-initial-sz=500 --buf-optimal-sz=600 --max-intra-rate=300 --limit=1000 --profile=0 --lag-in-frames=0 --min-q=2 --max-q=52 --passes=1 --kf-max-dist=99999 --kf-min-dist=0 --drop-frame=0 --static-thresh=0 --sharpness=0 --error-resilient=1 --codec=vp9 --gf-cbr-boost=200 --frame-parallel=0 --aq-mode=3 2frames.y4m --threads=1 -o test.ivf
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Pass 1/1 frame 2/1 24101B 46612 us 42.91 fps [ETA 0:00:46] vpxenc: vp9/encoder/vp9_egpu.c:395: vp9_enc_sync_gpu: Assertion `gpu_output_buffer - cpi->gpu_output_pro_me_base == size' failed.
[New Thread 0x7fffe078f700 (LWP 22058)]
[New Thread 0x7fffe3fff700 (LWP 22057)]
[New Thread 0x7fffe8b1b700 (LWP 22056)]
[New Thread 0x7fffe931c700 (LWP 22055)]
[New Thread 0x7fffe9b1d700 (LWP 22054)]
[New Thread 0x7fffea31e700 (LWP 22053)]
[New Thread 0x7fffeabff700 (LWP 22052)]
[New Thread 0x7ffff397d700 (LWP 22051)]
Program received signal SIGABRT, Aborted.
0x00007ffff6dc3167 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff6dc3167 in raise () from /lib64/libc.so.6
#1 0x00007ffff6dc44ca in abort () from /lib64/libc.so.6
#2 0x00007ffff6dbc296 in ?? () from /lib64/libc.so.6
#3 0x00007ffff6dbc342 in __assert_fail () from /lib64/libc.so.6
#4 0x00000000004a0b19 in vp9_enc_sync_gpu (cpi=cpi@entry=0x7ffff6ad2020, td=td@entry=0x7ffff6ade020,
mi_row=mi_row@entry=32, mi_row_step=mi_row_step@entry=8) at vp9/encoder/vp9_egpu.c:395
#5 0x000000000052f9bb in encode_sb_rows (mi_row_start=0, mi_row_step=8, mi_row_end=90, td=0x7ffff6ade020,
cpi=0x7ffff6ad2020) at vp9/encoder/vp9_encodeframe.c:4046
#6 encode_tiles (cpi=0x7ffff6ad2020) at vp9/encoder/vp9_encodeframe.c:4137
#7 encode_frame_internal (cpi=cpi@entry=0x7ffff6ad2020) at vp9/encoder/vp9_encodeframe.c:4349
#8 0x0000000000530551 in vp9_encode_frame (cpi=cpi@entry=0x7ffff6ad2020) at vp9/encoder/vp9_encodeframe.c:4554
#9 0x0000000000489dc8 in encode_without_recode_loop (cpi=0x7ffff6ad2020) at vp9/encoder/vp9_encoder.c:3366
#10 encode_frame_to_data_rate (cpi=cpi@entry=0x7ffff6ad2020, size=size@entry=0x7fffffffd7a8,
dest=dest@entry=0x7fffe8077010 "\203I\203B", frame_flags=frame_flags@entry=0x7fffffffd794)
at vp9/encoder/vp9_encoder.c:3870
#11 0x000000000048c1ba in Pass0Encode (frame_flags=<optimized out>, dest=<optimized out>, size=<optimized out>,
cpi=<optimized out>) at vp9/encoder/vp9_encoder.c:4022
#12 vp9_get_compressed_data (cpi=cpi@entry=0x7ffff6ad2020, frame_flags=frame_flags@entry=0x7fffffffd794,
size=size@entry=0x7fffffffd7a8, dest=dest@entry=0x7fffe8077010 "\203I\203B",
time_stamp=time_stamp@entry=0x7fffffffd798, time_end=time_end@entry=0x7fffffffd7a0, flush=1)
at vp9/encoder/vp9_encoder.c:4472
#13 0x000000000048338c in encoder_encode (ctx=0x834e40, img=0x0, pts=<optimized out>, duration=<optimized out>,
flags=<optimized out>, deadline=<optimized out>) at vp9/vp9_cx_iface.c:1060
#14 0x0000000000474340 in vpx_codec_encode (ctx=ctx@entry=0x824110, img=img@entry=0x0, pts=pts@entry=20,
duration=duration@entry=20, flags=flags@entry=0, deadline=<optimized out>) at vpx/src/vpx_encoder.c:223
#15 0x0000000000403f40 in encode_frame (global=0x7fffffffdaf0, global=0x7fffffffdaf0, global=0x7fffffffdaf0,
frames_in=2, img=0x0, stream=0x823d30) at vpxenc.c:1642
#16 main (argc=<optimized out>, argv_=<optimized out>) at vpxenc.c:2169
we recommend following configuration for encoding
Nice, thanks. It doesn't seem to include the --gpu flag though?
Yes, I noticed that the --gpu flag is missing. Sorry about that. There seems to be an assertion failure in the function vp9_enc_sync_gpu (file vp9_egpu.c, line 395). Can you share the left-hand and right-hand values of the comparison?
(gdb) break vp9_egpu.c:395
Breakpoint 1 at 0x4a0a2c: file vp9/encoder/vp9_egpu.c, line 395.
(gdb) run
Starting program: vpxenc --gpu --target-bitrate=1000 --ivf --rt --cpu-used=-6 --end-usage=cbr --undershoot-pct=50 --overshoot-pct=50 --buf-sz=1000 --buf-initial-sz=500 --buf-optimal-sz=600 --max-intra-rate=300 --limit=1000 --profile=0 --lag-in-frames=0 --min-q=2 --max-q=52 --passes=1 --kf-max-dist=99999 --kf-min-dist=0 --drop-frame=0 --static-thresh=0 --sharpness=0 --error-resilient=1 --codec=vp9 --gf-cbr-boost=200 --frame-parallel=0 --aq-mode=3 2frames.y4m --threads=1 -o test.ivf
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Pass 1/1 frame 2/1 24101B 47116 us 42.45 fps [ETA 0:00:47] [New Thread 0x7fffe078f700 (LWP 23165)]
[New Thread 0x7fffe3fff700 (LWP 23164)]
[New Thread 0x7fffe8b1b700 (LWP 23163)]
[New Thread 0x7fffe931c700 (LWP 23162)]
[New Thread 0x7fffe9b1d700 (LWP 23161)]
[New Thread 0x7fffea31e700 (LWP 23160)]
[New Thread 0x7fffeabff700 (LWP 23159)]
[New Thread 0x7ffff397d700 (LWP 23158)]
Breakpoint 1, vp9_enc_sync_gpu (cpi=cpi@entry=0x7ffff6ad2020, td=td@entry=0x7ffff6ade020, mi_row=mi_row@entry=0,
mi_row_step=mi_row_step@entry=8) at vp9/encoder/vp9_egpu.c:395
395 assert(gpu_output_buffer - cpi->gpu_output_pro_me_base == size);
(gdb) print gpu_output_buffer
$1 = (GPU_OUTPUT_PRO_ME *) 0x205e42200
(gdb) print cpi->gpu_output_pro_me_base
$2 = (GPU_OUTPUT_PRO_ME *) 0x205e42200
(gdb) print size
$3 = 0
(gdb) cont
Continuing.
Breakpoint 1, vp9_enc_sync_gpu (cpi=cpi@entry=0x7ffff6ad2020, td=td@entry=0x7ffff6ade020, mi_row=mi_row@entry=32,
mi_row_step=mi_row_step@entry=8) at vp9/encoder/vp9_egpu.c:395
395 assert(gpu_output_buffer - cpi->gpu_output_pro_me_base == size);
(gdb) print gpu_output_buffer
$4 = (GPU_OUTPUT_PRO_ME *) 0x205e49000
(gdb) print cpi->gpu_output_pro_me_base
$5 = (GPU_OUTPUT_PRO_ME *) 0x205e42200
(gdb) print size
$6 = 80
(gdb) cont
Continuing.
vpxenc: vp9/encoder/vp9_egpu.c:395: vp9_enc_sync_gpu: Assertion `gpu_output_buffer - cpi->gpu_output_pro_me_base == size' failed.
Program received signal SIGABRT, Aborted.
0x00007ffff6dc3167 in raise () from /lib64/libc.so.6
For 720p content, the right-hand value 80 is as expected, but I cannot make much of the left-hand value. Can you please share sizeof(GPU_OUTPUT_PRO_ME) on your platform and the actual difference 'gpu_output_buffer - cpi->gpu_output_pro_me_base' you are seeing?
The memory needed for the GPU interface buffers is allocated in vp9_eopencl_alloc_buffers(). Lines 431-465 allocate the part of the GPU output buffers that is under consideration here. Looking at the buffer/sub-buffer creation and their CPU-side map pointers is the key to solving this issue. As of now I do not have a setup similar to yours to reproduce this issue. Once I get hold of one, I will look into it.
Thanks, Ram.
Can you please share the sizeof(GPU_OUTPUT_PRO_ME) structure on your platform and the actual difference 'gpu_output_buffer - cpi->gpu_output_pro_me_base' you are seeing
I added debug prints near this line:
diff --git a/vp9/encoder/vp9_egpu.c b/vp9/encoder/vp9_egpu.c
index cb0e945..4610c75 100644
--- a/vp9/encoder/vp9_egpu.c
+++ b/vp9/encoder/vp9_egpu.c
@@ -390,8 +390,20 @@ void vp9_enc_sync_gpu(VP9_COMP *cpi, ThreadData *td, int mi_row, int mi_row_step
const int size = cm->sb_cols * sb_row;
(void) size;
+ printf("BEFORE p1=%p p2=%p diff=%ld size=%d sizeof=%zu\n",
+ gpu_output_buffer,
+ cpi->gpu_output_pro_me_base,
+ (gpu_output_buffer - cpi->gpu_output_pro_me_base),
+ size,
+ sizeof(GPU_OUTPUT_PRO_ME));
egpu->acquire_output_pro_me_buffer(cpi, (void **) &gpu_output_buffer,
subframe_idx);
+ printf("AFTER p1=%p p2=%p diff=%ld size=%d sizeof=%zu\n",
+ gpu_output_buffer,
+ cpi->gpu_output_pro_me_base,
+ (gpu_output_buffer - cpi->gpu_output_pro_me_base),
+ size,
+ sizeof(GPU_OUTPUT_PRO_ME));
assert(gpu_output_buffer - cpi->gpu_output_pro_me_base == size);
}
if (mi_row - mi_row_step == subframe.mi_row_start &&
Output:
BEFORE p1=0x6cab146c5e78fa00 p2=0x205e42200 diff=-6067348286818819520 size=0 sizeof=96
AFTER p1=0x205e42200 p2=0x205e42200 diff=0 size=0 sizeof=96
BEFORE p1=0x205e44000 p2=0x205e42200 diff=80 size=80 sizeof=96
AFTER p1=0x205e49000 p2=0x205e42200 diff=-6148914691236516912 size=80 sizeof=96
It seems the pointers are correct before the second acquire_output_pro_me_buffer call, but afterwards gpu_output_buffer changes and the byte difference is no longer equal to 96*80.
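The arithmetic behind the debug output above can be reproduced without the encoder. The sketch below uses only the addresses and the sizeof value printed there; the helper names are invented for illustration. In C, subtracting two pointers is only defined when both point into the same array, so when the byte distance is not a whole multiple of the element size the printed diff is meaningless, which fits the huge negative number in the second AFTER line.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Byte distance between two addresses copied from the debug output. */
static uint64_t byte_distance(uint64_t p, uint64_t base) {
  assert(p >= base);
  return p - base;
}

/* 1 if the byte distance is a whole number of elements, else 0. */
static int is_whole_elements(uint64_t bytes, uint64_t elem_size) {
  assert(elem_size > 0);
  return bytes % elem_size == 0;
}

/* Prints the distance in bytes, in whole elements, and the remainder. */
static void report(const char *label, uint64_t p, uint64_t base,
                   uint64_t elem_size) {
  uint64_t bytes = byte_distance(p, base);
  printf("%s: %llu bytes, %llu whole elements, remainder %llu\n", label,
         (unsigned long long)bytes,
         (unsigned long long)(bytes / elem_size),
         (unsigned long long)(bytes % elem_size));
}
```

Calling report("BEFORE", 0x205e44000, 0x205e42200, 96) gives 7680 bytes, which is exactly 80 elements. Calling report("AFTER", 0x205e49000, 0x205e42200, 96) gives 28160 bytes, which is 293 elements with 32 bytes left over, so the second mapped pointer cannot be an element of the same array as the base.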
In vp9_eopencl_alloc_buffers() memory needed for gpu interface buffers is allocated. Lines 431-465 represent allocation of a part of gpu output buffers that is currently under consideration. Looking at the buffer/sub-buffer creation and their cpu side map pointers is the key for solving this issue.
I'll try to look into it, thanks.
Hi @ram-mohan ,
Got the same assertion error in vp9_enc_sync_gpu(). My understanding is that vp9_opencl_map_buffer() doesn't return contiguous host addresses for the mapped pointers of different sub-frames. The cause might be clEnqueueMapBuffer() itself, or other host memory allocations made between the two map calls.
Is there any particular reason to assume those pointers are contiguous?
Thanks, mingtotti
Hi mingtotti,
Yes, we were able to reproduce this issue. As you pointed out, the host pointers for different sub-buffers were not contiguous. We made that assumption out of general intuition, and it seems that it is not valid as per the OpenCL specification. We have made the necessary changes on our side and will push them soon.
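One possible shape for such a change, shown purely as a sketch and not as the fix that was actually made, is to keep each mapped host pointer in a table indexed by sub-frame instead of deriving it from a base pointer plus an offset. Mapping is simulated with calloc here so the example runs without OpenCL; every name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_SUB_FRAMES 4 /* hypothetical limit for this sketch */

/* Stand-in for GPU_OUTPUT_PRO_ME (96 bytes when int is 4 bytes). */
typedef struct { int data[24]; } OutputEntry;

typedef struct {
  OutputEntry *mapped[MAX_SUB_FRAMES]; /* one host pointer per sub-buffer */
  size_t entries[MAX_SUB_FRAMES];      /* element count of each mapping */
} SubFrameMap;

/* Simulates mapping one sub-buffer; the address may land anywhere. */
static int map_sub_frame(SubFrameMap *m, int idx, size_t entries) {
  assert(idx >= 0 && idx < MAX_SUB_FRAMES);
  m->mapped[idx] = calloc(entries, sizeof(OutputEntry));
  m->entries[idx] = entries;
  return m->mapped[idx] != NULL ? 0 : -1;
}

/* Looks an entry up through the table, never via base + offset. */
static OutputEntry *entry_at(SubFrameMap *m, int idx, size_t i) {
  assert(idx >= 0 && idx < MAX_SUB_FRAMES);
  if (m->mapped[idx] == NULL || i >= m->entries[idx]) return NULL;
  return &m->mapped[idx][i];
}

/* Releases every simulated mapping and clears the table. */
static void unmap_all(SubFrameMap *m) {
  for (int i = 0; i < MAX_SUB_FRAMES; ++i) {
    free(m->mapped[i]);
    m->mapped[i] = NULL;
    m->entries[i] = 0;
  }
}
```

With a table like this, no code needs to assert anything about the distance between two mapped pointers, because each sub-frame is addressed only through its own entry.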
Thanks Ram.
Hi Ram,
That would be great!
Thanks, Totti
Hello, I know it has been over a year, but I am trying to encode VP9 with GPU acceleration, so I tried libvpx-1. I could compile it and start encoding with this:
vpxenc test.y4m --gpu --codec=vp9 -o test.webm
It ran through the first pass but then got stuck, as you can see here:
Pass 1/2 frame 500/501 92184B 1474b/f 73747b/s 19342 ms (25.85 fps)
Pass 2/2 frame 2/0 0B 19357 ms 6.20 fpm [ETA unknown] Segmentation fault (core dumped)
By the way, sometimes it stops later (here I added --best):
vpxenc test.y4m --gpu --best --codec=vp9 -o test.webm
Pass 1/2 frame 500/501 92184B 1474b/f 73747b/s 18917 ms (26.43 fps)
Pass 2/2 frame 104/102 573299B 238399 ms 26.17 fpm [ETA 0:15:28] Segmentation fault (core dumped)
With VP8 it behaves differently, but it is slow:
vpxenc test.y4m --gpu --codec=vp8 -o test.webm
Pass 1/1 frame 500/500 4509263B 72148b/f 3607410b/s 192750 ms (2.59 fps)
Does anyone have any idea how to help me? Thanks for your help.
Dominique
For me the latest version also fails with sigsegv:
(gdb) run /home/senya/Загрузки//park_joy_420_720p50.y4m --gpu --codec=vp9 -o test.webm
Starting program: /home/senya/source/libvpx-1/vpxenc /home/senya/Загрузки//park_joy_420_720p50.y4m --gpu --codec=vp9 -o test.webm
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffee800700 (LWP 28054)]
[New Thread 0x7fffedfff700 (LWP 28055)]
[New Thread 0x7fffed7fe700 (LWP 28056)]
[New Thread 0x7fffecffd700 (LWP 28057)]
[New Thread 0x7fffec7fc700 (LWP 28058)]
[New Thread 0x7fffebffb700 (LWP 28059)]
[New Thread 0x7fffeb7fa700 (LWP 28060)]
[New Thread 0x7fffeaff9700 (LWP 28061)]
Pass 1/2 frame 1/0 0B 0 us 0.00 fpm [ETA unknown]
Thread 1 "vpxenc" received signal SIGSEGV, Segmentation fault.
__memset_sse2 () at ../sysdeps/x86_64/multiarch/../memset.S:78
78 ../sysdeps/x86_64/multiarch/../memset.S: No such file or directory.
(gdb) where
#0 __memset_sse2 () at ../sysdeps/x86_64/multiarch/../memset.S:78
#1 0x00000000004f4582 in vp9_gpu_get_frame_buffer (
cb_priv=cb_priv@entry=0x7fffffffd2a0, min_size=min_size@entry=2496223,
fb=fb@entry=0x7fffffffd2b0) at vp9/common/vp9_gpu.c:32
#2 0x00000000004b98a8 in vpx_realloc_frame_buffer (ybf=0x132f9320,
width=width@entry=1280, height=height@entry=720, ss_x=ss_x@entry=1,
ss_y=ss_y@entry=1, border=border@entry=160, byte_alignment=64,
fb=0x7fffffffd2b0, cb=0x4f4500 <vp9_gpu_get_frame_buffer>,
cb_priv=0x7fffffffd2a0) at vpx_scale/generic/yv12config.c:182
#3 0x0000000000525117 in vp9_lookahead_init (cm=cm@entry=0x7ffff1799c80,
width=1280, height=720, subsampling_x=1, subsampling_y=1,
depth=<optimized out>) at vp9/encoder/vp9_lookahead.c:100
#4 0x0000000000481aa1 in alloc_raw_frame_buffers (cpi=0x7ffff1773020)
at vp9/encoder/vp9_encoder.c:640
#5 check_initial_width (subsampling_y=1, subsampling_x=1, cpi=0x7ffff1773020)
at vp9/encoder/vp9_encoder.c:4342
#6 vp9_receive_raw_frame (cpi=cpi@entry=0x7ffff1773020, frame_flags=0,
sd=sd@entry=0x7fffffffd410, time_stamp=0, end_time=200000)
at vp9/encoder/vp9_encoder.c:4371
#7 0x000000000046d868 in encoder_encode (ctx=0x89f710, img=0x7fffffffd7d0,
pts=0, duration=1, enc_flags=<optimized out>, deadline=<optimized out>)
at vp9/vp9_cx_iface.c:1053
#8 0x000000000046a6e7 in vpx_codec_encode (ctx=ctx@entry=0x88ca40,
---Type <return> to continue, or q <return> to quit---
I use Intel HD Graphics and the latest revision of Beignet as the OpenCL library.
I also tried to compile with nvidia-opencl provided in Ubuntu 18.04:
./configure --enable-opencl --opencl-lib=/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1 --disable-unit-tests --disable-vp8 --enable-debug --enable-pic
Unfortunately, I get some errors in the linking phase:
...
[LD] vpxdec
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_remove':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1326: undefined reference to `clReleaseEvent'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1326: undefined reference to `clReleaseEvent'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1335: undefined reference to `clReleaseKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1342: undefined reference to `clReleaseKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1349: undefined reference to `clReleaseKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1356: undefined reference to `clReleaseKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1363: undefined reference to `clReleaseKernel'
./libvpx.a(vp9_eopencl.c.o):/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1371: more undefined references to `clReleaseKernel' follow
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_set_dynamic_kernel_args_pro_me':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:247: undefined reference to `clSetKernelArg'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:249: undefined reference to `clSetKernelArg'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:255: undefined reference to `clSetKernelArg'
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_execute_prologue':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:846: undefined reference to `clEnqueueUnmapMemObject'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:883: undefined reference to `clEnqueueNDRangeKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:900: undefined reference to `clEnqueueNDRangeKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:917: undefined reference to `clEnqueueNDRangeKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:934: undefined reference to `clEnqueueNDRangeKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:968: undefined reference to `clFlush'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:857: undefined reference to `clEnqueueUnmapMemObject'
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_set_dynamic_kernel_args_pro_me':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:261: undefined reference to `clSetKernelArg'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:263: undefined reference to `clSetKernelArg'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:265: undefined reference to `clSetKernelArg'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:271: undefined reference to `clSetKernelArg'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:273: undefined reference to `clSetKernelArg'
./libvpx.a(vp9_eopencl.c.o):/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:275: more undefined references to `clSetKernelArg' follow
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_execute_prologue':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:958: undefined reference to `clEnqueueMapBuffer'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:945: undefined reference to `clEnqueueMapBuffer'
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_execute':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1032: undefined reference to `clEnqueueUnmapMemObject'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1080: undefined reference to `clEnqueueNDRangeKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1088: undefined reference to `clEnqueueMapBuffer'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1144: undefined reference to `clEnqueueNDRangeKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1163: undefined reference to `clEnqueueNDRangeKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1184: undefined reference to `clEnqueueNDRangeKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1199: undefined reference to `clEnqueueNDRangeKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1242: undefined reference to `clEnqueueNDRangeKernel'
./libvpx.a(vp9_eopencl.c.o):/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1257: more undefined references to `clEnqueueNDRangeKernel' follow
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_execute':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1279: undefined reference to `clEnqueueMapBuffer'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1289: undefined reference to `clReleaseEvent'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1295: undefined reference to `clEnqueueMarker'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1300: undefined reference to `clFlush'
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_set_dynamic_kernel_args_me':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:320: undefined reference to `clSetKernelArg'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:322: undefined reference to `clSetKernelArg'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:324: undefined reference to `clSetKernelArg'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:326: undefined reference to `clSetKernelArg'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:333: undefined reference to `clSetKernelArg'
./libvpx.a(vp9_eopencl.c.o):/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:335: more undefined references to `clSetKernelArg' follow
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_execute':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1268: undefined reference to `clEnqueueMapBuffer'
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_free_buffers':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:566: undefined reference to `clReleaseMemObject'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:572: undefined reference to `clReleaseMemObject'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:583: undefined reference to `clReleaseMemObject'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:598: undefined reference to `clReleaseMemObject'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:605: undefined reference to `clReleaseMemObject'
./libvpx.a(vp9_eopencl.c.o):/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:615: more undefined references to `clReleaseMemObject' follow
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_alloc_buffers':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:408: undefined reference to `clCreateBuffer'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:414: undefined reference to `clCreateBuffer'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:420: undefined reference to `clCreateBuffer'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:426: undefined reference to `clCreateBuffer'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:434: undefined reference to `clCreateBuffer'
./libvpx.a(vp9_eopencl.c.o):/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:442: more undefined references to `clCreateBuffer' follow
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_alloc_buffers':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:469: undefined reference to `clCreateSubBuffer'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:479: undefined reference to `clCreateBuffer'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:492: undefined reference to `clCreateBuffer'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:499: undefined reference to `clCreateBuffer'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:506: undefined reference to `clCreateBuffer'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:532: undefined reference to `clCreateSubBuffer'
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_set_static_kernel_args':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:73: undefined reference to `clSetKernelArg'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:75: undefined reference to `clSetKernelArg'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:77: undefined reference to `clSetKernelArg'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:79: undefined reference to `clSetKernelArg'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:81: undefined reference to `clSetKernelArg'
./libvpx.a(vp9_eopencl.c.o):/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:83: more undefined references to `clSetKernelArg' follow
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_enc_sync_read':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:755: undefined reference to `clWaitForEvents'
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_build_choose_partitioning_kernel':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1762: undefined reference to `clCreateProgramWithSource'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1774: undefined reference to `clBuildProgram'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1804: undefined reference to `clCreateKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1809: undefined reference to `clCreateKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1814: undefined reference to `clCreateKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1819: undefined reference to `clCreateKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1824: undefined reference to `clCreateKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1828: undefined reference to `clReleaseProgram'
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_build_zeromv_kernel':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1617: undefined reference to `clCreateProgramWithSource'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1630: undefined reference to `clBuildProgram'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1660: undefined reference to `clCreateKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1664: undefined reference to `clReleaseProgram'
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_build_rd_kernel':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1689: undefined reference to `clCreateProgramWithSource'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1702: undefined reference to `clBuildProgram'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1732: undefined reference to `clCreateKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1737: undefined reference to `clCreateKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1741: undefined reference to `clReleaseProgram'
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_build_fullpel_kernel':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1545: undefined reference to `clCreateProgramWithSource'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1558: undefined reference to `clBuildProgram'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1588: undefined reference to `clCreateKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1592: undefined reference to `clReleaseProgram'
./libvpx.a(vp9_eopencl.c.o): In function `vp9_eopencl_build_subpel_kernel':
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1468: undefined reference to `clCreateProgramWithSource'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1481: undefined reference to `clBuildProgram'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1510: undefined reference to `clCreateKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1515: undefined reference to `clCreateKernel'
/home/user/libvpx-1/vp9/encoder/opencl/vp9_eopencl.c:1520: undefined reference to `clReleaseProgram'
./libvpx.a(vp9_opencl.c.o): In function `vp9_opencl_remove':
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:181: undefined reference to `clReleaseCommandQueue'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:187: undefined reference to `clReleaseCommandQueue'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:193: undefined reference to `clReleaseContext'
./libvpx.a(vp9_opencl.c.o): In function `vp9_opencl_release_frame_buffers':
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:79: undefined reference to `clEnqueueUnmapMemObject'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:83: undefined reference to `clFinish'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:89: undefined reference to `clReleaseMemObject'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:96: undefined reference to `clEnqueueUnmapMemObject'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:100: undefined reference to `clFinish'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:107: undefined reference to `clReleaseMemObject'
./libvpx.a(vp9_opencl.c.o): In function `vp9_opencl_alloc_frame_buffers':
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:28: undefined reference to `clCreateBuffer'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:33: undefined reference to `clEnqueueMapBuffer'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:49: undefined reference to `clCreateSubBuffer'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:54: undefined reference to `clEnqueueMapBuffer'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:58: undefined reference to `clReleaseMemObject'
./libvpx.a(vp9_opencl.c.o): In function `vp9_opencl_map_buffer':
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:131: undefined reference to `clEnqueueMapBuffer'
./libvpx.a(vp9_opencl.c.o): In function `vp9_opencl_unmap_buffer':
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:154: undefined reference to `clEnqueueUnmapMemObject'
./libvpx.a(vp9_opencl.c.o): In function `vp9_opencl_init':
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:230: undefined reference to `clGetPlatformIDs'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:235: undefined reference to `clGetPlatformIDs'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:240: undefined reference to `clGetDeviceIDs'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:249: undefined reference to `clGetDeviceIDs'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:256: undefined reference to `clCreateContext'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:262: undefined reference to `clCreateCommandQueue'
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:268: undefined reference to `clCreateCommandQueue'
./libvpx.a(vp9_opencl.c.o): In function `vp9_opencl_unmap_buffer':
/home/user/libvpx-1/vp9/common/opencl/vp9_opencl.c:163: undefined reference to `clFinish'
collect2: error: ld returned 1 exit status
/home/user/libvpx-1/examples.mk:298: recipe for target 'vpxdec' failed
make[1]: *** [vpxdec] Error 1
Makefile:17: recipe for target '.DEFAULT' failed
make: *** [.DEFAULT] Error 2
Am I doing something wrong, or is this an issue with ittiamvpx/libvpx-1?
Based on the errors, it seems that the OpenCL library is not being linked properly. Having said that, this library is not expected to work on devices with external graphics cards. We noticed a few issues when we tried it on them, and those were never fixed. This library was tested on Mali platforms.
Thanks!
Based on the errors, it seems that the OpenCL library is not being linked properly.
So it is likely that Ubuntu 18.04 provides a bogus OpenCL library for NVIDIA?
Having said that, this library is not expected to work on devices with external graphics cards. We noticed a few issues when we tried it on them, and those were never fixed. This library was tested on Mali platforms.
You say you tried them. Were you able to successfully encode anything at all?
So it is likely that Ubuntu 18.04 provides a bogus opencl library for nvidia?
Not true. The OpenCL library should be fine. I think the path you are passing to configure might be a symbolic link to the actual library. You should be using the actual library. As I said, any effort to bring up this build might not be of any use, because there are still known issues in it and you won't be able to use it.
It was tested on Mali platforms only. On NVIDIA, even if the build succeeds, I assume you will see crashes.
Further, the GPU acceleration done here is targeted at very low-end CPUs/GPUs, such as those in mobile phones. As you are running Ubuntu 18.04, you presumably have a high-end CPU, so the GPU work here may not be of any use to you.
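On the linking question above: the `undefined reference` errors come from the static link step, which means no library handed to the linker supplied those `cl*` symbols. One quick sanity check, sketched here in Python, is to confirm that the system's OpenCL library can be found and that it exports the functions named in the log. The helper name `check_opencl` and the default symbol list are illustrative only; they are not part of libvpx-1 or its build scripts.

```python
import ctypes
import ctypes.util


def check_opencl(symbols=("clGetPlatformIDs", "clCreateKernel", "clBuildProgram")):
    """Report which of the given OpenCL symbols the system library exports.

    Hypothetical helper: returns a dict with the library name found by the
    loader (or None) and the list of symbols that could not be resolved.
    """
    name = ctypes.util.find_library("OpenCL")
    if name is None:
        # No OpenCL library is visible to the dynamic loader at all.
        return {"library": None, "missing": list(symbols)}
    try:
        lib = ctypes.CDLL(name)
    except OSError:
        # The file exists but cannot be loaded (e.g. a dangling symlink).
        return {"library": name, "missing": list(symbols)}
    # hasattr() triggers a symbol lookup and is False when it fails.
    missing = [s for s in symbols if not hasattr(lib, s)]
    return {"library": name, "missing": missing}


if __name__ == "__main__":
    print(check_opencl())
```

If this reports no missing symbols, the library itself is probably fine and the problem is more likely in how it is passed to the link step. This check does not inspect or change any build flags.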
I'm interested in parallelizing VP9 encoding. I have a high-end CPU which does a good job, but I want to encode more video streams on the same machine. So the idea is to use a GPU in addition to the CPU: the CPU will encode some VP9 streams and the GPU will encode other VP9 streams at the same time. If the GPU can handle at least one 1080p30 VP9 encode, that fits my needs.
This workspace doesn't fit your requirements.
What do you mean by workspace?
I mean this project.
Why? Are you sure that GPU won't be able to handle 1080p30 VP9 encoding?
I understand there are issues with using it on discrete GPUs, but in theory my colleagues or I can fix these. If what I want to try is possible at all, then this project is a good starting point, isn't it?
If I understand you correctly, you want the GPU to encode an entire VP9 bitstream on its own. GPUs are not designed to do that. A GPU is not a parallel CPU. GPUs are co-processors that do some tasks better than a CPU and other tasks worse. In VP9 encoding, we identify algorithms that perform well on a GPU compared with a CPU, and these are offloaded to the GPU. The CPU does the actual encoding, but a few portions of the encoding process are moved to the GPU because it handles them better. This way we get some performance gain. In this project, we moved one or two modules in the real-time encoding preset to the GPU and saw some gains on Mali platforms. These gains are not universal; they may not be seen on other GPUs. I am not certain this is a good place to start. All I can say is that GPU acceleration is really tricky. Good luck.
Oh, I see now. Thanks for the explanation!
So in theory a discrete GPU can offload some part of the encoding process from an Intel CPU, but it will not necessarily be an improvement, since the actual bottleneck may be different from the one in your Mali case.
So first of all I have to find out whether there is a bottleneck on a recent Intel CPU that can be resolved by offloading to a GPU. If there is not, the approach will not work. If there is, I need to do work similar to what you did for libvpx, but for my case.
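The reasoning above is essentially Amdahl's law: the overall gain depends on what fraction of the encode time is offloaded, how much faster the GPU runs that part, and how much transfer and synchronization overhead the offload adds. A rough sketch in Python follows. The function name and all numbers are made up for illustration and are not measurements from libvpx-1.

```python
def offload_speedup(offload_fraction, gpu_speedup, overhead_fraction=0.0):
    """Estimate overall speedup when part of the work moves to a GPU.

    offload_fraction:  share of the original CPU time that is offloaded (0..1)
    gpu_speedup:       how many times faster the GPU runs that share
    overhead_fraction: extra time for transfers and sync, as a share of the
                       original total time (an assumed, simplified model)
    """
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be between 0 and 1")
    if gpu_speedup <= 0.0:
        raise ValueError("gpu_speedup must be positive")
    # New total time, with the original total normalized to 1.0.
    new_time = (1.0 - offload_fraction) + offload_fraction / gpu_speedup + overhead_fraction
    return 1.0 / new_time


if __name__ == "__main__":
    # Hypothetical scenarios: (offloaded share, GPU speedup, overhead share).
    for case in [(0.30, 4.0, 0.05), (0.10, 4.0, 0.05), (0.05, 2.0, 0.05)]:
        print(case, round(offload_speedup(*case), 3))
```

In this model a small offloaded share with noticeable overhead gives a result below 1.0, i.e. a slowdown, which matches the point that gains seen on one platform may not carry over to another.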
Hi, is it possible to use your OpenCL-based VP9 encoder in ffmpeg on a regular x86-64 Linux install?
I tried Fedora with NVIDIA OpenCL on a 960 (Maxwell 2) GPU. I was able to install and test OpenCL, but encoding VP9 with your libvpx version compiled into ffmpeg crashed with errors.
Sorry for asking; I am not enough of a coder to debug this on my own! :)