doe300 / VC4CL

OpenCL implementation running on the VideoCore IV GPU of the Raspberry Pi models
MIT License

Error on RPI3 when running OpenCV4 DNN module #59

Closed pnduffy closed 5 years ago

pnduffy commented 5 years ago

I compiled and installed VC4CL on a system with OpenCV 4.0.1, Qt5 and the LLVM 3.9 packages installed. When I run my program I get this error:

[ INFO:0] Initialize OpenCL runtime...
OpenCV(ocl4dnn): consider to specify kernel configuration cache directory via OPENCV_OCL4DNN_CONFIG_PATH parameter.
[ INFO:0] Successfully initialized OpenCL cache directory: /root/.cache/opencv/4.0/opencl_cache/
[ INFO:0] Preparing OpenCL cache configuration for context: 32-bit--Broadcom--VideoCore_IV_GPU--0_4
OpenCL program build log: dnn/dummy
Status -15: CL_COMPILE_PROGRAM_FAILURE
-cl-no-subgroup-ifp
[E] Sat Mar 9 14:55:07 2019: Errors in precompilation:
[E] Sat Mar 9 14:55:07 2019: error: unknown argument: '-cl-no-subgroup-ifp'

OpenCL program build log: dnn/conv_layer_spatial
Status -15: CL_COMPILE_PROGRAM_FAILURE
-D TYPE=1 -D Dtype=float -D Dtype2=float2 -D Dtype4=float4 -D Dtype8=float8 -D Dtype16=float16 -D as_Dtype=as_float -D as_Dtype2=as_float2 -D as_Dtype4=as_float4 -D as_Dtype8=as_float8 -D KERNEL_WIDTH=3 -D KERNEL_HEIGHT=3 -D STRIDE_X=2 -D STRIDE_Y=2 -D DILATION_X=1 -D DILATION_Y=1 -D KERNEL_BASIC -cl-fast-relaxed-math -D ConvolveBasic=BASIC_k3x3_cn3_g1_s2x2_d1x1_b1_in256x256_p1x1_num1_M32_activ1_eltwise0_FP32_4_1_1_1 -D CHANNELS=3 -D APPLY_BIAS=1 -D OUTPUT_Z=32 -D ZPAR=1 -D FUSED_CONV_RELU=1
[W] Sat Mar 9 14:55:09 2019: Warnings in precompilation:
[W] Sat Mar 9 14:55:09 2019: :1484:1: warning: null character ignored
<U+0000>
^
1 warning generated.

[E] Sat Mar 9 14:55:09 2019: Errors in precompilation:
[E] Sat Mar 9 14:55:09 2019: ERROR: Invalid value (Producer: 'LLVM6.0.0svn' Reader: 'LLVM 3.9.1')
/usr/bin/llvm-link: /usr/local/include/vc4cl-stdlib/VC4CLStdLib.bc: error: Corrupted bitcode
/usr/bin/llvm-link: error loading file '/usr/local/include/vc4cl-stdlib/VC4CLStdLib.bc'

Failed to compile kernel: BASIC_k3x3_cn3_g1_s2x2_d1x1_b1_in256x256_p1x1_num1_M32_activ1_eltwise0_FP32_4_1_1_1, buildflags: -D TYPE=1 -D Dtype=float -D Dtype2=float2 -D Dtype4=float4 -D Dtype8=float8 -D Dtype16=float16 -D as_Dtype=as_float -D as_Dtype2=as_float2 -D as_Dtype4=as_float4 -D as_Dtype8=as_float8 -D KERNEL_WIDTH=3 -D KERNEL_HEIGHT=3 -D STRIDE_X=2 -D STRIDE_Y=2 -D DILATION_X=1 -D DILATION_Y=1 -D KERNEL_BASIC -cl-fast-relaxed-math -D ConvolveBasic=BASIC_k3x3_cn3_g1_s2x2_d1x1_b1_in256x256_p1x1_num1_M32_activ1_eltwise0_FP32_4_1_1_1 -D CHANNELS=3 -D APPLY_BIAS=1 -D OUTPUT_Z=32 -D ZPAR=1 -D FUSED_CONV_RELU=1, errmsg:
[W] Sat Mar 9 14:55:09 2019: Warnings in precompilation:
[W] Sat Mar 9 14:55:09 2019: :1484:1: warning: null character ignored
<U+0000>
^
1 warning generated.

[E] Sat Mar 9 14:55:09 2019: Errors in precompilation:
[E] Sat Mar 9 14:55:09 2019: ERROR: Invalid value (Producer: 'LLVM6.0.0svn' Reader: 'LLVM 3.9.1')
/usr/bin/llvm-link: /usr/local/include/vc4cl-stdlib/VC4CLStdLib.bc: error: Corrupted bitcode
/usr/bin/llvm-link: error loading file '/usr/local/include/vc4cl-stdlib/VC4CLStdLib.bc'

There appear to be two issues: the unknown build argument -cl-no-subgroup-ifp, and the 'Corrupted bitcode' error for the file VC4CLStdLib.bc.

Can you advise on how to fix this?

pnduffy commented 5 years ago

I was able to get rid of the error by regenerating the .bc file (reinstalling VC4CL with 'make install'), and I rebuilt the OpenCV DNN module with the unsupported command-line arguments removed. The library now runs.
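
To illustrate the second part of that workaround, here is a rough Python sketch of filtering rejected flags out of an OpenCL build-option string before it reaches the compiler. This is not what was actually changed in OpenCV (there the arguments were removed from the DNN module source and the module rebuilt); the helper name and the UNSUPPORTED set are assumptions for illustration, seeded with the one flag the log shows being rejected.

```python
# Flags the log in this issue shows the VC4CL precompiler rejecting.
# Hypothetical list for illustration; extend it if other flags fail.
UNSUPPORTED = {"-cl-no-subgroup-ifp"}


def strip_unsupported(build_options, unsupported=UNSUPPORTED):
    """Drop rejected flags from a space-separated OpenCL build-option string."""
    kept = [tok for tok in build_options.split() if tok not in unsupported]
    return " ".join(kept)


if __name__ == "__main__":
    opts = "-cl-no-subgroup-ifp -D TYPE=1 -cl-fast-relaxed-math"
    print(strip_unsupported(opts))
```

Splitting on whitespace keeps pairs such as `-D TYPE=1` intact because both tokens are retained in order; it would not handle quoted values containing spaces.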

Favi0 commented 5 years ago

Did you manage to run OpenCL as the backend for DNN inference?

pnduffy commented 5 years ago

@Favi0, I was able to get VC4CL to compile and run the DNN kernels, but it crashes randomly. So at this point, no, it's not working, but I'm investigating.

Favi0 commented 5 years ago

> @Favi0, I was able to get VC4CL to compile and run the DNN kernels, but it crashes randomly. So at this point, no, it's not working, but I'm investigating.

Oh well, not bad then. It's a start. Once they complete the full profile it might work :)