intel / intel-graphics-compiler

OpenCL compiler issues for Intel 11th gen CPU? #195

Closed vm3tr1c1 closed 3 years ago

vm3tr1c1 commented 3 years ago

When running an OpenCV application that uses the ocl4dnn module, many 'simd size' errors are generated, such as these:

OpenCV(ocl4dnn): The OpenCL compiler chose a simd size (16) that does not equal the size (8) kernel source required. Skip this kernel U_GEMM_LIKE_CONV_k3x3_cn3_g1_s2x2_d1x1_b1_in256x256_p1x1_num1_M32_activ1_eltwise0_FP32_5_1_8_32_SIMD8
OpenCV(ocl4dnn): The OpenCL compiler chose a simd size (32) that does not equal the size (16) kernel source required. Skip this kernel IDLF_k3x3_cn3_g1_s2x2_d1x1_b1_in256x256_p1x1_num1_M32_activ1_eltwise0_FP32_2_7_3_1_SIMD16

If the CPU is replaced with an i5-10600, without changing anything else, the errors disappear and the GPU code runs without any issue at all.

alalek commented 3 years ago

BTW, OpenCL kernel attributes for the second case are here. The host-side check is based on a clGetKernelWorkGroupInfo() call with the CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE argument.
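A rough sketch of the kind of strict host-side check described here, for illustration only. This is not OpenCV's actual code: `strictSimdCheck` and `PreferredMultipleQuery` are hypothetical names, and the callback stands in for the real `clGetKernelWorkGroupInfo()` query so that the snippet compiles without an OpenCL SDK.

```cpp
#include <cstddef>
#include <functional>
#include <iostream>
#include <string>

// Hypothetical stand-in for the value the host would obtain from
// clGetKernelWorkGroupInfo(kernel, device,
//     CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE,
//     sizeof(size_t), &value, nullptr).
using PreferredMultipleQuery = std::function<std::size_t()>;

// Strict policy: the kernel is accepted only if the reported multiple
// equals the SIMD size the kernel source was written for.
// Returns true if the kernel may be used, false if it is skipped.
bool strictSimdCheck(const std::string& kernelName,
                     std::size_t requiredSimd,
                     const PreferredMultipleQuery& query)
{
    const std::size_t reported = query();
    if (reported != requiredSimd) {
        std::cerr << "The OpenCL compiler chose a simd size (" << reported
                  << ") that does not equal the size (" << requiredSimd
                  << ") kernel source required. Skip this kernel "
                  << kernelName << "\n";
        return false;
    }
    return true;
}
```

With a query that reports 16 for a kernel written for SIMD8, this sketch rejects the kernel and prints a message of the same shape as the ones in the report.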

@vm3tr1c1 Do you use the OPENCV_OCL4DNN_FORCE_AUTO_TUNING=1 environment variable?

vm3tr1c1 commented 3 years ago

@vm3tr1c1 Do you use the OPENCV_OCL4DNN_FORCE_AUTO_TUNING=1 environment variable?

I don't, but it doesn't seem to make any difference if I use it.

vm3tr1c1 commented 3 years ago

I'm not an expert in OpenCL, but please let me know how I can help (testing, debugging, whatever).

JacekDanecki commented 3 years ago

It looks like a problem on the IGC side, so I'm moving the issue to https://github.com/intel/intel-graphics-compiler

alalek commented 3 years ago

I can confirm similar behavior on an i7-11700K on Win10 with the 27.20.100.9127 driver. I will try to prepare a minimal reproducer for this problem.

alalek commented 3 years ago

Investigation shows that there is an incorrect condition in the OpenCV code. We should not use the clGetKernelWorkGroupInfo(CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE) check in such a strict way. The value is a hint, and it should not block kernel execution. "Unblocked" kernels work fine without regressions.
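To illustrate the difference between the two interpretations, here is a minimal sketch contrasting a strict comparison with a hint-only policy. It is not the actual OpenCV patch (that is tracked in the linked OpenCV issue); `SimdPolicy` and `shouldRunKernel` are hypothetical names, and the reported value is assumed to come from a `CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE` query made elsewhere.

```cpp
#include <cstddef>

// Hypothetical policy switch for how the host interprets the value
// reported for CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE.
enum class SimdPolicy {
    Strict,   // a mismatch blocks the kernel (the behavior that caused the errors)
    HintOnly  // a mismatch is tolerated; the value is only a performance hint
};

// Decides whether the host should go ahead and run the kernel.
// reportedMultiple: value obtained from the OpenCL runtime (assumed input).
// requiredSimd:     SIMD size the kernel source was written for.
bool shouldRunKernel(std::size_t reportedMultiple,
                     std::size_t requiredSimd,
                     SimdPolicy policy)
{
    if (policy == SimdPolicy::HintOnly) {
        // The reported multiple never blocks execution under this policy.
        return true;
    }
    return reportedMultiple == requiredSimd;
}
```

Under `SimdPolicy::HintOnly`, the two cases from the report (16 vs. 8 and 32 vs. 16) are both allowed to run, whereas `SimdPolicy::Strict` skips them.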

This issue can be closed (there is no problem in the IGC compiler). Related OpenCV fixes can be tracked here: https://github.com/opencv/opencv/issues/20559

vm3tr1c1 commented 3 years ago

@alalek I can confirm that the OpenCV patch resolves the problem. Thank you!