openvinotoolkit / openvino_contrib

Repository for OpenVINO's extra modules
Apache License 2.0

arm_plugin OpenCL support? #480

Closed mzwtjp closed 1 year ago

mzwtjp commented 1 year ago

I have an ARM target where OpenCL exposes a GPU, and I wonder what the shortest way of using it is, so that I can run my demos faster even without NCS2 (MYRIAD) devices. Currently arm_plugin (CPU) is the only option, but it is very slow compared to NCS2.

arm_plugin uses the ARM Compute Library (ACL), and ACL can be configured to use OpenCL (via the opencl=1 flag to its scons build command), but that does not mean arm_plugin takes advantage of OpenCL. As far as I can see, the arm_plugin code would need to be modified to use OpenCL?

What puzzles me now is whether making it use OpenCL would take a lot of effort, and whether it is worth it. (If it can be done, how?)

Alternatively, I can see that the intel_gpu plugin uses OpenCL (with Intel platform-specific extensions). If I port it to ARM, would my goal (using OpenCL for inference on ARM) be achieved? (If it can be done, how, and with how much effort?)

If anyone has suggestions or comments, I would really appreciate them.

Also, if there is any alternative to these (fixing arm_plugin or porting intel_gpu), I would like to hear about it.

vladimir-paramuzov commented 1 year ago

Hi @mzwtjp,

The intel_gpu plugin may indeed be used for other GPUs with OCL support. Here you can find a very early draft of a patch that introduces more careful usage of Intel extensions: https://github.com/openvinotoolkit/openvino/pull/13926

This patch introduces 2 key things:

Essentially, further work will require checking the existing kernels one by one, implementing proper extension checks, and optionally doing some minor refactoring of the kernel code to enable emulation.
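As a rough illustration of the extension checks mentioned above, here is a minimal sketch that splits kernel features into those a device supports natively and those that would need emulation. It assumes the device's extension list is available as the space-separated string that OpenCL reports for `CL_DEVICE_EXTENSIONS`. The helper name `classify_features`, the feature table, and the sample extension string are hypothetical and are not taken from the intel_gpu plugin or the linked patch.

```python
# Hypothetical helper: the feature names and this table are illustrative only,
# not the intel_gpu plugin's actual logic.

# Map a kernel feature to the OpenCL extensions that can provide it natively.
FEATURE_EXTENSIONS = {
    "subgroups": ("cl_khr_subgroups", "cl_intel_subgroups"),
    "fp16": ("cl_khr_fp16",),
    "subgroup_block_io": ("cl_intel_subgroups",),
}


def classify_features(extension_string):
    """Return (native, emulate): features the device provides vs. ones to emulate."""
    available = set(extension_string.split())
    native, emulate = [], []
    for feature, providers in FEATURE_EXTENSIONS.items():
        if any(ext in available for ext in providers):
            native.append(feature)
        else:
            emulate.append(feature)
    return sorted(native), sorted(emulate)


# Made-up extension string for a non-Intel GPU, for illustration only.
arm_like = "cl_khr_fp16 cl_khr_global_int32_base_atomics cl_arm_printf"
native, emulate = classify_features(arm_like)
print("native:", native)
print("emulate:", emulate)
```

Each feature reported under `emulate` would correspond to kernels that need either a fallback code path or a guard that disables them on that device.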

We haven't tried to run anything on ARM GPUs yet, so we'd really appreciate any feedback and contributions.

mzwtjp commented 1 year ago

Thank you very much for your comment.

I tried building intel_gpu for my ARM target and succeeded in getting it recognized by the inference engine (it is listed in the available devices list). I can see that the underlying OpenCL code is actually called.

I learned that IE reads plugins.xml and runs checks on each plugin listed.

Of course, I am seeing errors related to reorder, implicit declaration of get_sub_group_local_id, etc. when I try to run inference. (I just commented out or no-op'ed some parts of the plugin code to make it compile, so this is expected.)

Obviously I need to look into each of these, and others, to make it work properly. I currently lack knowledge of OpenCL kernels, so it may take time.
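To make the plugins.xml step concrete, here is a minimal sketch that reads a plugins.xml-style file and lists the device name and library registered for each plugin. The sample XML is an assumption about a typical layout, and the library file names in it are illustrative; check the plugins.xml that your own build produces.

```python
import xml.etree.ElementTree as ET

# Illustrative content only: the library file names are assumptions.
SAMPLE_PLUGINS_XML = """
<ie>
    <plugins>
        <plugin name="CPU" location="libopenvino_arm_cpu_plugin.so"/>
        <plugin name="GPU" location="libopenvino_intel_gpu_plugin.so"/>
    </plugins>
</ie>
"""


def list_registered_plugins(xml_text):
    """Map each device name to the plugin library registered for it."""
    root = ET.fromstring(xml_text.strip())
    return {p.get("name"): p.get("location") for p in root.iter("plugin")}


for device, library in list_registered_plugins(SAMPLE_PLUGINS_XML).items():
    print(device, "->", library)
```

A device missing from this mapping would not be expected to show up in the available devices list, which makes this a quick first check when a freshly built plugin is not recognized.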
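One way to triage errors like the implicit declaration of get_sub_group_local_id is to scan the kernel sources for subgroup built-ins and list which files use them. The sketch below is a hypothetical helper, not part of the plugin: the file names and kernel snippets are made up, and the list of built-ins is a small sample rather than a complete inventory.

```python
import re

# A small sample of subgroup built-ins; not an exhaustive list.
SUBGROUP_BUILTINS = (
    "get_sub_group_local_id",
    "get_sub_group_id",
    "sub_group_broadcast",
    "intel_sub_group_shuffle",
    "intel_sub_group_block_read",
)

# Match a built-in name, optionally followed by a suffix such as "_down" or "8",
# and then an opening parenthesis (a call site).
_CALL_PATTERN = re.compile(r"\b(" + "|".join(SUBGROUP_BUILTINS) + r")\w*\s*\(")

# Made-up kernel snippets keyed by hypothetical file names.
SAMPLE_KERNELS = {
    "reorder.cl": (
        "uint lid = get_sub_group_local_id();\n"
        "float v = intel_sub_group_shuffle(x, lid);"
    ),
    "eltwise.cl": "out[get_global_id(0)] = a[get_global_id(0)] + b[get_global_id(0)];",
}


def find_subgroup_usage(sources):
    """Return {file name: sorted built-ins used} for files that use any."""
    report = {}
    for name, text in sources.items():
        used = sorted(set(_CALL_PATTERN.findall(text)))
        if used:
            report[name] = used
    return report


for name, used in find_subgroup_usage(SAMPLE_KERNELS).items():
    print(name, "uses", ", ".join(used))
```

Files that appear in the report are the ones to check first for missing extension guards or emulation paths; files that do not appear are less likely to depend on subgroup support.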

Closing this for now, since I have at least one direction.

ilya-lavrenov commented 1 year ago

@mzwtjp please feel free to submit changes to the GPU OpenCL plugin if you manage to fix the issues you've described.