ROCm / MIVisionX

MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.
https://rocm.docs.amd.com/projects/MIVisionX/en/latest/
MIT License

OpenVX Framework - Kernel execution on CPU #173

Closed bogdanul2003 closed 1 month ago

bogdanul2003 commented 4 years ago

Hi,

I was going through the code to understand the implementation a bit, in particular how kernels get executed in parallel on the CPU when the graph has nodes that can run in parallel. Am I wrong, or do all nodes/kernels of a graph get executed serially on a single core? At least that is what I understand from looking at the agoExecuteGraph() function. Maybe the situation is different for OpenCL.

kiritigowda commented 4 years ago

@bogdanul2003 the nodes in the graph execute serially on a single core. The nodes themselves can use the available cores to execute parallel computation. OpenCL nodes occupy the required number of CUs when launched on a GPU.

rrawther commented 4 years ago

@bogdanul2003 just to add to what @kiritigowda said: multiple sub_graphs can be created to run on different cores. OpenCL always uses parallel threads on the GPU.

bogdanul2003 commented 4 years ago

One thing to mention: at the moment my workload is CPU-only. If I understand correctly, I need to compile the framework with OpenCL support to get nodes executed in parallel on different CPU cores? I forgot to mention that my question was more about the case where you compile without OpenCL support. @rrawther, is it possible to run sub_graphs on different cores without OpenCL? I couldn't figure out who decides which sub_graphs can be executed on different cores.

rrawther commented 4 years ago

@bogdanul2003: Currently the OpenCL implementation targets the GPU only. We don't have an OpenCL implementation that executes in parallel on different CPU cores. Are you running on Windows or Linux? We have multithreading support on Windows, assuming you have created separate graphs for the nodes that have to run in parallel. Because of data dependencies, most OpenVX graphs are executed sequentially in our current implementation.

bogdanul2003 commented 4 years ago

Thanks @rrawther. I thought the framework could figure out which nodes can be executed in parallel depending on how you build your graph. Do you plan to add this feature to the framework for CPU-only workloads? Do you know if other implementations of OpenVX (Nvidia or Intel) offer this kind of optimization?

rrawther commented 4 years ago

@bogdanul2003 Once the nodes are submitted to the GPU, they can run in parallel provided there is no data dependency. The OpenVX framework checks whether the input data is ready before a node is executed. We don't have much insight into the Intel or NVIDIA implementations, but we will be adding enhancements to our implementation in the future.
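
For readers following the thread, the behaviour described so far (nodes run one at a time, and a node runs only once its input data is ready) can be pictured with a toy scheduler. This is a minimal sketch and not the MIVisionX code: `ToyNode` and `run_serially` are hypothetical names invented for illustration, and the real logic in agoExecuteGraph() may be organized quite differently.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for a graph node. Hypothetical; not an OpenVX or MIVisionX type.
struct ToyNode {
    std::string name;
    std::vector<std::size_t> inputs;  // indices of nodes whose output this node consumes
    std::function<void()> kernel;     // the work the node performs
};

// Runs every node on the calling thread, one at a time.
// A node runs only after all nodes it depends on have finished.
// Returns node names in execution order; stops early if no node
// can make progress (for example, a dependency cycle or a bad index).
std::vector<std::string> run_serially(const std::vector<ToyNode>& nodes) {
    std::vector<bool> done(nodes.size(), false);
    std::vector<std::string> order;
    bool progressed = true;
    while (order.size() < nodes.size() && progressed) {
        progressed = false;
        for (std::size_t i = 0; i < nodes.size(); ++i) {
            if (done[i]) continue;
            bool ready = true;
            for (std::size_t dep : nodes[i].inputs) {
                // An out-of-range index counts as "never ready".
                if (dep >= nodes.size() || !done[dep]) { ready = false; break; }
            }
            if (!ready) continue;
            if (nodes[i].kernel) nodes[i].kernel();  // kernel runs on this thread
            done[i] = true;
            order.push_back(nodes[i].name);
            progressed = true;
        }
    }
    return order;
}
```

In this model two nodes with no dependency between them still run back to back on the same thread, which matches the serial CPU execution described in the replies above.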

bogdanul2003 commented 4 years ago

@rrawther thanks for the clarification. Can we keep this ticket open until this feature is added?

rrawther commented 1 month ago

@kiritigowda: Can we close this issue, as we are not going to support parallel CPU node execution in the OpenVX framework?

rrawther commented 1 month ago

Closing this since we won't be adding support for this feature in the ROCm OpenVX framework. Users can create multiple OpenVX graphs on multiple threads to achieve parallelism on CPU nodes.
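
A rough sketch of the suggested workaround, one graph per thread, might look like the following. It uses a hypothetical `ToyGraph` stand-in instead of real OpenVX objects so that it compiles without the library. In real code each thread would own an OpenVX graph and call `vxProcessGraph` on it. Whether graphs on different threads may safely share one context or data objects in MIVisionX is not stated in this thread and should be checked before relying on it.

```cpp
#include <functional>
#include <thread>
#include <vector>

// Toy stand-in for one OpenVX graph. Hypothetical; not an MIVisionX type.
// process() runs the graph's nodes serially, as described in this thread.
struct ToyGraph {
    std::vector<std::function<void()>> nodes;
    void process() const {
        for (const auto& node : nodes) node();
    }
};

// Processes each graph on its own thread and waits for all of them.
// The graphs must not touch the same data: nothing here synchronizes them.
void process_in_parallel(const std::vector<ToyGraph>& graphs) {
    std::vector<std::thread> workers;
    workers.reserve(graphs.size());
    for (const ToyGraph& g : graphs) {
        // With real OpenVX, this is where the thread would call
        // vxProcessGraph on the graph it owns.
        workers.emplace_back([&g] { g.process(); });
    }
    for (std::thread& w : workers) w.join();
}
```

The split into graphs is the caller's decision: nodes that depend on each other's output belong in the same graph, and only groups with no shared data are candidates for separate threads.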