SystemsGenetics / ACE

Accelerated Computational Engine (ACE) is a GPU-enabled framework to simplify creation of GPU-capable applications
http://SystemsGenetics.github.io/ACE
GNU General Public License v2.0

wait() method for OpenCL::CommandQueue #87

Closed bentsherman closed 5 years ago

bentsherman commented 5 years ago

From a discussion I had with @4ctrl-alt-del: CUDA provides a function (cudaStreamSynchronize()) to wait on a stream, which is equivalent to waiting on all events emitted by the stream. Apparently OpenCL can do the same thing by emitting an event that waits for all events in the command queue. It would be useful to expose this feature through something like OpenCL::CommandQueue::wait(). Refer to Similarity::OpenCL::Worker and Similarity::CUDA::Worker in KINC for an example of how the code is simplified by waiting on a stream instead of waiting on every event.
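To make the idea concrete, here is a minimal sketch in plain C++ of an in-order queue where wait() works by enqueueing an empty marker command and waiting on that one event. It is a toy model only, not ACE code and not the OpenCL API: ToyCommandQueue, Event, waitFor() and sumWithQueueWait() are hypothetical names invented for illustration, and commands run lazily on the calling thread instead of on a device. In real OpenCL the corresponding calls would be clEnqueueMarkerWithWaitList() with an empty wait list followed by a wait on the returned event, or simply clFinish().

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for a device event: just a completion flag.
struct Event {
    bool complete = false;
};

// Hypothetical toy in-order command queue. Commands are stored and only
// executed, in enqueue order, when the caller waits on something.
class ToyCommandQueue {
public:
    // Enqueue a command and return the event that marks its completion.
    std::shared_ptr<Event> enqueue(std::function<void()> command) {
        auto event = std::make_shared<Event>();
        _pending.emplace_back(std::move(command), event);
        return event;
    }

    // Per-event wait: run pending commands in order until this event is done.
    void waitFor(const std::shared_ptr<Event>& event) {
        while (!event->complete && !_pending.empty()) {
            runNext();
        }
    }

    // Queue-level wait: enqueue an empty marker command and wait on it.
    // Because the queue is in-order, the marker completes only after
    // every command enqueued before it has completed.
    void wait() {
        waitFor(enqueue([] {}));
    }

    // Number of commands that have not run yet.
    std::size_t pending() const { return _pending.size(); }

private:
    // Execute the oldest pending command and flag its event as complete.
    void runNext() {
        auto item = std::move(_pending.front());
        _pending.pop_front();
        item.first();
        item.second->complete = true;
    }

    std::deque<std::pair<std::function<void()>, std::shared_ptr<Event>>> _pending;
};

// Usage: the caller keeps no events at all and issues one wait at the end.
int sumWithQueueWait() {
    ToyCommandQueue queue;
    int total = 0;
    for (int i = 1; i <= 3; ++i) {
        queue.enqueue([&total, i] { total += i; });
    }
    queue.wait();
    return total;
}
```

The point of the design is that the worker no longer has to collect and wait on one event per kernel launch or transfer; a single call at the end of the batch covers everything enqueued before it.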

4ctrl-alt-del commented 5 years ago

Added in commit 5288880262b22baf53cdd9cdd82185d2a26f71b6. You can do the honors of closing this if the added method is satisfactory to you.

bentsherman commented 5 years ago

Works great! It also cut the runtime of the OpenCL similarity code nearly in half for my small test. :O I wish I could profile the OpenCL code to see where the difference lies. It probably comes from the fact that kernel launches and memory transfers can now be overlapped safely, as in CUDA.