For numerical computing it would be interesting to schedule and keep track of CUDA kernels on Nvidia GPUs with an interface similar to the CPU parallel API.
The focus is on task parallelism and dataflow parallelism (task graphs). Data parallelism (parallelFor) should be handled inside the GPU kernel.
From this presentation, https://developer.download.nvidia.com/CUDA/training/StreamsAndConcurrencyWebinar.pdf, we can use CudaEvent for synchronizing concurrent kernels (note: there seems to be a typo in the code, it should be …).
At first glance, an event seems to be fired once all work enqueued in the stream before the event was recorded has completed.
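The cross-stream synchronization pattern from the webinar can be sketched as follows; this is a minimal illustration, not a proposed API, and the kernel names (kernelA, kernelB) and launch configuration are placeholders:

```cuda
#include <cuda_runtime.h>

__global__ void kernelA(float* x, int n) { /* produce data */ }
__global__ void kernelB(float* x, int n) { /* consume data */ }

int main() {
  cudaStream_t s1, s2;
  cudaEvent_t evt;
  cudaStreamCreate(&s1);
  cudaStreamCreate(&s2);
  // cudaEventDisableTiming makes the event cheaper when it is
  // only used for synchronization, not for timing.
  cudaEventCreateWithFlags(&evt, cudaEventDisableTiming);

  int n = 1 << 20;
  float* d;
  cudaMalloc(&d, n * sizeof(float));

  kernelA<<<256, 256, 0, s1>>>(d, n);
  cudaEventRecord(evt, s1);            // mark this point in stream s1
  cudaStreamWaitEvent(s2, evt, 0);     // s2 waits until evt has fired
  kernelB<<<256, 256, 0, s2>>>(d, n);  // guaranteed to run after kernelA

  cudaStreamSynchronize(s2);
  cudaFree(d);
  cudaEventDestroy(evt);
  cudaStreamDestroy(s1);
  cudaStreamDestroy(s2);
  return 0;
}
```

cudaStreamWaitEvent blocks only the waiting stream on the device, not the host thread, which is exactly the fork/join dependency edge a task-graph scheduler would need between kernels in different streams.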