For numerical computing it would be interesting to schedule and keep track of CUDA kernels on Nvidia GPUs with an interface similar to the CPU parallel API.
The focus is on task parallelism and dataflow parallelism (task graphs). Data parallelism (parallelFor) should be handled inside the GPU kernel.
From this presentation, https://developer.download.nvidia.com/CUDA/training/StreamsAndConcurrencyWebinar.pdf, we can use CudaEvent for synchronizing concurrent kernels (note: there seems to be a typo in the code, it should be …).
At first glance, an event seems to be fired once all work enqueued in the stream before the event was recorded has completed.
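The cross-stream synchronization pattern from the webinar can be sketched as follows; this is a minimal illustration, not a proposed API, and the kernel names (kernelA, kernelB) and launch configuration are placeholders:

```cuda
#include <cuda_runtime.h>

__global__ void kernelA(float* x, int n) { /* produce data */ }
__global__ void kernelB(float* x, int n) { /* consume data */ }

int main() {
  cudaStream_t s1, s2;
  cudaEvent_t evt;
  cudaStreamCreate(&s1);
  cudaStreamCreate(&s2);
  // cudaEventDisableTiming makes the event cheaper when it is
  // only used for synchronization, not for timing.
  cudaEventCreateWithFlags(&evt, cudaEventDisableTiming);

  int n = 1 << 20;
  float* d;
  cudaMalloc(&d, n * sizeof(float));

  kernelA<<<256, 256, 0, s1>>>(d, n);
  cudaEventRecord(evt, s1);            // mark this point in stream s1
  cudaStreamWaitEvent(s2, evt, 0);     // s2 waits until evt has fired
  kernelB<<<256, 256, 0, s2>>>(d, n);  // guaranteed to run after kernelA

  cudaStreamSynchronize(s2);
  cudaFree(d);
  cudaEventDestroy(evt);
  cudaStreamDestroy(s1);
  cudaStreamDestroy(s2);
  return 0;
}
```

cudaStreamWaitEvent blocks only the waiting stream on the device, not the host thread, which is exactly the fork/join dependency edge a task-graph scheduler would need between kernels in different streams.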