Open zeratax opened 4 years ago
you should consider #47, since both are async.
Device device;
CudaStream Sone, Stwo, Sthree;
CudaStream streams[] = {Sone, Stwo, Sthree};
Kernel kernels[] = {kernel1, kernel2, kernel3};
for (size_t i{0}; i < 3; ++i) {
    kernels[i].queueupload(args, device, streams[i]);
    kernels[i].queuelaunch(args, device, streams[i]); // I guess args could be implicitly known here?
    kernels[i].queuedownload(device, streams[i]);
}
// non-blocking: the CPU can keep working while the GPU is busy (but not during download and upload??)
kernel.sync(); // blocking: the GPU is done after this
This should be equivalent to async version 1.
I'm not sure to what extent upload and download actually need the device; we need to be more explicit about the context here.
more info: https://devblogs.nvidia.com/how-overlap-data-transfers-cuda-cc/
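For reference, here is a minimal sketch of the raw CUDA pattern from that post that the proposed API above would wrap. The kernel, sizes and stream count are placeholders, not part of our API:

#include <cuda_runtime.h>

__global__ void myKernel(float* d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= 2.0f;
}

int main() {
    const int nStreams = 3;
    const int N = 1 << 20;                       // elements per stream
    const size_t bytes = N * sizeof(float);

    float *h = nullptr, *d = nullptr;
    cudaMallocHost((void**)&h, nStreams * bytes); // pinned host memory, needed for real overlap
    cudaMalloc((void**)&d, nStreams * bytes);

    cudaStream_t streams[nStreams];
    for (int i = 0; i < nStreams; ++i) cudaStreamCreate(&streams[i]);

    for (int i = 0; i < nStreams; ++i) {
        float* hp = h + i * N;
        float* dp = d + i * N;
        cudaMemcpyAsync(dp, hp, bytes, cudaMemcpyHostToDevice, streams[i]); // queue upload
        myKernel<<<(N + 255) / 256, 256, 0, streams[i]>>>(dp, N);           // queue launch
        cudaMemcpyAsync(hp, dp, bytes, cudaMemcpyDeviceToHost, streams[i]); // queue download
    }
    // CPU is free to do other work here.
    cudaDeviceSynchronize();                      // blocking: GPU done after this

    for (int i = 0; i < nStreams; ++i) cudaStreamDestroy(streams[i]);
    cudaFree(d);
    cudaFreeHost(h);
    return 0;
}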
I've started experimenting with streams a bit on the branch. With that, everything is executed asynchronously on a single stream. But somehow the upload and download operations are still performed synchronously (I don't really know why), so it doesn't actually gain us much.
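One likely cause (an assumption, I haven't looked at the branch in detail): if the upload/download go through plain cudaMemcpy, or through cudaMemcpyAsync on pageable host memory, the copy behaves (mostly) synchronously; overlap with compute needs page-locked (pinned) host memory. A small sketch of the difference:

#include <cuda_runtime.h>
#include <vector>

// Sketch only: contrast of pageable vs pinned host buffers for async copies.
void upload_examples(float* d_buf, size_t n, cudaStream_t stream) {
    // Pageable host memory: cudaMemcpyAsync goes through a staging buffer and
    // is not guaranteed to overlap with kernels running in other streams.
    std::vector<float> pageable(n);
    cudaMemcpyAsync(d_buf, pageable.data(), n * sizeof(float),
                    cudaMemcpyHostToDevice, stream);

    // Pinned (page-locked) host memory: the copy is truly asynchronous and
    // can overlap with compute on other streams.
    float* pinned = nullptr;
    cudaMallocHost((void**)&pinned, n * sizeof(float));
    cudaMemcpyAsync(d_buf, pinned, n * sizeof(float),
                    cudaMemcpyHostToDevice, stream);

    cudaStreamSynchronize(stream);
    cudaFreeHost(pinned);
}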
Streams to execute kernels asynchronously. Device should probably take care of the context?
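A rough sketch of what "Device takes care of the context" could look like; this is purely a design assumption, the class and method names are made up, only the CUDA runtime calls are real:

#include <cuda_runtime.h>
#include <vector>

// Design sketch (assumption, not the actual API): Device selects its device ID
// and owns the streams created on it, so callers never touch the context.
class Device {
public:
    explicit Device(int id) : id_(id) { cudaSetDevice(id_); }

    ~Device() {
        for (cudaStream_t s : streams_) cudaStreamDestroy(s);
    }

    // Hand out streams bound to this device.
    cudaStream_t createStream() {
        cudaSetDevice(id_);            // make sure we are on this device
        cudaStream_t s;
        cudaStreamCreate(&s);
        streams_.push_back(s);
        return s;
    }

    // Blocking: wait for all queued work on this device.
    void sync() {
        cudaSetDevice(id_);
        cudaDeviceSynchronize();
    }

private:
    int id_;
    std::vector<cudaStream_t> streams_;
};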