Closed antonmyagkov closed 5 years ago
Quite pervasive, but it looks good.
Wouldn't it be better if it was done only for copy_async? Because, honestly, what's the help if you use copy, which does not give you an event, with an out-of-order queue?
It still helps. If you have an OoO queue, you might still want a blocking write operation ("copy back to device after you are done computing"), but it has to wait until all relevant operations are done. This is especially interesting in a multi-threaded context where each thread enqueues items independently into the same queue to ensure that the device always has enough to work on. In this case, explicitly synchronizing the whole queue is harmful to performance; a synchronous copy is not.
From: Jakub Szuppe [notifications@github.com] Sent: Wednesday, October 24, 2018 6:51 AM To: boostorg/compute Cc: Subscribed Subject: Re: [boostorg/compute] Added wait_list parameter to "copy" functions (#797)
Right, I just don't want to run into a situation where the async API is done in multiple different ways in Boost.Compute. That's why I am wondering if it's maybe better to have only the *_async functions work with OoO queues for now. However, I'm not saying 'no' to this change. @kylelutz What do you think?
Anyway, I would need to revive our CI and test it on a few platforms (I think full coverage is: Mac, NV, AMD, Intel, and POCL; IMHO can be limited to 1.2) to make sure it's all right. If anyone can help with that I'll be grateful.
Without a wait_list, synchronous operations are impossible to use safely in an OoO context. In OpenCL, all operations except copies are asynchronous, and synchronous copies have to wait until those operations are done. This detail is abstracted away in an in-order queue, since there is an implicit ordering of events (i.e. wait_list = all currently unfinished operations). In an OoO queue we have to make this ordering explicit; otherwise the copy takes place as soon as the memory bus is free.
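To make the ordering argument concrete, here is a small toy model in plain C++. It is not Boost.Compute or OpenCL code: `ToyOutOfOrderQueue`, its `Event` type, and `kernel_then_copy` are hypothetical names invented for this sketch, and the scheduler deliberately runs the newest ready command first so that the hazard shows up deterministically. A real out-of-order queue may pick any order the wait lists allow.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Toy model only (hypothetical, not the Boost.Compute or OpenCL API):
// a queue that may run commands in any order permitted by their wait lists.
class ToyOutOfOrderQueue {
public:
    using Event = std::size_t;  // index of the command that signals the event

    // Enqueue a command; it may only start once every event in wait_list is done.
    Event enqueue(std::function<void()> work, std::vector<Event> wait_list = {}) {
        commands_.push_back({std::move(work), std::move(wait_list), false});
        return commands_.size() - 1;
    }

    // Run all commands, always choosing the newest command whose wait list
    // is satisfied (an adversarial but legal out-of-order schedule).
    void finish() {
        bool progressed = true;
        while (progressed) {
            progressed = false;
            for (std::size_t i = commands_.size(); i-- > 0;) {
                Command& c = commands_[i];
                if (c.done || !ready(c)) continue;
                c.work();
                c.done = true;
                progressed = true;
                break;
            }
        }
    }

private:
    struct Command {
        std::function<void()> work;
        std::vector<Event> wait_list;
        bool done;
    };

    bool ready(const Command& c) const {
        for (Event e : c.wait_list) {
            if (!commands_[e].done) return false;
        }
        return true;
    }

    std::vector<Command> commands_;
};

// Enqueue a "kernel" that doubles the device buffer, then a "copy" to the host.
// With use_wait_list == false the copy carries no dependency on the kernel.
std::vector<int> kernel_then_copy(bool use_wait_list) {
    std::vector<int> device{1, 2, 3};
    std::vector<int> host;
    ToyOutOfOrderQueue queue;

    ToyOutOfOrderQueue::Event kernel_done =
        queue.enqueue([&] { for (int& x : device) x *= 2; });

    std::vector<ToyOutOfOrderQueue::Event> deps;
    if (use_wait_list) deps.push_back(kernel_done);
    queue.enqueue([&] { host = device; }, deps);

    queue.finish();
    return host;
}
```

In this model, `kernel_then_copy(false)` returns the stale values `{1, 2, 3}` because the copy is allowed to run before the kernel, while `kernel_then_copy(true)` returns `{2, 4, 6}`. An in-order queue behaves as if every command implicitly carried the second form's wait list.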
It's not possible to use the current "copy" functions with the CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE option set on the command queue, because the wait_list parameter is missing.