We should have a targeted semaphore test that mixes semaphores, barriers, and out-of-order queues, since this is likely to be a common pattern for applications using semaphores in an out-of-order queue.
The rough flow will look something like:
// Producers:
// A producer operation can be any enqueue operation, such as a kernel or a memory fill.
clEnqueueXXX(producer_queue, ...);
// There can be more than one independent producer operation.
clEnqueueXXX(producer_queue, ...);
// This barrier requires all producer operations to be complete:
clEnqueueBarrierWithWaitList(producer_queue);
// The semaphore cannot be signaled until the barrier is complete:
clEnqueueSignalSemaphoresKHR(producer_queue, semaphore);
// Note: the consumer_queue can be a separate queue or the same as the producer_queue!
// If the consumer_queue and the producer_queue are the same then there needs to be
// an explicit dependency between the semaphore signal and the semaphore wait, which
// could be an event dependency or another barrier.
// Consumers:
// Wait for the producers to signal completion via the semaphore:
clEnqueueWaitSemaphoresKHR(consumer_queue, semaphore);
// This barrier ensures that no consumer can start until the semaphore wait is complete:
clEnqueueBarrierWithWaitList(consumer_queue);
// Now consumers can execute, and should see the producer results.
clEnqueueXXX(consumer_queue, ...);
clEnqueueXXX(consumer_queue, ...);
Additional notes:
Suggest testing multiple independent producer and consumer operations, two each should be fine.
Suggest testing the same out-of-order queue for both producers and consumers, and separate producer and consumer out-of-order queues.
For the same out-of-order queue test, it's probably fine to test either a barrier or an explicit event dependency between the semaphore signal and wait.
See related OpenCL spec issue discussion: https://github.com/KhronosGroup/OpenCL-Docs/issues/1178#issuecomment-2148210553
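To make the intended ordering concrete, here is a minimal sketch in plain C that models the flow above as a dependency graph and checks that every consumer is transitively ordered after every producer. It does not call OpenCL. The names `flow_t`, `build_flow`, `depends_on`, and `consumers_ordered_after_producers` are hypothetical helpers invented for this illustration. The model assumes the semaphore wait cannot complete before the semaphore signal, by whatever mechanism provides that (the semaphore itself for separate queues, plus the explicit event or barrier dependency for the same-queue case).

```c
#include <stdbool.h>

/* Commands in the flow, listed in enqueue order (two producers, two consumers). */
enum { P0, P1, BARRIER_P, SIGNAL, WAIT, BARRIER_C, C0, C1, NUM_CMDS };

/* deps[i] is a bitmask of the commands that must complete before command i starts. */
typedef struct {
    unsigned deps[NUM_CMDS];
} flow_t;

static unsigned bit(int cmd) { return 1u << cmd; }

/* Build the dependency edges described in the flow. In an out-of-order queue
 * nothing is ordered unless a barrier, an event, or a semaphore says so.
 * Passing false models a flow that omits the consumer-side barrier. */
static flow_t build_flow(bool with_consumer_barrier)
{
    flow_t f = { { 0 } };
    f.deps[BARRIER_P] = bit(P0) | bit(P1); /* barrier waits for all producers */
    f.deps[SIGNAL] = bit(BARRIER_P);       /* signal waits for the barrier */
    f.deps[WAIT] = bit(SIGNAL);            /* assumption: wait completes after signal */
    if (with_consumer_barrier) {
        f.deps[BARRIER_C] = bit(WAIT);     /* barrier waits for the semaphore wait */
        f.deps[C0] = bit(BARRIER_C);       /* consumers wait for the barrier */
        f.deps[C1] = bit(BARRIER_C);
    }
    return f;
}

/* True if `later` depends on `earlier`, directly or transitively.
 * Terminates because every edge points to an earlier-enqueued command. */
static bool depends_on(const flow_t *f, int later, int earlier)
{
    unsigned direct = f->deps[later];
    if (direct & bit(earlier)) return true;
    for (int c = 0; c < NUM_CMDS; ++c) {
        if ((direct & bit(c)) && depends_on(f, c, earlier)) return true;
    }
    return false;
}

/* The property the test should observe: every consumer sees every producer's results. */
static bool consumers_ordered_after_producers(const flow_t *f)
{
    return depends_on(f, C0, P0) && depends_on(f, C0, P1)
        && depends_on(f, C1, P0) && depends_on(f, C1, P1);
}
```

In this model, `consumers_ordered_after_producers` holds for `build_flow(true)` and fails for `build_flow(false)`, which is the reason the consumer-side barrier appears in the flow: without it, an out-of-order queue is free to start the consumers before the semaphore wait completes.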