Closed hominhquan closed 9 years ago
Hi,
By design, the clBLAS API is asynchronous. As clBLAS.h shows, it is up to the user to pass an event pointer to the library, and therefore up to the user to wait on that event after the API call if needed.
When kernels are enqueued in sequence on the same command queue and "CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE" was not set at clCreateCommandQueue (https://www.khronos.org/registry/cl/sdk/1.2/docs/man/xhtml/clCreateCommandQueue.html), each kernel waits until the previously enqueued kernels have finished before it executes.
Hi, Thanks for your response. Topic closed
Hi, In solution_seq.c::enqueueKernel(), lines 186-188, nowait is set to 1 and needExecTime to 0. In clkern.c::launchClKernel(), lines 104-120, there are two if-conditions that guard the calls to clWaitForEvents().
The problem is that the (nowait, needExecTime) values set above seem to prevent execution from entering either of these two if-conditions, so the event is never waited on, which may cause a race condition.
Furthermore, I don't quite understand the semantics of 'nowait' here: when we run kernels in sequence, don't they sometimes (even often) need to be synchronized so that results are updated?
Regards, Quan
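The control flow described in the question can be modelled in a few lines of C. This is only a sketch and not the clBLAS source: the struct, the function name, and above all the two conditions (`!nowait` and `needExecTime`) are assumptions inferred from the description of launchClKernel() above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two flags discussed in the thread. */
typedef struct {
    bool nowait;        /* caller asked the launch not to block        */
    bool needExecTime;  /* caller asked for the kernel execution time  */
} LaunchFlags;

/*
 * Returns true if the launch path itself would wait on the kernel event.
 * ASSUMPTION: one branch is guarded by !nowait and the other by
 * needExecTime; the real conditions in clkern.c may differ.
 */
bool launch_waits_on_event(LaunchFlags f)
{
    bool waited = false;

    if (!f.nowait) {
        waited = true;  /* where clWaitForEvents() would be called */
    }
    if (f.needExecTime) {
        waited = true;  /* timing needs a finished kernel, so wait here too */
    }
    return waited;
}
```

Under these assumptions, nowait = 1 with needExecTime = 0 takes neither branch, so the launch returns without waiting. Synchronization is then left to the caller's event or to in-order queue semantics, as the reply in this thread explains.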