clMathLibraries / clBLAS

a software library containing BLAS functions written in OpenCL
Apache License 2.0

clkern.c::launchClKernel()::clWaitForEvents() never called #86

Closed by hominhquan 9 years ago

hominhquan commented 9 years ago

Hi. In solution_seq.c::enqueueKernel() (lines 186-188), nowait is set to 1 and needExecTime to 0. In clkern.c::launchClKernel() (lines 104-120), there are two if-conditions that lead to a call to clWaitForEvents().

The problem is that, with the (nowait, needExecTime) values set above, execution does not seem to enter either of these two if-conditions, so the event is never waited on, which may cause a race condition.

status = clEnqueueNDRangeKernel(...);
if ((status == CL_SUCCESS) && !kernDesc->nowait) {
    status = clWaitForEvents(1, kernDesc->event);
}
...
if ((status == CL_SUCCESS) && kernDesc->needExecTime && kernDesc->event) {
    if (kernDesc->nowait) {
        status = clWaitForEvents(1, kernDesc->event);
        ...
    }
}
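To make the branch logic easier to check, here is a minimal sketch in plain C that models only the two conditions quoted above. It assumes every OpenCL call returns CL_SUCCESS; the KernFlags struct and the waitCalls() helper are hypothetical names made up for illustration and are not part of clBLAS.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the fields of the kernel descriptor that the
 * quoted snippet tests. This is NOT the real clBLAS structure. */
typedef struct {
    bool nowait;        /* models kernDesc->nowait */
    bool needExecTime;  /* models kernDesc->needExecTime */
    bool hasEvent;      /* models kernDesc->event != NULL */
} KernFlags;

/* Counts how many times clWaitForEvents() would be reached in the quoted
 * code, assuming status == CL_SUCCESS at every check. */
static int waitCalls(KernFlags f)
{
    int calls = 0;

    /* First condition: wait right after enqueueing unless nowait is set. */
    if (!f.nowait) {
        calls++;
    }

    /* Second condition: wait before reading timing information, but only
     * if the first wait was skipped. */
    if (f.needExecTime && f.hasEvent) {
        if (f.nowait) {
            calls++;
        }
    }

    return calls;
}
```

With nowait = 1 and needExecTime = 0, waitCalls() returns 0, which matches the observation that neither branch is taken for that combination.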

Furthermore, I don't quite understand the semantics of 'nowait' here. When we run kernels in sequence, don't they sometimes (even often) need to be synchronized so that results are updated?

Regards, Quan

TimmyLiu commented 9 years ago

Hi,

By design, the clBLAS API is asynchronous. If you look at clBLAS.h, you can see that the user passes an event pointer to the library. It is therefore up to the user to wait on that event after the API call if needed.

When running kernels in sequence, if "CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE" is not set at clCreateCommandQueue (https://www.khronos.org/registry/cl/sdk/1.2/docs/man/xhtml/clCreateCommandQueue.html), the queue is in-order: each kernel waits until the previously enqueued kernels have finished before it executes.

hominhquan commented 9 years ago

Hi, thanks for your response. Topic closed.