In SHOC 1.1.1, we see a failure in scan. Below is the code segment (from line #325). It issues a BLOCKING clEnqueueWriteBuffer followed by clGetEventProfilingInfo. The assumption is that the blocking write has completed (i.e., the event returned through &evTransfer.CLEvent() is set to CL_COMPLETE) by the time the call to clEnqueueWriteBuffer returns.
However, according to the OpenCL 1.1 spec (same in 1.2), a BLOCKING clEnqueueWriteBuffer is not completely synchronous: “If blocking_write is CL_TRUE, the OpenCL implementation copies the data referred to by ptr and enqueues the write operation in the command-queue. The memory pointed to by ptr can be reused by the application after the clEnqueueWriteBuffer call returns.” (OpenCL spec 1.1 v45, page 62). That is, the write event is set to CL_COMPLETE only after the data has been written to the device, which may happen after clEnqueueWriteBuffer returns. Hence, you need to wait for completion (clFinish on the queue, or clWaitForEvents on the event) before accessing the profiling info.
Thanks, --Yariv