oneapi-src / level-zero

oneAPI Level Zero Specification Headers and Loader
https://spec.oneapi.com/versions/latest/elements/l0/source/index.html
MIT License

[Question] Blocking calls for data transfers and kernel launch? #60

Open jjfumero opened 3 years ago

jjfumero commented 3 years ago

According to the spec, data transfers issued with this function

zeCommandListAppendMemoryCopy

are not blocking. Is there a blocking variant, or an equivalent of the OpenCL call clEnqueue{Read/Write}Buffer with CL_TRUE to request a blocking call?

Similarly, for kernel launches, is there any way to specify a blocking call?

The approach I have found is to close the command list and then execute all of its pending commands after each data transfer or kernel launch, for example with the following sequence:

zeCommandListAppendMemoryCopy(...)
zeCommandListClose(...)
zeCommandQueueExecuteCommandLists(...)
zeCommandQueueSynchronize(...)
zeCommandListReset(...)

but is there any other way to get blocking calls?
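
The sequence above behaves as a blocking call because appending only records a command. The transfer itself happens when the queue executes the list, so the source and destination buffers must stay valid until synchronization returns. The host-only C++ sketch below models that behavior under loud assumptions: `ModelCommandList`, `ModelQueue`, and `blockingCopy` are hypothetical names invented for illustration, not Level Zero types or functions, and a plain `memcpy` stands in for the device transfer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical host-only model of a regular command list; NOT a Level Zero type.
struct ModelCommandList {
    std::vector<std::function<void()>> commands;
    bool closed = false;

    // Records the copy. Only the pointers are captured; no data moves yet.
    void appendMemoryCopy(void* dst, const void* src, std::size_t size) {
        assert(!closed && "cannot append to a closed list");
        commands.push_back([dst, src, size] { std::memcpy(dst, src, size); });
    }

    void close() { closed = true; }

    void reset() {
        commands.clear();
        closed = false;
    }
};

// Hypothetical model of a queue whose execute step also waits for completion.
struct ModelQueue {
    void executeAndSynchronize(ModelCommandList& list) {
        assert(list.closed && "list must be closed before execution");
        for (auto& command : list.commands) command();
    }
};

// Mirrors the five-step sequence from the question as one blocking helper.
void blockingCopy(ModelQueue& queue, ModelCommandList& list,
                  void* dst, const void* src, std::size_t size) {
    list.appendMemoryCopy(dst, src, size);  // record the copy
    list.close();                           // seal the list
    queue.executeAndSynchronize(list);      // run it and wait
    list.reset();                           // make the list reusable
}
```

In this model, calling `appendMemoryCopy` alone leaves the destination untouched; only after `blockingCopy` returns is the data guaranteed to be in place.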

bmyates commented 3 years ago

We don't have an exact equivalent of clEnqueue{Read/Write}Buffer, but there are a few different ways to get the behavior you are looking for. Have you tried immediate command lists in synchronous mode? Something like:

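ze_command_queue_desc_t desc = {};
desc.stype = ZE_STRUCTURE_TYPE_COMMAND_QUEUE_DESC;  // descriptors must carry their structure type
desc.mode = ZE_COMMAND_QUEUE_MODE_SYNCHRONOUS;
zeCommandListCreateImmediate(..., &desc, &hCommandList);
zeCommandListAppendMemoryCopy(hCommandList, ...);  // the command list handle is the first argument
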
jjfumero commented 3 years ago

That might work for me. According to the spec, immediate command lists are intended for low latency, so I think they are an even better fit for what I am looking for. So, with immediate command lists, using your example:

ze_command_queue_desc_t desc = {};
desc.stype = ZE_STRUCTURE_TYPE_COMMAND_QUEUE_DESC;
desc.mode = ZE_COMMAND_QUEUE_MODE_SYNCHRONOUS;
zeCommandListCreateImmediate(..., &desc, &hCommandList);
zeCommandListAppendMemoryCopy(hCommandList, ...);

// >>>>>>>  At this point of execution, is there any guarantee that the copy is finished?  

To give some context: I am writing a Java wrapper, and I need blocking calls because otherwise the Java GC might move the objects before the copy (data transfer) has actually completed.

bmyates commented 3 years ago

Yeah, in synchronous mode, zeCommandListAppendMemoryCopy(hCommandList, ...) should block until execution completes.

Per the spec:

ZE_COMMAND_QUEUE_MODE_SYNCHRONOUS = 1 Device execution always completes immediately on execute; Host thread is blocked using wait on implicit synchronization object
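
To make the practical consequence concrete: if the append call returns only after the copy has finished, the caller may overwrite or release the source right away, which is the property a Java wrapper needs before the GC is allowed to move the object. The host-only C++ sketch below contrasts the two modes under stated assumptions. `ModelMode`, `ModelImmediateList`, and its `hostSynchronize` method are hypothetical names for illustration, not Level Zero types; `memcpy` stands in for the device transfer, and the asynchronous case is modeled simply by deferring the copy until an explicit synchronize.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical stand-in for the queue mode; NOT the Level Zero enum.
enum class ModelMode { Asynchronous, Synchronous };

// Hypothetical host-only model of an immediate command list.
class ModelImmediateList {
public:
    explicit ModelImmediateList(ModelMode mode) : mode_(mode) {}

    // Synchronous mode: the copy is complete when this call returns.
    // Asynchronous mode: the copy is still pending when this call returns.
    void appendMemoryCopy(void* dst, const void* src, std::size_t size) {
        assert(dst != nullptr && src != nullptr);
        if (mode_ == ModelMode::Synchronous) {
            std::memcpy(dst, src, size);
        } else {
            inFlight_.push_back([dst, src, size] { std::memcpy(dst, src, size); });
        }
    }

    // Completes every copy that is still pending (a no-op in synchronous mode).
    void hostSynchronize() {
        for (auto& copy : inFlight_) copy();
        inFlight_.clear();
    }

private:
    ModelMode mode_;
    std::vector<std::function<void()>> inFlight_;
};
```

In the synchronous model, changing the source after the call cannot affect the destination. In the asynchronous model, a change made before `hostSynchronize()` does reach the destination, which mirrors the hazard of a source object moving or changing while a transfer is still in flight.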