Closed amontoison closed 6 months ago
The FORCE_MKL_FLUSH macro is used to make sure that the MKL task submitted to the SYCL queue has actually been dispatched. The SYCL runtime can temporarily hold a SYCL kernel without submitting it to the GPU driver (the L0 driver in our case). The oneAPI.jl runtime works directly on the L0 queue to synchronize between MKL SYCL function calls and Julia statements. If an MKL kernel is still held by the SYCL runtime when the oneAPI.jl runtime calls zeCommandQueueSynchronize() to wait for that kernel to finish, the wait can return before the kernel has even been submitted, so the two end up out of order. Hence, we call FORCE_MKL_FLUSH to make sure the SYCL kernel has been submitted to the L0 queue.
FORCE_MKL_FLUSH(cmd) calls sycl::get_native<sycl::backend::ext_oneapi_level_zero>(cmd), which is supposed to take a SYCL event returned by the MKL function as 'cmd'. If the MKL function doesn't return an event, it segfaults.
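To make the ordering problem described above concrete, here is a minimal sketch in plain C++. It is only a toy model: MockL0Queue, MockSyclRuntime, and run are hypothetical names invented for illustration, and none of them correspond to real SYCL or Level Zero types. The model assumes that a runtime layer may hold a kernel back and that synchronizing on the lower queue only waits for work that has already reached it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the lower-level (L0-like) queue; not a real API.
struct MockL0Queue {
    std::vector<std::string> submitted;  // work the driver already knows about
    std::vector<std::string> completed;  // work that has finished

    // Stand-in for a queue synchronize: it can only wait for submitted work.
    void synchronize() {
        for (const auto &kernel : submitted) completed.push_back(kernel);
        submitted.clear();
    }
};

// Hypothetical model of a runtime that may hold kernels before dispatching them.
struct MockSyclRuntime {
    MockL0Queue &l0;
    std::vector<std::string> held;  // kernels not yet handed to the lower queue

    explicit MockSyclRuntime(MockL0Queue &queue) : l0(queue) {}

    void submit(const std::string &kernel) { held.push_back(kernel); }

    // Stand-in for the flush step: push every held kernel down to the lower queue.
    void flush() {
        for (const auto &kernel : held) l0.submitted.push_back(kernel);
        held.clear();
    }
};

// Returns how many kernels have completed after synchronizing on the lower queue.
std::size_t run(bool flush_first) {
    MockL0Queue l0;
    MockSyclRuntime runtime(l0);
    runtime.submit("mkl_kernel");
    if (flush_first) runtime.flush();
    l0.synchronize();
    return l0.completed.size();
}
```

In this model, run(false) synchronizes while the kernel is still held and sees no completed work, whereas run(true) flushes first and sees the kernel complete. The real runtimes are far more involved; the sketch only mirrors the reasoning given in the comment.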
Thanks @pengtu!
The issue is: how can we be sure that the "usm" version of a routine, and not the "buffer" version, is used in the C interface?
For example, with geqrf, we have the same parameters if we don't provide the events argument (see the documentation of geqrf).
Should we provide an empty list {} as the last parameter to the MKL routines to be sure that the usm version is used and that we can call FORCE_MKL_FLUSH?
@amontoison: Indeed, the C wrapper might have been invoking the "buffer" version. Please try passing an empty list {} as the last argument to be sure that the "usm" version is invoked.
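The effect of the trailing {} on overload selection can be sketched with stand-in types. This is an illustration only: MockEvent, MockBuffer, mock_geqrf, and wrapped_geqrf are hypothetical names and are not the oneMKL or SYCL declarations. The sketch assumes two overloads shaped like the ones discussed here: a buffer-style one returning void, and a usm-style one that accepts a dependency list and returns an event.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for sycl::event and sycl::buffer; not the real types.
struct MockEvent { std::string origin; };
struct MockBuffer { std::vector<double> storage; };

// Buffer-style overload: returns void, so there is no event to pass to a flush macro.
void mock_geqrf(std::int64_t m, std::int64_t n, MockBuffer &a) {
    a.storage.assign(static_cast<std::size_t>(m * n), 0.0);
}

// USM-style overload: raw pointer plus a dependency list, returns an event.
MockEvent mock_geqrf(std::int64_t m, std::int64_t n, double *a,
                     const std::vector<MockEvent> &events = {}) {
    (void)m; (void)n; (void)a;
    return MockEvent{"usm, " + std::to_string(events.size()) + " dependencies"};
}

// Hypothetical wrapper in the spirit of the C interface: the trailing {} can only
// bind to the dependency list, so the event-returning overload is the one called.
std::string wrapped_geqrf(std::int64_t m, std::int64_t n, double *a) {
    MockEvent status = mock_geqrf(m, n, a, {});
    return status.origin;
}

// Compile-time check that a call with a trailing {} yields an event, not void.
static_assert(std::is_same<decltype(mock_geqrf(std::int64_t{1}, std::int64_t{1},
                                               static_cast<double *>(nullptr), {})),
                           MockEvent>::value,
              "trailing {} should select the event-returning overload");
```

With these mock declarations, wrapped_geqrf always receives an event, so a flush step that expects one has something valid to work on. Whether the real wrapper resolves overloads the same way depends on the actual oneMKL signatures.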
@pengtu Should we only wrap the "usm" version if both versions are available?
Yes, we should always call the "usm" version of oneMKL, since Julia allocates the device array directly, without using the SYCL buffer interface.
I don't understand why I get a segmentation fault when I call some C functions that contain __FORCE_MKL_FLUSH__: https://github.com/JuliaGPU/oneAPI.jl/blob/master/deps/src/onemkl.cpp#L11-L12
The segmentation fault goes away when I remove __FORCE_MKL_FLUSH__, but it only concerns a few routines (geqrf from LAPACK and set_csr_data from SPARSE). Why don't we have the same behaviour with all routines? I use __FORCE_MKL_FLUSH__ after the routines that return void.