Closed fvanmaele closed 1 year ago
tl;dr: cuda::kernel::get(my_device, my_plain_vanilla_kernel_function) is known to work; see the execution control example.
The effect of that big change on the wrap() method is that it now requires a CUDA context. People who are used to the runtime API don't know about contexts - and that's ok - but wrap() is the lowest-level non-detail_ kind of API call I offer, and it must really "not do anything", so it needs the context. Under the hood, the CUDA runtime API uses something called the "primary context" on each device.
So, your options are:
1. Obtain auto my_pcontext = my_device.primary_context(), keep that object alive, and use my_pcontext.handle() as the context for wrap(); that does the trick.
2. Use cuda::kernel::get(const device_t& device, KernelFunctionPtr function_ptr), where you only pass a device and a kernel function. That takes care of the context magic for you. Note: It takes a device, not a device id!
Assuming the question is answered to @fvanmaele's satisfaction... please comment again if that's not the case.
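For readers migrating similar code, the two options from the answer could look roughly like the sketch below. It is only an illustration under assumptions: the umbrella header name <cuda/api.hpp>, the use of cuda::device::current::get() to obtain a device object, and the kernel body are not taken from this thread, and the exact parameter list of cuda::kernel::wrap() is deliberately not shown, since it should be checked against the headers of the library version in use.

```cuda
#include <cuda/api.hpp>  // assumed umbrella header of cuda-api-wrappers

// Hypothetical stand-in for the user's own kernel (e.g. kernel::reduce);
// the name follows the tl;dr in the answer.
__global__ void my_plain_vanilla_kernel_function(float* data)
{
    data[threadIdx.x + blockIdx.x * blockDim.x] *= 2.0f;
}

int main()
{
    // A device object, not a numeric device id.
    auto my_device = cuda::device::current::get();

    // Option 2: pass only the device and the kernel function;
    // the library takes care of the primary context.
    auto my_kernel = cuda::kernel::get(my_device, my_plain_vanilla_kernel_function);

    // Option 1: hold the primary context yourself. my_pcontext must stay
    // alive for as long as the wrapped kernel is used.
    auto my_pcontext = my_device.primary_context();
    auto context_handle = my_pcontext.handle();

    // context_handle (together with the device id) is what the new wrap()
    // needs; its exact signature is not reproduced here on purpose.
    (void) my_kernel;
    (void) context_handle;
    return 0;
}
```

Option 2 is the shorter path when the code has no other reason to deal with contexts; option 1 only makes sense if the call to wrap() has to be kept.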
I have some old code which uses the old API of cuda::kernel::wrap, which was modified in commit bc53844ef215dc974c02adca35240ad5d83c68af. The call looks as follows:
kernel::reduce above is a void function. See: https://mp-force.ziti.uni-heidelberg.de/fvanmaele/tridigpu/-/blob/master/include/tridigpu/reduction.h#L58-80 and https://mp-force.ziti.uni-heidelberg.de/fvanmaele/tridigpu/-/blob/master/include/tridigpu/reduction.h#L105
How can I adapt this to the new interface? Unfortunately I am not familiar with this project, and I couldn't find a migration guide for the API either. Changing to cuda::kernel::get was unsuccessful.