Closed RDambrosio016 closed 2 years ago
It doesn't seem like it's possible to make this change without significant breakage; create_and_push will be new in the "new" context (same context struct, just different).
New Context API has been implemented and it will be part of cust/rust-cuda 0.3
Small tracking issue for sorting out context issues that are blocking cuBLAS and cuFFT work. The gist of it is that currently we use the "traditional" way of handling contexts per the driver API, which is such:
This is very different from what cudart does:
cudaDeviceReset, which nukes the device and the primary context.

This causes a good amount of issues when trying to interop with cudart, and is what is causing spurious segfaults in the cuBLAS stuff I just pushed. What I presume is happening is:
However, the driver API also has primary context handling, aka what cudart does except explicit, basically:
- cuDevicePrimaryCtxRetain will retain a primary context handle for the device; this context is reference counted.
- cuDevicePrimaryCtxRelease will release the context handle back to the driver; if this is the last handle, it will reset the context. Although presumably cudart holds on to it forever, so it will never be reset unless done explicitly.

So my proposal is as such:
cust::context::legacy, keeping the Context name to avoid too much breakage, just switch it to using primary context handling.

This would have a number of benefits:
However, it does retain the issue of "if a user calls deviceReset from cudart or the driver, this destroys the ability for anything to do CUDA work", but I don't think there is a way to 100% solve that issue. Legacy context handling can do this by just dropping the context, while primary contexts can just call deviceReset, so either way a user can nuke CUDA contexts if they want to. The difference is that deviceReset is more explicit and will probably be unsafe in cust.
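The retain/release semantics and the reset caveat described above can be sketched as a small toy model. This is only an illustration under stated assumptions: it makes no real CUDA calls, and `Device`, `Handle`, `PrimaryState`, `retain`, `reset`, and `is_valid` are hypothetical names, not cust's API or the driver's. The generation counter is just a convenient way to model stale handles here, not a claim about how the driver tracks them.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Toy stand-in for the per-device primary context state (hypothetical).
#[derive(Debug, Default)]
struct PrimaryState {
    retain_count: usize,
    /// Bumped on every reset; handles from an older generation are stale.
    generation: u64,
    active: bool,
}

/// Hypothetical device that owns exactly one primary context.
#[derive(Clone, Default)]
struct Device(Rc<RefCell<PrimaryState>>);

/// Hypothetical handle to the primary context; released when dropped.
struct Handle {
    device: Device,
    generation: u64,
}

impl Device {
    /// Models cuDevicePrimaryCtxRetain: bump the count, activate if needed.
    fn retain(&self) -> Handle {
        let mut state = self.0.borrow_mut();
        state.retain_count += 1;
        state.active = true;
        Handle { device: self.clone(), generation: state.generation }
    }

    /// Models an explicit device reset: every outstanding handle goes stale.
    fn reset(&self) {
        let mut state = self.0.borrow_mut();
        state.retain_count = 0;
        state.active = false;
        state.generation += 1;
    }

    fn is_active(&self) -> bool {
        self.0.borrow().active
    }
}

impl Handle {
    /// A handle is usable only if no reset happened since it was retained.
    fn is_valid(&self) -> bool {
        let state = self.device.0.borrow();
        state.active && state.generation == self.generation
    }
}

impl Drop for Handle {
    /// Models cuDevicePrimaryCtxRelease: the last release resets the context.
    fn drop(&mut self) {
        let mut state = self.device.0.borrow_mut();
        if state.generation != self.generation {
            return; // stale handle: the context it referred to is already gone
        }
        state.retain_count -= 1;
        if state.retain_count == 0 {
            state.active = false;
            state.generation += 1;
        }
    }
}
```

In this model, two libraries that each call `retain` on the same `Device` share one context, and it stays active until both handles are dropped. A call to `reset` invalidates every outstanding handle at once, which is the hazard that remains with primary contexts.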
I will start working on this and will probably release these changes in cust 0.3.