Rust-GPU / Rust-CUDA

Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.
Apache License 2.0

The current way of handling context is fundamentally incompatible with the Runtime API #21

Closed RDambrosio016 closed 2 years ago

RDambrosio016 commented 2 years ago

Small tracking issue for sorting out context issues that are blocking cuBLAS and cuFFT work. The gist of it is that we currently use the "traditional" way of handling contexts per the driver API, which works as follows:

This is very different from what cudart does:

This causes a good number of issues when trying to interop with cudart, and is what is causing spurious segfaults in the cuBLAS stuff I just pushed. What I presume is happening is:

However, the driver API also has primary context handling, aka what cudart does except explicit, basically:
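To make the retain/release idea concrete, here is a minimal sketch of a reference-counted, per-device primary context. It is a toy model only: `ToyDriver`, `primary_retain`, `primary_release`, and `is_alive` are hypothetical names made up for illustration, nothing here calls CUDA, and it is not the cust API. It only mirrors the documented behaviour of the driver's `cuDevicePrimaryCtxRetain` / `cuDevicePrimaryCtxRelease` pair, assuming the context goes away once the last reference is released.

```rust
use std::collections::HashMap;

/// Toy stand-in for the driver's per-device primary-context bookkeeping.
/// Hypothetical type for illustration; it does not touch the GPU.
#[derive(Default)]
struct ToyDriver {
    /// device ordinal -> (context handle, retain count)
    primary: HashMap<u32, (u64, u32)>,
    next_handle: u64,
}

impl ToyDriver {
    /// Models `cuDevicePrimaryCtxRetain`: the first retain creates the
    /// context, later retains return the same handle and bump the count.
    fn primary_retain(&mut self, device: u32) -> u64 {
        if let Some(entry) = self.primary.get_mut(&device) {
            entry.1 += 1;
            return entry.0;
        }
        self.next_handle += 1;
        let handle = self.next_handle;
        self.primary.insert(device, (handle, 1));
        handle
    }

    /// Models `cuDevicePrimaryCtxRelease`: drops one reference and removes
    /// the context when the count reaches zero (an assumption of this model).
    fn primary_release(&mut self, device: u32) {
        let destroy = match self.primary.get_mut(&device) {
            Some(entry) => {
                entry.1 -= 1;
                entry.1 == 0
            }
            None => false,
        };
        if destroy {
            self.primary.remove(&device);
        }
    }

    /// True while at least one user still holds the device's primary context.
    fn is_alive(&self, device: u32) -> bool {
        self.primary.contains_key(&device)
    }
}
```

In this model, two libraries that each retain device 0 get the same handle, and one of them releasing does not pull the context out from under the other. That sharing property is what lets cudart-based libraries and driver-API code coexist.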

So my proposal is as follows:

This would have numerous benefits:

However, it does retain the issue of "if a user calls deviceReset from cudart or the driver, this destroys the ability for anything to do CUDA work", but I don't think there is a way to 100% solve that issue. Legacy context handling can do this by just dropping the context, while primary contexts can just call deviceReset. So either way a user can nuke CUDA contexts if they want to, except that deviceReset is more explicit and will probably be unsafe in cust.

I will start working on this and will probably release these changes in cust 0.3.

RDambrosio016 commented 2 years ago

It doesn't seem like it's possible to make this change without significant breakage. `create_and_push` will be `new` in the "new" context (same context struct, just different).

RDambrosio016 commented 2 years ago

The new Context API has been implemented and will be part of cust/rust-cuda 0.3.