Description
I am encountering an issue when mixing CUDA's Runtime API and Driver API within my project. While my project uses the Runtime API exclusively, I depend on an external library that uses cudawrappers and thus requires the Driver API. The issue arises when a Driver API context is created after the primary runtime context is already initialized.
When the Driver API context goes out of scope, subsequent Runtime API calls result in runtime errors. Specifically, I encounter the following error:
terminate called after throwing an instance of 'thrust::THRUST_200500_890_NS::system::system_error'
what(): CUDA free failed: cudaErrorInvalidValue: invalid argument
Aborted (core dumped)
Steps to Reproduce
The following minimal reproducible example demonstrates the issue:
#include <memory>

#include <cuda.h>
#include <cudawrappers/cu.hpp>
#include <thrust/device_vector.h>

int main() {
  cu::init();
  // First call to the Runtime API -> creates the primary context
  auto vec = std::make_unique<thrust::device_vector<int>>();
  auto device = std::make_unique<cu::Device>(0);
  {
    // auto context = device->primaryCtxRetain();
    cu::Context context(0, *device);  // separate Driver API context
    context.setCurrent();
    vec->resize(100);
  }  // context goes out of scope here
  device.reset();
  vec.reset();  // Fails with runtime error
}
Observations
Currently, it appears there is no way to instruct cudawrappers to utilize the primary context created by the Runtime API. However, I noticed the existence of the cu::Device::primaryCtxRetain() method, which seems like a potential solution. Unfortunately, it is not implemented in the current version of cudawrappers.
Solution
To solve the issue, I've implemented primaryCtxRetain as follows:
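Context Device::primaryCtxRetain()
{
#if !defined(__HIP__)
  CUcontext primary;
  checkCudaCall(cuDevicePrimaryCtxRetain(&primary, _obj));
  return {primary, *this}; // Call to the private Context constructor -> friended to cu::Device
#else
  // Added so the HIP build does not fall off the end of a non-void function;
  // the exception type is a placeholder and needs <stdexcept>.
  throw std::runtime_error("primaryCtxRetain() is not available when building with HIP");
#endif
}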
Note: The implementation must be placed after the definition of the cu::Context class, and primaryCtxRetain() must be declared inline in the cu::Device class.
With this implementation, the following adjusted example works without any errors:
int main() {
  cu::init();
  auto vec = std::make_unique<thrust::device_vector<int>>();
  auto device = std::make_unique<cu::Device>(0);
  {
    auto context = device->primaryCtxRetain();
    vec->resize(100);
  }  // context goes out of scope here
  device.reset();
  vec.reset();  // Does not fail anymore
}
Questions
Why is the primaryCtxRetain() method not implemented in cudawrappers?
Is there an alternative solution already available within cudawrappers that I might have missed?
If there is no existing solution, would it be possible to integrate the primaryCtxRetain() method into the library? (I can open a pull request from my fork)