lebedov / scikit-cuda

Python interface to GPU-powered libraries
http://scikit-cuda.readthedocs.org/

`pycuda._driver.LogicError: cuMemAlloc failed: context is destroyed` (caused by scikit-cuda or even CUDA itself?) #330

Closed: dimitsev closed this issue 2 years ago

dimitsev commented 2 years ago

Running the following in a clean Python session throws `pycuda._driver.LogicError: cuMemAlloc failed: context is destroyed` on the last line:

```python
import pycuda.autoinit  # pycuda creates and manages its own context

import numpy as np
import pycuda
import pycuda.gpuarray
import skcuda
import skcuda.fft as cufft

plan = cufft.Plan((2, 2), np.complex64, np.complex64)

del plan  # equivalent to `skcuda.cufft.cufftDestroy(plan.handle)`
# skcuda.cufft.cufftDestroy(plan.handle)  # equivalent to `del plan`

pycuda.gpuarray.empty((2, 2), np.float32)  # raises LogicError here
```

Deleting the FFT plan in scikit-cuda destroys the pycuda context. This happens whether you use `del plan` or call `skcuda.cufft.cufftDestroy(plan.handle)` directly, so maybe CUDA itself is messing up here.

As long as `skcuda.cufft.cufftDestroy()` is never called, everything is fine. But what if I want to call it?

(See also my Stack Overflow question: https://stackoverflow.com/questions/72218532/deleting-an-fft-plan-in-scikit-cuda-destroys-the-pycuda-context)

My system:

dimitsev commented 2 years ago

My mistake, and an easy fix, with good advice from the master (https://github.com/inducer/pycuda/discussions/356):

> By using `pycuda.autoinit`, you're putting pycuda in charge of context management. That's not typically a good recipe for interacting with libraries that use the CUDA runtime API (like cuFFT, to my understanding). You might be better off retaining the "primary context" made by/for the runtime API and using that instead.

As stated above, the solution to my problem is retaining the primary context instead of letting pycuda create a new context. The easiest way to do this is via:

`import pycuda.autoprimaryctx` instead of `import pycuda.autoinit`
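
With that swap, the failing snippet from above runs without the `LogicError` (a minimal sketch; only the first import differs from the failing version):

```python
import pycuda.autoprimaryctx  # retain the runtime API's primary context

import numpy as np
import pycuda
import pycuda.gpuarray
import skcuda
import skcuda.fft as cufft

plan = cufft.Plan((2, 2), np.complex64, np.complex64)
del plan  # destroying the plan no longer kills the current context

pycuda.gpuarray.empty((2, 2), np.float32)  # now allocates without error
```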

Voilà, everything works now. See also the documentation and the code for `pycuda.autoprimaryctx`.
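
For reference, what `pycuda.autoprimaryctx` does boils down to retaining and pushing the device's primary context instead of creating a fresh one. Roughly (a sketch, not the module's exact source; it assumes device 0 and omits some of the module's housekeeping):

```python
import atexit
import pycuda.driver as cuda

cuda.init()
device = cuda.Device(0)                    # assumes device 0
context = device.retain_primary_context()  # the runtime API's primary context
context.push()                             # make it current for pycuda

# pop the context again when the interpreter exits
atexit.register(context.pop)
```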