vade closed this issue 4 years ago
Also, I'm aware I can't expect you to help resolve a remote kernel panic; I'm just looking for any other information to guide my debugging.
It seems like a problem with your particular combination of kernel, nvidia-driver-440, and cuda-10.2. I have a similar system: Ubuntu 18.04.3 (kernel 5.3.0-28), cuda 10.2.89-1, nvidia-driver-440 440.33.01-0ubuntu1, and everything works fine.
Thanks, that's great to know. I'll see if I can find any other issues.
Hi, I have a similar problem:
error in correlation_forward_cuda_kernel: no kernel image is available for execution on the device
Traceback (most recent call last):
  File "demo_MiddleBury_slowmotion.py", line 126, in
    y_s,offset,filter = model(torch.stack((X0, X1),dim = 0))
PyTorch = 1.3.1, NVIDIA GPU = Tesla V100, CUDA Version: 10.2
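For anyone hitting the same message: "no kernel image is available for execution on the device" usually means the CUDA extensions were compiled for a different GPU architecture than the one they are running on. A Tesla V100 has compute capability 7.0, so the extensions would need an `sm_70` target. The sketch below is only an illustration of how to work out the matching `-gencode` flag; the helper names `gencode_flag` and `detect_capability` are made up for this example and are not part of the repository, and where the flag belongs in each extension's build script is whatever the README describes.

```python
def gencode_flag(major, minor):
    """Build the nvcc -gencode flag for one compute capability, e.g. (7, 0)."""
    arch = f"{major}{minor}"
    return f"-gencode arch=compute_{arch},code=sm_{arch}"


def detect_capability():
    """Return (major, minor) for GPU 0, or None if PyTorch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    capability = detect_capability()
    if capability is None:
        print("No CUDA device visible from PyTorch; cannot detect the architecture.")
    else:
        # Hypothetical usage: add this flag to the extensions' NVCC flags, then rebuild.
        print(gencode_flag(*capability))
```

If the printed flag is missing from the NVCC flags the extensions were built with, rebuilding with it added is the first thing I would try.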
https://github.com/baowenbo/DAIN/issues/44#issuecomment-589483416
Here you go. Just follow the Colab posted in the comments and modify it according to my comment.
Hello - firstly, thanks for this and your great documentation. Much appreciated.
I'm using Ubuntu 18.04 LTS, CUDA 10.2, NVIDIA 440 drivers and a single Titan X.
I've followed the readme, installed the dependencies in a virtual env, compiled the extensions, and am able to run the demo. However, after a few seconds the demo crashes and kernel panics the entire system.
I've attempted to edit both extensions' NVCC flags, as per the helpful note in the documentation, but to no avail.
However, that also kernel panics the machine.
I am able to monitor GPU memory usage right before the crash and can see PyTorch allocating GPU memory, but it appears to climb to the maximum, and then the system dies.
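Since the machine goes down before anything can be inspected, one option is to log GPU memory to disk from a second terminal so the last readings survive the panic. This is a rough sketch, not something from the repository: it assumes `nvidia-smi` is on the PATH, reads only the first GPU, and the names `parse_memory_line`, `log_gpu_memory`, and the `gpu_mem.log` path are placeholders.

```python
import os
import subprocess
import time

# nvidia-smi query that prints "used, total" in MiB, one line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_line(line):
    """Parse one 'used, total' line (values in MiB) into a pair of ints."""
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total


def log_gpu_memory(log_path="gpu_mem.log", interval=0.5):
    """Append timestamped readings for the first GPU until interrupted."""
    with open(log_path, "a") as log:
        while True:
            output = subprocess.check_output(QUERY, text=True)
            used, total = parse_memory_line(output.splitlines()[0])
            log.write(f"{time.time():.1f} {used} {total}\n")
            # Flush and fsync so the reading reaches disk before a possible crash.
            log.flush()
            os.fsync(log.fileno())
            time.sleep(interval)
```

Running `log_gpu_memory()` while the demo starts should show whether memory really hits the card's limit just before the panic, or whether the crash happens with headroom left, which would point more toward the driver or kernel.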
Are there other specific hardware requirements for this code base?