error in correlation_forward_cuda_kernel: no kernel image is available for execution on the device

laaRaa commented 6 years ago

Hi,

I installed all the dependencies and followed the listed steps to install and run FlowNet, but I get the following error when running run_a_pair.py. Any ideas?

The parameters of the system are the following:

CUDA: release 9.1, V9.1.85
PyTorch: 0.4.1
Ubuntu: 18.04 LTS
Python: 3.6.5

Thanks!

python3 run_a_pair.py
error in correlation_forward_cuda_kernel: no kernel image is available for execution on the device
Traceback (most recent call last):
  File "run_a_pair.py", line 31, in <module>
    result = net(im).squeeze()
  File "/home/raad/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/raad/flownet2-pytorch_python3/models.py", line 118, in forward
    flownetc_flow2 = self.flownetc(x)[0]
  File "/home/raad/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/raad/flownet2-pytorch_python3/networks/FlowNetC.py", line 86, in forward
    out_corr = self.corr(out_conv3a, out_conv3b) # False
  File "/home/raad/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/raad/flownet2-pytorch_python3/networks/correlation_package/correlation.py", line 59, in forward
    result = CorrelationFunction(self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)(input1, input2)
  File "/home/raad/flownet2-pytorch_python3/networks/correlation_package/correlation.py", line 27, in forward
    self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)
RuntimeError: CUDA call failed (correlation_forward_cuda at correlation_cuda.cc:79)
frame #0: <unknown function> + 0x135a7 (0x7f64fe8845a7 in /home/raad/local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #1: <unknown function> + 0x102ef (0x7f64fe8812ef in /home/raad/local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #2: _PyCFunction_FastCallKeywords + 0x26b (0x4c549b in python3)
frame #3: python3() [0x54ffe4]
frame #4: _PyEval_EvalFrameDefault + 0x362f (0x5546cf in python3)
frame #5: python3() [0x54f0e8]
frame #6: _PyFunction_FastCallDict + 0x2a2 (0x558ef2 in python3)
frame #7: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #8: PyObject_Call + 0x3e (0x459eee in python3)
frame #9: THPFunction_do_forward(THPFunction*, _object*) + 0x2ad (0x7f652c26fc3d in /home/raad/.local/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #10: PyCFunction_Call + 0xbd (0x4c517d in python3)
frame #11: PyObject_Call + 0x3e (0x459eee in python3)
frame #12: python3() [0x4e0e9b]
frame #13: _PyObject_FastCallDict + 0xa3 (0x45a0e3 in python3)
frame #14: python3() [0x54fd37]
frame #15: _PyEval_EvalFrameDefault + 0x362f (0x5546cf in python3)
frame #16: python3() [0x54f0e8]
frame #17: _PyFunction_FastCallDict + 0x2a2 (0x558ef2 in python3)
frame #18: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #19: PyObject_Call + 0x3e (0x459eee in python3)
frame #20: _PyEval_EvalFrameDefault + 0x1ba9 (0x552c49 in python3)
frame #21: python3() [0x54fbe1]
frame #22: _PyFunction_FastCallDict + 0x1c9 (0x558e19 in python3)
frame #23: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #24: PyObject_Call + 0x3e (0x459eee in python3)
frame #25: python3() [0x4e0e9b]
frame #26: _PyObject_FastCallDict + 0xa3 (0x45a0e3 in python3)
frame #27: python3() [0x54fd37]
frame #28: _PyEval_EvalFrameDefault + 0x362f (0x5546cf in python3)
frame #29: python3() [0x54f0e8]
frame #30: _PyFunction_FastCallDict + 0x2a2 (0x558ef2 in python3)
frame #31: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #32: PyObject_Call + 0x3e (0x459eee in python3)
frame #33: _PyEval_EvalFrameDefault + 0x1ba9 (0x552c49 in python3)
frame #34: python3() [0x54fbe1]
frame #35: _PyFunction_FastCallDict + 0x1c9 (0x558e19 in python3)
frame #36: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #37: PyObject_Call + 0x3e (0x459eee in python3)
frame #38: python3() [0x4e0e9b]
frame #39: _PyObject_FastCallDict + 0xa3 (0x45a0e3 in python3)
frame #40: python3() [0x54fd37]
frame #41: _PyEval_EvalFrameDefault + 0x362f (0x5546cf in python3)
frame #42: python3() [0x54f0e8]
frame #43: _PyFunction_FastCallDict + 0x2a2 (0x558ef2 in python3)
frame #44: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #45: PyObject_Call + 0x3e (0x459eee in python3)
frame #46: _PyEval_EvalFrameDefault + 0x1ba9 (0x552c49 in python3)
frame #47: python3() [0x54fbe1]
frame #48: _PyFunction_FastCallDict + 0x1c9 (0x558e19 in python3)
frame #49: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #50: PyObject_Call + 0x3e (0x459eee in python3)
frame #51: python3() [0x4e0e9b]
frame #52: _PyObject_FastCallDict + 0xa3 (0x45a0e3 in python3)
frame #53: python3() [0x54fd37]
frame #54: _PyEval_EvalFrameDefault + 0x362f (0x5546cf in python3)
frame #55: python3() [0x54fbe1]
frame #56: PyEval_EvalCode + 0x23 (0x550b93 in python3)
frame #57: PyRun_FileExFlags + 0x169 (0x42b519 in python3)
frame #58: PyRun_SimpleFileExFlags + 0xe5 (0x42b705 in python3)
frame #59: Py_Main + 0xccb (0x441fcb in python3)
frame #60: main + 0x184 (0x421ff4 in python3)
frame #61: __libc_start_main + 0xe7 (0x7f65408fab97 in /lib/x86_64-linux-gnu/libc.so.6)
frame #62: _start + 0x2a (0x4220aa in python3)

cuuupid commented 6 years ago

Also getting this, using the Flownet2 models in vid2vid.

error in correlation_forward_cuda_kernel: no kernel image is available for execution on the device
Traceback (most recent call last):
  File "train.py", line 273, in <module>
    train()
  File "train.py", line 105, in train
    flow_ref, conf_ref = flowNet(real_B, real_B_prev)  # reference flows and confidences
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/vid2vid/models/flownet.py", line 33, in forward
    flow, conf = self.compute_flow_and_conf(input_A, input_B)
  File "/home/ubuntu/vid2vid/models/flownet.py", line 50, in compute_flow_and_conf
    flow1 = self.flowNet(data1)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/vid2vid/models/flownet2_pytorch/models.py", line 126, in forward
    flownetc_flow2 = self.flownetc(x)[0]
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/vid2vid/models/flownet2_pytorch/networks/FlowNetC.py", line 86, in forward
    out_corr = self.corr(out_conv3a, out_conv3b) # False
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/vid2vid/models/flownet2_pytorch/networks/correlation_package/correlation.py", line 59, in forward
    result = CorrelationFunction(self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)(input1, input2)
  File "/home/ubuntu/vid2vid/models/flownet2_pytorch/networks/correlation_package/correlation.py", line 27, in forward
    self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)
RuntimeError: CUDA call failed (correlation_forward_cuda at correlation_cuda.cc:79)
frame #0: <unknown function> + 0x140f8 (0x7f15885b40f8 in /home/ubuntu/.local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #1: <unknown function> + 0x1433e (0x7f15885b433e in /home/ubuntu/.local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #2: <unknown function> + 0x107e1 (0x7f15885b07e1 in /home/ubuntu/.local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
<omitting python frames>
frame #10: THPFunction_do_forward(THPFunction*, _object*) + 0x2ad (0x7f15c17a7f8d in /home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)

mawah commented 6 years ago

I am getting an analogous error message when running inference using main.py. My system is somewhat different:

CUDA V9.0.176
cuDNN V7.3.0
NVIDIA K80 GPU
PyTorch 0.4.1.post2
Python 3.6.5
Ubuntu 16.04.4 LTS

The install and execution are dockerized, and I'm happy to share the image if helpful.

# python main.py --inference --model FlowNet2 --save_flow --inference_dataset MpiSintelClean \
> --inference_dataset_root /data/MPISintel/training \
> --resume /workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar
Parsing Arguments
  [0.029s] batch_size: 8
  [0.030s] crop_size: [256, 256]
  [0.030s] fp16: False
  [0.030s] fp16_scale: 1024.0
  [0.030s] gradient_clip: None
  [0.030s] inference: True
  [0.030s] inference_batch_size: 1
  [0.030s] inference_dataset: MpiSintelClean
  [0.030s] inference_dataset_replicates: 1
  [0.030s] inference_dataset_root: /data/MPISintel/training
  [0.030s] inference_n_batches: -1
  [0.030s] inference_size: [-1, -1]
  [0.030s] log_frequency: 1
  [0.030s] loss: L1Loss
  [0.030s] model: FlowNet2
  [0.030s] model_batchNorm: False
  [0.030s] model_div_flow: 20.0
  [0.030s] name: run
  [0.030s] no_cuda: False
  [0.030s] number_gpus: 1
  [0.030s] number_workers: 8
  [0.030s] optimizer: Adam
  [0.030s] optimizer_amsgrad: False
  [0.030s] optimizer_betas: (0.9, 0.999)
  [0.030s] optimizer_eps: 1e-08
  [0.030s] optimizer_lr: 0.001
  [0.030s] optimizer_weight_decay: 0
  [0.030s] render_validation: False
  [0.030s] resume: /workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar
  [0.030s] rgb_max: 255.0
  [0.030s] save: ./work
  [0.030s] save_flow: True
  [0.030s] schedule_lr_fraction: 10
  [0.030s] schedule_lr_frequency: 0
  [0.030s] seed: 1
  [0.030s] skip_training: False
  [0.030s] skip_validation: False
  [0.030s] start_epoch: 1
  [0.030s] total_epochs: 10000
  [0.030s] train_n_batches: -1
  [0.030s] training_dataset: MpiSintelFinal
  [0.030s] training_dataset_replicates: 1
  [0.030s] training_dataset_root: ./MPI-Sintel/flow/training
  [0.030s] validation_dataset: MpiSintelClean
  [0.030s] validation_dataset_replicates: 1
  [0.030s] validation_dataset_root: ./MPI-Sintel/flow/training
  [0.030s] validation_frequency: 5
  [0.030s] validation_n_batches: -1
  [0.032s] Operation finished

Source Code
  Current Git Hash: b'532613d4fa46e544ddc309a8aa9e6b65dc91af21'

Initializing Datasets
  [0.050s] Inference Dataset: MpiSintelClean
  [0.117s] Inference Input: [3, 2, 384, 1024]
  [0.353s] Inference Targets: [2, 384, 1024]
  [0.354s] Operation finished

Building FlowNet2 model
  [5.002s] Effective Batch Size: 8
  [5.004s] Number of parameters: 162518834
  [5.004s] Initializing CUDA
  [7.209s] Parallelizing
  [7.211s] Loading checkpoint '/workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar'
  [7.670s] Loaded checkpoint '/workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar' (at epoch 0)
  [7.670s] Initializing save directory: ./work
  [7.672s] Operation finished

Initializing Adam Optimizer
  [0.001s] amsgrad = False (<class 'bool'>)
  [0.001s] weight_decay = 0 (<class 'int'>)
  [0.001s] eps = 1e-08 (<class 'float'>)
  [0.001s] betas = (0.9, 0.999) (<class 'tuple'>)
  [0.001s] lr = 0.001 (<class 'float'>)
  [0.001s] Operation finished

Overall Progress:   0%|                                                       |              0/1 [00:00<?, ?it/s]
Inferencing :   0%| ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
error in correlation_forward_cuda_kernel: no kernel image is available for execution on the device
Traceback (most recent call last):
  File "main.py", line 403, in <module>
    stats = inference(args=args, epoch=epoch - 1, data_loader=inference_loader, model=model_and_loss, offset=offset)
  File "main.py", line 367, in inference
    losses, output = model(data[0], target[0], inference=True)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "main.py", line 170, in forward
    output = self.model(data)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/workspace/flownet2-pytorch/models.py", line 118, in forward
    flownetc_flow2 = self.flownetc(x)[0]
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/workspace/flownet2-pytorch/networks/FlowNetC.py", line 86, in forward
    out_corr = self.corr(out_conv3a, out_conv3b) # False
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/workspace/flownet2-pytorch/networks/correlation_package/correlation.py", line 59, in forward
    result = CorrelationFunction(self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)(input1, input2)
  File "/workspace/flownet2-pytorch/networks/correlation_package/correlation.py", line 27, in forward
    self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)
RuntimeError: CUDA call failed (correlation_forward_cuda at correlation_cuda.cc:79)
frame #0: <unknown function> + 0x140c8 (0x7f6e498160c8 in /root/anaconda3/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #1: <unknown function> + 0x1430e (0x7f6e4981630e in /root/anaconda3/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #2: <unknown function> + 0x107b1 (0x7f6e498127b1 in /root/anaconda3/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #3: _PyCFunction_FastCallDict + 0x154 (0x55e4403a2b94 in ./work)
frame #4: <unknown function> + 0x19e67c (0x55e44043267c in ./work)
frame #5: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #6: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #7: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #8: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #9: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #10: THPFunction_do_forward(THPFunction*, _object*) + 0x2ad (0x7f6e73b61fbd in /root/anaconda3/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #11: PyCFunction_Call + 0x5f (0x55e4403a598f in ./work)
frame #12: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #13: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #14: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #15: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #16: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #17: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #18: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #19: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #20: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #21: _PyEval_EvalFrameDefault + 0x1ab0 (0x55e440456470 in ./work)
frame #22: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #23: _PyFunction_FastCallDict + 0x1bb (0x55e44042ce1b in ./work)
frame #24: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #25: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #26: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #27: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #28: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #29: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #30: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #31: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #32: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #33: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #34: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #35: _PyEval_EvalFrameDefault + 0x1ab0 (0x55e440456470 in ./work)
frame #36: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #37: _PyFunction_FastCallDict + 0x1bb (0x55e44042ce1b in ./work)
frame #38: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #39: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #40: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #41: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #42: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #43: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #44: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #45: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #46: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #47: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #48: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #49: _PyEval_EvalFrameDefault + 0x1ab0 (0x55e440456470 in ./work)
frame #50: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #51: _PyFunction_FastCallDict + 0x1bb (0x55e44042ce1b in ./work)
frame #52: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #53: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #54: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #55: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #56: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #57: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #58: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #59: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #60: _PyFunction_FastCallDict + 0x3db (0x55e44042d03b in ./work)
frame #61: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #62: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #63: PyObject_Call + 0x3e (0x55e4403a299e in ./work)

Exception ignored in: <bound method tqdm.__del__ of Overall Progress:   0%|                                                                    | 0/1 [00:01<?, ?it/s]>
Traceback (most recent call last):
  File "/root/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 889, in __del__
  File "/root/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 1095, in close
  File "/root/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 441, in _decr_instances
  File "/root/anaconda3/lib/python3.6/_weakrefset.py", line 109, in remove
KeyError: <weakref at 0x7f6e43cdc598; to 'tqdm' at 0x7f6e432fe400>

cuuupid commented 6 years ago

Is this possibly an issue with K80 GPUs? I also have a K80 and can't get past this issue.

laaRaa commented 6 years ago

Hi,

I couldn't solve this error either (I don't have a K80 GPU but a Quadro K6000). Instead, I installed an older version of this code (the one before the August 22 commit) and changed the three make.sh scripts as explained in issue #33, opened on January 25. In my case -arch=sm_30 worked (instead of -arch=sm_52). In the current code these make.sh files have been replaced by setup.py. I also had to use PyTorch 0.4.0 and Python 3.6.5.

Hope it helps!

yuanzhou15 commented 5 years ago

Hi, any updates on this issue? I'm also getting the same error on Google Colab and don't know how to get past it.

cuuupid commented 5 years ago

I haven't gotten any of the posted workarounds to work so far on K80 GPUs, and as far as I know there are no updates on this (but I would love to be proved wrong!).

I originally thought it referred to not being able to find one of the CUDA libraries, but no amount of fresh installs has fixed this. What is your environment?

yuanzhou15 commented 5 years ago

Python 3
CUDA (nvcc) 9.2.148
PyTorch 0.4.1 (torch-0.4.1)
The OS on Google Colab is Ubuntu.

@pshah123 Did you end up running it on another GPU?

ahmedbilal commented 5 years ago

@yuanzhou15 Did you get it to work on Google Colab? I am facing the same issue.

yuanzhou15 commented 5 years ago

@pshah123 No, I'm still facing the same issue.

lianuo commented 5 years ago

It's weird: I didn't have this problem before and ran the demo successfully (Python 3.6, Ubuntu 16.04, CUDA 9.2, PyTorch 0.4.1, GTX 1080). But after I changed CUDA from 9.2 to 9.0, the problem appeared. Hope this information helps.

lianuo commented 5 years ago

Hi, I think I have found the cause. I can run the code again now, with:

CUDA 9.2, Ubuntu 16.04, Python 3.6, PyTorch 0.4.1

I just upgraded my GPU driver from 384 to version 396.37.

Please try it. The problem may simply be that an old CUDA driver lacks functions that the code needs to call.
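One way to sanity-check the driver theory is to compare the installed driver against the minimum Linux driver each CUDA toolkit requires. The sketch below is illustrative only: the version table is recalled from NVIDIA's CUDA release notes and should be verified there, and `driver_is_sufficient` is a hypothetical helper, not part of flownet2-pytorch.

```python
# Minimum Linux driver per CUDA toolkit, as recalled from NVIDIA's CUDA
# release notes (an assumption to verify against the official table).
MIN_LINUX_DRIVER = {
    "9.0": "384.81",
    "9.1": "390.46",
    "9.2": "396.26",
    "10.0": "410.48",
}


def parse_version(text):
    """Turn '396.37' into (396, 37) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_sufficient(cuda_version, driver_version):
    """Hypothetical helper: True if the driver meets the toolkit's minimum."""
    minimum = MIN_LINUX_DRIVER[cuda_version]
    return parse_version(driver_version) >= parse_version(minimum)


if __name__ == "__main__":
    # The thread only gives the driver branch "384", not a point release.
    print(driver_is_sufficient("9.2", "384"))
    print(driver_is_sufficient("9.2", "396.37"))
    print(driver_is_sufficient("9.0", "390.48"))
```

The installed driver version is shown in the header of the `nvidia-smi` output.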

huangbiubiu commented 5 years ago

I'm working with:

CUDA 9.0
PyTorch 0.4.1
GPU driver 390.48

The demo runs fine on a GTX 1080 Ti, but it raises this error on a Tesla K40c (same computer, just switching GPUs with CUDA_VISIBLE_DEVICES). So maybe it's not caused by the driver?
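This observation fits how CUDA selects kernels: a binary built for `sm_XY` only runs on devices with the same major compute capability and an equal or higher minor version, regardless of the driver. A minimal sketch of that rule follows. `has_kernel_image` is a hypothetical helper, the `built_for` list is an assumed example rather than the repository's actual flags, and PTX fallback (`code=compute_XY`) is ignored. The compute capabilities used, 3.5 for the Tesla K40c and 6.1 for the GTX 1080 Ti, are the values NVIDIA publishes for those cards.

```python
def has_kernel_image(device_cc, built_for):
    """Return True if any compiled sm_XY binary can run on the device.

    A binary for sm_XY runs on a device with the same major compute
    capability and a minor version >= Y. PTX fallback is ignored here.
    """
    dev_major, dev_minor = device_cc
    return any(
        major == dev_major and minor <= dev_minor
        for major, minor in built_for
    )


# Assumed example: an extension compiled only for Maxwell and newer GPUs.
built_for = [(5, 2), (6, 0), (6, 1), (7, 0)]

print(has_kernel_image((6, 1), built_for))  # GTX 1080 Ti
print(has_kernel_image((3, 5), built_for))  # Tesla K40c
print(has_kernel_image((3, 5), [(3, 0)]))   # an sm_30 binary on a 3.5 device
```

On a machine with PyTorch installed, `torch.cuda.get_device_capability(0)` returns the `(major, minor)` tuple for the first visible GPU, which can be passed in as `device_cc`.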

huangbiubiu commented 5 years ago

Problem fixed. Adding '-gencode', 'arch=compute_xx,code=sm_xx' to nvcc_args should fix the problem. If it still occurs after adding the line, check the following:

* Make sure you modified all 3 `setup.py` files, one in each package: `channelnorm_package`, `correlation_package` and `resample2d_package`.
* Make sure you have removed all intermediate files, including `__pycache__/`, `dist/`, `*.egg-info` and `build/`. If these exist, Python will install from them without recompiling.
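The cleanup step can be scripted. The following is a minimal sketch, assuming the three packages live under a `networks/` directory as the tracebacks in this thread suggest; `clean_build_artifacts` is a hypothetical helper, not something shipped with the repository.

```python
import shutil
from pathlib import Path

# Package names come from the thread; their location under networks/ is
# inferred from the tracebacks and may differ in your checkout.
PACKAGES = ["channelnorm_package", "correlation_package", "resample2d_package"]
STALE_PATTERNS = ["build", "dist", "*.egg-info", "__pycache__"]


def clean_build_artifacts(networks_dir):
    """Delete stale build outputs in each package; return what was removed."""
    removed = []
    for package in PACKAGES:
        package_dir = Path(networks_dir) / package
        if not package_dir.is_dir():
            continue
        for pattern in STALE_PATTERNS:
            for path in sorted(package_dir.glob(pattern)):
                if path.is_dir():
                    shutil.rmtree(path)
                else:
                    path.unlink()
                removed.append(path)
    return removed


if __name__ == "__main__":
    for path in clean_build_artifacts("networks"):
        print("removed", path)
```

After cleaning, each package has to be rebuilt with its own `setup.py` so the new `-gencode` flags take effect.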

ken0406zero commented 5 years ago

Can you tell me how to modify 3 setup.py files? Thank you.

huangbiubiu commented 5 years ago

@ken0406zero Add '-gencode', 'arch=compute_xx,code=sm_xx' to nvcc_args, so that it looks like this:

nvcc_args = [
    '-gencode', 'arch=compute_30,code=sm_30',
    '-gencode', 'arch=compute_35,code=sm_35',
    '-gencode', 'arch=compute_37,code=sm_37',
    '-gencode', 'arch=compute_50,code=sm_50',
    '-gencode', 'arch=compute_52,code=sm_52',
    '-gencode', 'arch=compute_60,code=sm_60',
    '-gencode', 'arch=compute_61,code=sm_61',
    '-gencode', 'arch=compute_70,code=sm_70',
    '-gencode', 'arch=compute_xx,code=sm_xx'
]

The last entry, '-gencode', 'arch=compute_xx,code=sm_xx', is the one you add.

To determine xx, look up your GPU's compute capability at https://developer.nvidia.com/cuda-gpus and drop the dot (for example, 3.7 becomes 37).
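The flag pairs can also be generated rather than typed by hand. This is a small sketch under the assumption that you already know the compute capabilities you need; `gencode_args` is a hypothetical helper and does not exist in the repository's `setup.py` files.

```python
def gencode_args(capabilities):
    """Build nvcc '-gencode' flags from compute capabilities like '3.7'."""
    args = []
    for capability in capabilities:
        major, minor = capability.split(".")
        arch = f"{int(major)}{int(minor)}"
        args += ["-gencode", f"arch=compute_{arch},code=sm_{arch}"]
    return args


# Example: Tesla K80 (3.7) and GTX 1080 (6.1), both mentioned in this thread.
nvcc_args = gencode_args(["3.7", "6.1"])
print(nvcc_args)
```

Only list architectures that the installed CUDA toolkit supports, since nvcc rejects architectures it does not know.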

ken0406zero commented 5 years ago

Thank you so much @huangbiubiu

NazihaS commented 5 years ago

Please, I have the same problem. Where can I find setup.py and the 3 packages (channelnorm_package, correlation_package and resample2d_package)? I couldn't find nvcc_args. @huangbiubiu, can you explain more by giving all the steps? Thanks in advance.

NazihaS commented 5 years ago

Please, I have the same problem. Where can I find setup.py and the 3 packages (channelnorm_package, correlation_package and resample2d_package)? I couldn't find nvcc_args. @ken0406zero, can you explain more by giving all the steps? Thanks in advance.

Lanselott commented 5 years ago

Hi @NazihaS, did you solve this issue? This problem happened on my Titan V.

huangbiubiu commented 5 years ago

@NazihaS A simple search of the repository should find nvcc_args: https://github.com/NVIDIA/flownet2-pytorch/search?q=nvcc_args&unscoped_q=nvcc_args

JMarzz commented 5 years ago

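@huangbiubiu Hello! I can run inference with FlowNet2C without problems, but with FlowNet2 or FlowNet2CSS I get the "no kernel" error. I added the gencode entry as you described and cleaned up those files, but neither helped. Is there any other solution? My setup is a GeForce 960M (I don't think this is a GPU performance issue), Ubuntu 18, PyTorch 0.4.1, CUDA 9. Thanks!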

maximelianos commented 5 years ago

Thank you for the solution and link! I tried launching this on Google Colab and was surprised to find that the Tesla K80 there, even with CUDA 10.0, has compute capability 3.7, which is rather old. I had thought that higher CUDA versions corresponded to higher compute capability. I'm not a GPU architecture expert, after all :) After adding the code generation line, the error disappeared.
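Compute capability is a property of the GPU hardware rather than of the installed CUDA toolkit, which is why a K80 reports 3.7 under any CUDA version. As an illustration, the lookup below lists the GPUs mentioned in this thread with the compute capabilities NVIDIA publishes for them. The `GPU_COMPUTE_CAPABILITY` table and `sm_suffix` helper are hypothetical names for this sketch, and the values should be checked against https://developer.nvidia.com/cuda-gpus.

```python
# Compute capabilities for GPUs mentioned in this thread, as published by
# NVIDIA (verify against the official list before relying on them).
GPU_COMPUTE_CAPABILITY = {
    "Tesla K80": "3.7",
    "Tesla K40c": "3.5",
    "Quadro K6000": "3.5",
    "GeForce GTX 960M": "5.0",
    "GeForce GTX 1080": "6.1",
    "GeForce GTX 1080 Ti": "6.1",
    "TITAN V": "7.0",
}


def sm_suffix(gpu_name):
    """Return the 'xx' used in arch=compute_xx,code=sm_xx for a known GPU."""
    return GPU_COMPUTE_CAPABILITY[gpu_name].replace(".", "")


for name in GPU_COMPUTE_CAPABILITY:
    print(f"{name}: sm_{sm_suffix(name)}")
```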

dilipv09 commented 4 years ago

I am getting an analogous error message when running inference using main.py. My system is somewhat different:

* CUDA V9.0.176

* cuDNN V7.3.0

* NVIDIA K80 GPU

* PyTorch 0.4.1.post2

* Python 3.6.5

* Ubuntu 16.04.4 LTS

The install and execution is dockerized and I'm happy to share the image if helpful.

# python main.py --inference --model FlowNet2 --save_flow --inference_dataset MpiSintelClean \
> --inference_dataset_root /data/MPISintel/training \
> --resume /workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar
Parsing Arguments
  [0.029s] batch_size: 8
  [0.030s] crop_size: [256, 256]
  [0.030s] fp16: False
  [0.030s] fp16_scale: 1024.0
  [0.030s] gradient_clip: None
  [0.030s] inference: True
  [0.030s] inference_batch_size: 1
  [0.030s] inference_dataset: MpiSintelClean
  [0.030s] inference_dataset_replicates: 1
  [0.030s] inference_dataset_root: /data/MPISintel/training
  [0.030s] inference_n_batches: -1
  [0.030s] inference_size: [-1, -1]
  [0.030s] log_frequency: 1
  [0.030s] loss: L1Loss
  [0.030s] model: FlowNet2
  [0.030s] model_batchNorm: False
  [0.030s] model_div_flow: 20.0
  [0.030s] name: run
  [0.030s] no_cuda: False
  [0.030s] number_gpus: 1
  [0.030s] number_workers: 8
  [0.030s] optimizer: Adam
  [0.030s] optimizer_amsgrad: False
  [0.030s] optimizer_betas: (0.9, 0.999)
  [0.030s] optimizer_eps: 1e-08
  [0.030s] optimizer_lr: 0.001
  [0.030s] optimizer_weight_decay: 0
  [0.030s] render_validation: False
  [0.030s] resume: /workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar
  [0.030s] rgb_max: 255.0
  [0.030s] save: ./work
  [0.030s] save_flow: True
  [0.030s] schedule_lr_fraction: 10
  [0.030s] schedule_lr_frequency: 0
  [0.030s] seed: 1
  [0.030s] skip_training: False
  [0.030s] skip_validation: False
  [0.030s] start_epoch: 1
  [0.030s] total_epochs: 10000
  [0.030s] train_n_batches: -1
  [0.030s] training_dataset: MpiSintelFinal
  [0.030s] training_dataset_replicates: 1
  [0.030s] training_dataset_root: ./MPI-Sintel/flow/training
  [0.030s] validation_dataset: MpiSintelClean
  [0.030s] validation_dataset_replicates: 1
  [0.030s] validation_dataset_root: ./MPI-Sintel/flow/training
  [0.030s] validation_frequency: 5
  [0.030s] validation_n_batches: -1
  [0.032s] Operation finished

Source Code
  Current Git Hash: b'532613d4fa46e544ddc309a8aa9e6b65dc91af21'

Initializing Datasets
  [0.050s] Inference Dataset: MpiSintelClean
  [0.117s] Inference Input: [3, 2, 384, 1024]
  [0.353s] Inference Targets: [2, 384, 1024]
  [0.354s] Operation finished

Building FlowNet2 model
  [5.002s] Effective Batch Size: 8
  [5.004s] Number of parameters: 162518834
  [5.004s] Initializing CUDA
  [7.209s] Parallelizing
  [7.211s] Loading checkpoint '/workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar'
  [7.670s] Loaded checkpoint '/workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar' (at epoch 0)
  [7.670s] Initializing save directory: ./work
  [7.672s] Operation finished

Initializing Adam Optimizer
  [0.001s] amsgrad = False (<class 'bool'>)
  [0.001s] weight_decay = 0 (<class 'int'>)
  [0.001s] eps = 1e-08 (<class 'float'>)
  [0.001s] betas = (0.9, 0.999) (<class 'tuple'>)
  [0.001s] lr = 0.001 (<class 'float'>)
  [0.001s] Operation finished

Overall Progress:   0%|                                                       |              0/1 [00:00<?, ?it/s]
Inferencing :   0%| ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
error in correlation_forward_cuda_kernel: no kernel image is available for execution on the device
Traceback (most recent call last):
  File "main.py", line 403, in <module>
    stats = inference(args=args, epoch=epoch - 1, data_loader=inference_loader, model=model_and_loss, offset=offset)
  File "main.py", line 367, in inference
    losses, output = model(data[0], target[0], inference=True)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "main.py", line 170, in forward
    output = self.model(data)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/workspace/flownet2-pytorch/models.py", line 118, in forward
    flownetc_flow2 = self.flownetc(x)[0]
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/workspace/flownet2-pytorch/networks/FlowNetC.py", line 86, in forward
    out_corr = self.corr(out_conv3a, out_conv3b) # False
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/workspace/flownet2-pytorch/networks/correlation_package/correlation.py", line 59, in forward
    result = CorrelationFunction(self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)(input1, input2)
  File "/workspace/flownet2-pytorch/networks/correlation_package/correlation.py", line 27, in forward
    self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)
RuntimeError: CUDA call failed (correlation_forward_cuda at correlation_cuda.cc:79)
frame #0: <unknown function> + 0x140c8 (0x7f6e498160c8 in /root/anaconda3/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #1: <unknown function> + 0x1430e (0x7f6e4981630e in /root/anaconda3/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #2: <unknown function> + 0x107b1 (0x7f6e498127b1 in /root/anaconda3/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #3: _PyCFunction_FastCallDict + 0x154 (0x55e4403a2b94 in ./work)
frame #4: <unknown function> + 0x19e67c (0x55e44043267c in ./work)
frame #5: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #6: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #7: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #8: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #9: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #10: THPFunction_do_forward(THPFunction*, _object*) + 0x2ad (0x7f6e73b61fbd in /root/anaconda3/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #11: PyCFunction_Call + 0x5f (0x55e4403a598f in ./work)
frame #12: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #13: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #14: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #15: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #16: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #17: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #18: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #19: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #20: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #21: _PyEval_EvalFrameDefault + 0x1ab0 (0x55e440456470 in ./work)
frame #22: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #23: _PyFunction_FastCallDict + 0x1bb (0x55e44042ce1b in ./work)
frame #24: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #25: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #26: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #27: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #28: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #29: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #30: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #31: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #32: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #33: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #34: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #35: _PyEval_EvalFrameDefault + 0x1ab0 (0x55e440456470 in ./work)
frame #36: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #37: _PyFunction_FastCallDict + 0x1bb (0x55e44042ce1b in ./work)
frame #38: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #39: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #40: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #41: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #42: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #43: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #44: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #45: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #46: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #47: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #48: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #49: _PyEval_EvalFrameDefault + 0x1ab0 (0x55e440456470 in ./work)
frame #50: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #51: _PyFunction_FastCallDict + 0x1bb (0x55e44042ce1b in ./work)
frame #52: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #53: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #54: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #55: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #56: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #57: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #58: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #59: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #60: _PyFunction_FastCallDict + 0x3db (0x55e44042d03b in ./work)
frame #61: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #62: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #63: PyObject_Call + 0x3e (0x55e4403a299e in ./work)

Exception ignored in: <bound method tqdm.__del__ of Overall Progress:   0%|                                                                    | 0/1 [00:01<?, ?it/s]>
Traceback (most recent call last):
  File "/root/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 889, in __del__
  File "/root/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 1095, in close
  File "/root/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 441, in _decr_instances
  File "/root/anaconda3/lib/python3.6/_weakrefset.py", line 109, in remove
KeyError: <weakref at 0x7f6e43cdc598; to 'tqdm' at 0x7f6e432fe400>

malcolmgarner-movement commented 4 years ago

Hi @huangbiubiu, thanks for your help on this issue!

I've tried what you recommended earlier, but I haven't had any luck. I am running on a Tesla K80, and from my understanding that means the '-gencode', 'arch=compute_xx,code=sm_xx' line should look like '-gencode', 'arch=compute_37,code=sm_37' for me, correct?

Let me know if I've misunderstood. Thanks again.

huangbiubiu commented 4 years ago

Looks correct.
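
For anyone unsure which value to use for xx, the compute capability can also be read from PyTorch instead of looking it up by hand. The following is a minimal sketch, not part of the repository: capability_to_arch and detect_capability are hypothetical helper names, and it assumes PyTorch can see the GPU you plan to run on.

```python
def capability_to_arch(capability):
    """Turn a (major, minor) compute capability into the 'xx' digits, e.g. (3, 7) -> '37'."""
    major, minor = capability
    return f"{major}{minor}"


def detect_capability(device_index=0):
    """Return (major, minor) for a visible CUDA device, or None if it cannot be queried."""
    try:
        import torch  # imported lazily so capability_to_arch works without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(device_index)


if __name__ == "__main__":
    capability = detect_capability()
    if capability is None:
        print("No CUDA device visible from PyTorch; look the value up manually.")
    else:
        arch = capability_to_arch(capability)
        # Prints the pair in the same form used in the setup.py files.
        print(f"'-gencode', 'arch=compute_{arch},code=sm_{arch}'")
```

On a Tesla K80 this should report capability (3, 7), which matches the compute_37/sm_37 line above.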

limacv commented 4 years ago

I ran into this problem too. I'm using Python 3.7, torch 1.4.0 and an RTX 2060 on Windows. Just for reference, I solved it by removing every stream argument from the CUDA kernel launches, so that CUDA always uses the default stream. I think my problem may be related to recent PyTorch versions moving from at to c10.

cuuupid commented 4 years ago

@oblime out of curiosity does this fix also work on Torch 1.5? I'm still encountering the issue when working with the face model specifically

sssssyf commented 4 years ago

Problem fixed. I think just adding '-gencode', 'arch=compute_xx,code=sm_xx' to nvcc_args fixes the problem. If the problem still exists after adding the line, please check the following:

Make sure you modified all 3 setup.py files in the 3 packages: channelnorm_package, correlation_package and resample2d_package

Make sure you have removed all intermediate files, including __pycache__/, dist/, *.egg-info, build/. Python will reuse these files without recompiling if they already exist.

How can I find and delete files such as __pycache__/, dist/, *.egg-info and build/? I can't run python setup.py install again. I previously built all 3 packages successfully, but when I run python setup.py install again I get: error: command 'cl.exe' failed: No such file or directory

sssssyf commented 4 years ago

How can I find and delete files such as __pycache__/, dist/, *.egg-info and build/? I can't run python setup.py install again. I previously built all 3 packages successfully, but when I run python setup.py install again I get: error: command 'cl.exe' failed: No such file or directory @huangbiubiu

jih189 commented 4 years ago

@sssssyf They could be in the correlation_package directory, or wherever you ran python setup.py install. You can just use find -name to search for them.
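
Since find is not available by default on Windows, the same cleanup can be sketched in Python. This is a minimal sketch under assumptions: remove_build_artifacts is a hypothetical helper, and it deletes every directory named __pycache__, dist or build, or ending in .egg-info, below the folder you pass in. Point it only at a package folder such as correlation_package, not at a whole drive.

```python
import shutil
from pathlib import Path

# Directory names treated as stale build output (the list from this thread).
STALE_DIR_NAMES = ("__pycache__", "dist", "build")


def remove_build_artifacts(root):
    """Delete stale build directories under root and return the removed paths."""
    root = Path(root)
    targets = [
        path for path in root.rglob("*")
        if path.is_dir()
        and (path.name in STALE_DIR_NAMES or path.name.endswith(".egg-info"))
    ]
    removed = []
    for path in sorted(targets):  # parents sort before their children
        if path.exists():  # skip entries already deleted along with a parent
            shutil.rmtree(path)
            removed.append(path)
    return removed
```

Calling remove_build_artifacts("networks/correlation_package") before rebuilding should force a recompile. The tracebacks in this thread load correlation_cuda from an .egg under site-packages, so an already installed copy there may need to be removed separately.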

huangruofei commented 4 years ago

Does this mean I need to create three setup files? What should I do if there is no setup file?

Gauravv97 commented 4 years ago

I encountered the same issue. It's most likely due to incompatible torch/cuda version. I found a combination that works on Colab. Here is the link to the notebook. Hopefully it helps.

tyrink commented 3 years ago

It helps a lot, thanks so much!

hduxiao commented 2 years ago

@ken0406zero Add '-gencode', 'arch=compute_xx,code=sm_xx' to nvcc_args. The resulting nvcc_args should look like this:
nvcc_args = [
    '-gencode', 'arch=compute_30,code=sm_30',
    '-gencode', 'arch=compute_35,code=sm_35',
    '-gencode', 'arch=compute_37,code=sm_37',
    '-gencode', 'arch=compute_50,code=sm_50',
    '-gencode', 'arch=compute_52,code=sm_52',
    '-gencode', 'arch=compute_60,code=sm_60',
    '-gencode', 'arch=compute_61,code=sm_61',
    '-gencode', 'arch=compute_70,code=sm_70',
    '-gencode', 'arch=compute_xx,code=sm_xx'
]
'-gencode', 'arch=compute_xx,code=sm_xx' is what you added.

To determine what xx is, check https://developer.nvidia.com/cuda-gpus.

This solved my problem, thanks.
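
To avoid editing the list by hand in each of the three setup.py files, the extra pair can be appended programmatically. This is only a sketch: with_gencode is a hypothetical helper, the base list is shortened for illustration, and the arch string ("75" here, for an RTX 2060) is an assumption you would replace with your own value.

```python
# Shortened stand-in for the nvcc_args list in the setup.py files.
BASE_NVCC_ARGS = [
    "-gencode", "arch=compute_60,code=sm_60",
    "-gencode", "arch=compute_61,code=sm_61",
    "-gencode", "arch=compute_70,code=sm_70",
]


def with_gencode(nvcc_args, arch):
    """Return a copy of nvcc_args that contains a -gencode pair for arch (e.g. '75')."""
    spec = f"arch=compute_{arch},code=sm_{arch}"
    if spec in nvcc_args:
        return list(nvcc_args)  # already present, nothing to add
    return list(nvcc_args) + ["-gencode", spec]


nvcc_args = with_gencode(BASE_NVCC_ARGS, "75")
```

The helper leaves the list unchanged when the architecture is already covered, so it is safe to call it in all three setup.py files.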

NVIDIA / flownet2-pytorch

error in correlation_forward_cuda_kernel: no kernel image is available for execution on the device #86