NVIDIA / flownet2-pytorch

Pytorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks

error in correlation_forward_cuda_kernel: no kernel image is available for execution on the device #86

Open laaRaa opened 6 years ago

laaRaa commented 6 years ago

Hi,

I installed all the dependencies and followed the listed steps to install and run FlowNet2, but I get the following error when running "run_a_pair.py". Any ideas?

The parameters of the system are the following:

* CUDA: release 9.1, V9.1.85

* PyTorch: 0.4.1

* Ubuntu: 18.04 LTS

* Python: 3.6.5

Thanks!

python3 run_a_pair.py
error in correlation_forward_cuda_kernel: no kernel image is available for execution on the device
Traceback (most recent call last):
  File "run_a_pair.py", line 31, in <module>
    result = net(im).squeeze()
  File "/home/raad/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/raad/flownet2-pytorch_python3/models.py", line 118, in forward
    flownetc_flow2 = self.flownetc(x)[0]
  File "/home/raad/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/raad/flownet2-pytorch_python3/networks/FlowNetC.py", line 86, in forward
    out_corr = self.corr(out_conv3a, out_conv3b) # False
  File "/home/raad/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/raad/flownet2-pytorch_python3/networks/correlation_package/correlation.py", line 59, in forward
    result = CorrelationFunction(self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)(input1, input2)
  File "/home/raad/flownet2-pytorch_python3/networks/correlation_package/correlation.py", line 27, in forward
    self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)
RuntimeError: CUDA call failed (correlation_forward_cuda at correlation_cuda.cc:79)
frame #0: <unknown function> + 0x135a7 (0x7f64fe8845a7 in /home/raad/local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #1: <unknown function> + 0x102ef (0x7f64fe8812ef in /home/raad/local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #2: _PyCFunction_FastCallKeywords + 0x26b (0x4c549b in python3)
frame #3: python3() [0x54ffe4]
frame #4: _PyEval_EvalFrameDefault + 0x362f (0x5546cf in python3)
frame #5: python3() [0x54f0e8]
frame #6: _PyFunction_FastCallDict + 0x2a2 (0x558ef2 in python3)
frame #7: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #8: PyObject_Call + 0x3e (0x459eee in python3)
frame #9: THPFunction_do_forward(THPFunction*, _object*) + 0x2ad (0x7f652c26fc3d in /home/raad/.local/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #10: PyCFunction_Call + 0xbd (0x4c517d in python3)
frame #11: PyObject_Call + 0x3e (0x459eee in python3)
frame #12: python3() [0x4e0e9b]
frame #13: _PyObject_FastCallDict + 0xa3 (0x45a0e3 in python3)
frame #14: python3() [0x54fd37]
frame #15: _PyEval_EvalFrameDefault + 0x362f (0x5546cf in python3)
frame #16: python3() [0x54f0e8]
frame #17: _PyFunction_FastCallDict + 0x2a2 (0x558ef2 in python3)
frame #18: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #19: PyObject_Call + 0x3e (0x459eee in python3)
frame #20: _PyEval_EvalFrameDefault + 0x1ba9 (0x552c49 in python3)
frame #21: python3() [0x54fbe1]
frame #22: _PyFunction_FastCallDict + 0x1c9 (0x558e19 in python3)
frame #23: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #24: PyObject_Call + 0x3e (0x459eee in python3)
frame #25: python3() [0x4e0e9b]
frame #26: _PyObject_FastCallDict + 0xa3 (0x45a0e3 in python3)
frame #27: python3() [0x54fd37]
frame #28: _PyEval_EvalFrameDefault + 0x362f (0x5546cf in python3)
frame #29: python3() [0x54f0e8]
frame #30: _PyFunction_FastCallDict + 0x2a2 (0x558ef2 in python3)
frame #31: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #32: PyObject_Call + 0x3e (0x459eee in python3)
frame #33: _PyEval_EvalFrameDefault + 0x1ba9 (0x552c49 in python3)
frame #34: python3() [0x54fbe1]
frame #35: _PyFunction_FastCallDict + 0x1c9 (0x558e19 in python3)
frame #36: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #37: PyObject_Call + 0x3e (0x459eee in python3)
frame #38: python3() [0x4e0e9b]
frame #39: _PyObject_FastCallDict + 0xa3 (0x45a0e3 in python3)
frame #40: python3() [0x54fd37]
frame #41: _PyEval_EvalFrameDefault + 0x362f (0x5546cf in python3)
frame #42: python3() [0x54f0e8]
frame #43: _PyFunction_FastCallDict + 0x2a2 (0x558ef2 in python3)
frame #44: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #45: PyObject_Call + 0x3e (0x459eee in python3)
frame #46: _PyEval_EvalFrameDefault + 0x1ba9 (0x552c49 in python3)
frame #47: python3() [0x54fbe1]
frame #48: _PyFunction_FastCallDict + 0x1c9 (0x558e19 in python3)
frame #49: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #50: PyObject_Call + 0x3e (0x459eee in python3)
frame #51: python3() [0x4e0e9b]
frame #52: _PyObject_FastCallDict + 0xa3 (0x45a0e3 in python3)
frame #53: python3() [0x54fd37]
frame #54: _PyEval_EvalFrameDefault + 0x362f (0x5546cf in python3)
frame #55: python3() [0x54fbe1]
frame #56: PyEval_EvalCode + 0x23 (0x550b93 in python3)
frame #57: PyRun_FileExFlags + 0x169 (0x42b519 in python3)
frame #58: PyRun_SimpleFileExFlags + 0xe5 (0x42b705 in python3)
frame #59: Py_Main + 0xccb (0x441fcb in python3)
frame #60: main + 0x184 (0x421ff4 in python3)
frame #61: __libc_start_main + 0xe7 (0x7f65408fab97 in /lib/x86_64-linux-gnu/libc.so.6)
frame #62: _start + 0x2a (0x4220aa in python3)

cuuupid commented 6 years ago

Also getting this, using the Flownet2 models in vid2vid.

error in correlation_forward_cuda_kernel: no kernel image is available for execution on the device
Traceback (most recent call last):
  File "train.py", line 273, in <module>
    train()
  File "train.py", line 105, in train
    flow_ref, conf_ref = flowNet(real_B, real_B_prev)  # reference flows and confidences
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/vid2vid/models/flownet.py", line 33, in forward
    flow, conf = self.compute_flow_and_conf(input_A, input_B)
  File "/home/ubuntu/vid2vid/models/flownet.py", line 50, in compute_flow_and_conf
    flow1 = self.flowNet(data1)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/vid2vid/models/flownet2_pytorch/models.py", line 126, in forward
    flownetc_flow2 = self.flownetc(x)[0]
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/vid2vid/models/flownet2_pytorch/networks/FlowNetC.py", line 86, in forward
    out_corr = self.corr(out_conv3a, out_conv3b) # False
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/vid2vid/models/flownet2_pytorch/networks/correlation_package/correlation.py", line 59, in forward
    result = CorrelationFunction(self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)(input1, input2)
  File "/home/ubuntu/vid2vid/models/flownet2_pytorch/networks/correlation_package/correlation.py", line 27, in forward
    self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)
RuntimeError: CUDA call failed (correlation_forward_cuda at correlation_cuda.cc:79)
frame #0: <unknown function> + 0x140f8 (0x7f15885b40f8 in /home/ubuntu/.local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #1: <unknown function> + 0x1433e (0x7f15885b433e in /home/ubuntu/.local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #2: <unknown function> + 0x107e1 (0x7f15885b07e1 in /home/ubuntu/.local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
<omitting python frames>
frame #10: THPFunction_do_forward(THPFunction*, _object*) + 0x2ad (0x7f15c17a7f8d in /home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)

mawah commented 6 years ago

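I am getting an analogous error message when running inference using main.py. My system is somewhat different:

* CUDA V9.0.176

* cuDNN V7.3.0

* NVIDIA K80 GPU

* PyTorch 0.4.1.post2

* Python 3.6.5

* Ubuntu 16.04.4 LTS

The install and execution are dockerized, and I'm happy to share the image if helpful.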

# python main.py --inference --model FlowNet2 --save_flow --inference_dataset MpiSintelClean \
> --inference_dataset_root /data/MPISintel/training \
> --resume /workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar
Parsing Arguments
  [0.029s] batch_size: 8
  [0.030s] crop_size: [256, 256]
  [0.030s] fp16: False
  [0.030s] fp16_scale: 1024.0
  [0.030s] gradient_clip: None
  [0.030s] inference: True
  [0.030s] inference_batch_size: 1
  [0.030s] inference_dataset: MpiSintelClean
  [0.030s] inference_dataset_replicates: 1
  [0.030s] inference_dataset_root: /data/MPISintel/training
  [0.030s] inference_n_batches: -1
  [0.030s] inference_size: [-1, -1]
  [0.030s] log_frequency: 1
  [0.030s] loss: L1Loss
  [0.030s] model: FlowNet2
  [0.030s] model_batchNorm: False
  [0.030s] model_div_flow: 20.0
  [0.030s] name: run
  [0.030s] no_cuda: False
  [0.030s] number_gpus: 1
  [0.030s] number_workers: 8
  [0.030s] optimizer: Adam
  [0.030s] optimizer_amsgrad: False
  [0.030s] optimizer_betas: (0.9, 0.999)
  [0.030s] optimizer_eps: 1e-08
  [0.030s] optimizer_lr: 0.001
  [0.030s] optimizer_weight_decay: 0
  [0.030s] render_validation: False
  [0.030s] resume: /workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar
  [0.030s] rgb_max: 255.0
  [0.030s] save: ./work
  [0.030s] save_flow: True
  [0.030s] schedule_lr_fraction: 10
  [0.030s] schedule_lr_frequency: 0
  [0.030s] seed: 1
  [0.030s] skip_training: False
  [0.030s] skip_validation: False
  [0.030s] start_epoch: 1
  [0.030s] total_epochs: 10000
  [0.030s] train_n_batches: -1
  [0.030s] training_dataset: MpiSintelFinal
  [0.030s] training_dataset_replicates: 1
  [0.030s] training_dataset_root: ./MPI-Sintel/flow/training
  [0.030s] validation_dataset: MpiSintelClean
  [0.030s] validation_dataset_replicates: 1
  [0.030s] validation_dataset_root: ./MPI-Sintel/flow/training
  [0.030s] validation_frequency: 5
  [0.030s] validation_n_batches: -1
  [0.032s] Operation finished

Source Code
  Current Git Hash: b'532613d4fa46e544ddc309a8aa9e6b65dc91af21'

Initializing Datasets
  [0.050s] Inference Dataset: MpiSintelClean
  [0.117s] Inference Input: [3, 2, 384, 1024]
  [0.353s] Inference Targets: [2, 384, 1024]
  [0.354s] Operation finished

Building FlowNet2 model
  [5.002s] Effective Batch Size: 8
  [5.004s] Number of parameters: 162518834
  [5.004s] Initializing CUDA
  [7.209s] Parallelizing
  [7.211s] Loading checkpoint '/workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar'
  [7.670s] Loaded checkpoint '/workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar' (at epoch 0)
  [7.670s] Initializing save directory: ./work
  [7.672s] Operation finished

Initializing Adam Optimizer
  [0.001s] amsgrad = False (<class 'bool'>)
  [0.001s] weight_decay = 0 (<class 'int'>)
  [0.001s] eps = 1e-08 (<class 'float'>)
  [0.001s] betas = (0.9, 0.999) (<class 'tuple'>)
  [0.001s] lr = 0.001 (<class 'float'>)
  [0.001s] Operation finished

Overall Progress:   0%|                                                       |              0/1 [00:00<?, ?it/s]
Inferencing :   0%| ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
error in correlation_forward_cuda_kernel: no kernel image is available for execution on the device
Traceback (most recent call last):
  File "main.py", line 403, in <module>
    stats = inference(args=args, epoch=epoch - 1, data_loader=inference_loader, model=model_and_loss, offset=offset)
  File "main.py", line 367, in inference
    losses, output = model(data[0], target[0], inference=True)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "main.py", line 170, in forward
    output = self.model(data)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/workspace/flownet2-pytorch/models.py", line 118, in forward
    flownetc_flow2 = self.flownetc(x)[0]
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/workspace/flownet2-pytorch/networks/FlowNetC.py", line 86, in forward
    out_corr = self.corr(out_conv3a, out_conv3b) # False
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/workspace/flownet2-pytorch/networks/correlation_package/correlation.py", line 59, in forward
    result = CorrelationFunction(self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)(input1, input2)
  File "/workspace/flownet2-pytorch/networks/correlation_package/correlation.py", line 27, in forward
    self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)
RuntimeError: CUDA call failed (correlation_forward_cuda at correlation_cuda.cc:79)
frame #0: <unknown function> + 0x140c8 (0x7f6e498160c8 in /root/anaconda3/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #1: <unknown function> + 0x1430e (0x7f6e4981630e in /root/anaconda3/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #2: <unknown function> + 0x107b1 (0x7f6e498127b1 in /root/anaconda3/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #3: _PyCFunction_FastCallDict + 0x154 (0x55e4403a2b94 in ./work)
frame #4: <unknown function> + 0x19e67c (0x55e44043267c in ./work)
frame #5: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #6: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #7: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #8: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #9: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #10: THPFunction_do_forward(THPFunction*, _object*) + 0x2ad (0x7f6e73b61fbd in /root/anaconda3/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #11: PyCFunction_Call + 0x5f (0x55e4403a598f in ./work)
frame #12: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #13: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #14: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #15: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #16: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #17: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #18: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #19: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #20: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #21: _PyEval_EvalFrameDefault + 0x1ab0 (0x55e440456470 in ./work)
frame #22: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #23: _PyFunction_FastCallDict + 0x1bb (0x55e44042ce1b in ./work)
frame #24: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #25: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #26: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #27: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #28: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #29: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #30: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #31: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #32: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #33: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #34: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #35: _PyEval_EvalFrameDefault + 0x1ab0 (0x55e440456470 in ./work)
frame #36: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #37: _PyFunction_FastCallDict + 0x1bb (0x55e44042ce1b in ./work)
frame #38: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #39: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #40: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #41: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #42: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #43: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #44: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #45: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #46: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #47: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #48: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #49: _PyEval_EvalFrameDefault + 0x1ab0 (0x55e440456470 in ./work)
frame #50: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #51: _PyFunction_FastCallDict + 0x1bb (0x55e44042ce1b in ./work)
frame #52: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #53: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #54: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #55: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #56: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #57: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #58: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #59: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #60: _PyFunction_FastCallDict + 0x3db (0x55e44042d03b in ./work)
frame #61: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #62: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #63: PyObject_Call + 0x3e (0x55e4403a299e in ./work)

Exception ignored in: <bound method tqdm.__del__ of Overall Progress:   0%|                                                                    | 0/1 [00:01<?, ?it/s]>
Traceback (most recent call last):
  File "/root/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 889, in __del__
  File "/root/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 1095, in close
  File "/root/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 441, in _decr_instances
  File "/root/anaconda3/lib/python3.6/_weakrefset.py", line 109, in remove
KeyError: <weakref at 0x7f6e43cdc598; to 'tqdm' at 0x7f6e432fe400>

cuuupid commented 6 years ago

Is this possibly an issue with K80 GPUs? I also have a K80 and can't get past this issue

laaRaa commented 6 years ago

Hi,

I couldn't solve this error either (I don't have a K80 GPU but a Quadro K6000). Instead, I installed an older version of this code (the one before the commit of 22 August) and changed the three make.sh scripts as explained in issue #33, opened on 25 January. In my case -arch=sm_30 worked (instead of -arch=sm_52). In the current code these make.sh files are replaced by setup.py. I also had to use PyTorch 0.4.0 and Python 3.6.5.

Hope it helps!

yuanzhou15 commented 5 years ago

Hi, any updates on this issue? I'm also getting the same error on Google Colab and don't know how to get past it.

cuuupid commented 5 years ago

I haven't gotten any of the posted workarounds to work so far on K80 GPUs, and as far as I know there are no updates on this (but I would love to be proved wrong!).

I originally thought it referred to not being able to find one of the CUDA libraries, but no amount of fresh installs has fixed this. What is your environment?

yuanzhou15 commented 5 years ago

Python 3, CUDA (nvcc) 9.2.148, PyTorch 0.4.1. The OS on Google Colab is Ubuntu. @pshah123 did you end up running it on another GPU?

ahmedbilal commented 5 years ago

@yuanzhou15 Did you get it to work on Google Colab? I am facing the same issue.

yuanzhou15 commented 5 years ago

@pshah123 No, I'm still facing the same issue.

lianuo commented 5 years ago

It's weird: I did not have this problem before and ran the demo successfully (Python 3.6, Ubuntu 16.04, CUDA 9.2, PyTorch 0.4.1, GTX 1080), but after I changed CUDA from 9.2 to 9.0 the problem appeared. Hope this information helps.

lianuo commented 5 years ago

Hi, I think I have found the reason. I can run the code again now, with

CUDA 9.2, Ubuntu 16.04, Python 3.6, PyTorch 0.4.1

I just upgraded my GPU driver from 384 to Driver Version: 396.37.

Please try it. The problem may simply be caused by an old CUDA driver that lacks functions the code needs to call.

huangbiubiu commented 5 years ago

I'm working on

The demo runs fine on a GTX 1080 Ti, but it raises this error on a Tesla K40c (same computer, just switching the GPU with CUDA_VISIBLE_DEVICES). So I think maybe it's not caused by the driver?
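
A possible reading of this observation, not confirmed in the thread: a compiled kernel (cubin) built for sm_XY only runs on GPUs with the same major compute capability and an equal or higher minor one. A build with no 3.x target could therefore work on a GTX 1080 Ti (6.1) and still fail on a Tesla K40c (3.5), whatever the driver. The sketch below checks that rule in Python. The function name `cubin_runs_on` and the build list are hypothetical, and PTX just-in-time compilation is deliberately ignored.

```python
def cubin_runs_on(device_cap, built_archs):
    """Return True if any prebuilt sm_XY binary can execute on the device.

    device_cap:  (major, minor) tuple, e.g. (3, 5) for a Tesla K40c.
    built_archs: iterable of ints such as 52 for sm_52.
    Assumption: only cubins are shipped, no PTX for JIT compilation.
    """
    dev_major, dev_minor = device_cap
    for arch in built_archs:
        major, minor = divmod(arch, 10)
        # Binary compatibility holds only within one major revision,
        # and only on devices whose minor revision is equal or higher.
        if major == dev_major and minor <= dev_minor:
            return True
    return False


# Hypothetical build list that contains no 3.x target.
built = [50, 52, 60, 61, 70]
print("GTX 1080 Ti (6.1):", cubin_runs_on((6, 1), built))
print("Tesla K40c (3.5): ", cubin_runs_on((3, 5), built))
```

Under these assumptions the first check passes and the second fails, which would match a "no kernel image is available" error that appears on only one of the two cards.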

huangbiubiu commented 5 years ago

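Problem fixed. I think adding '-gencode', 'arch=compute_xx,code=sm_xx' to nvcc_args fixes the problem. If the problem persists after adding the line, please check the following:

* Make sure you modified all 3 `setup.py` files in the 3 packages: `channelnorm_package`, `correlation_package` and `resample2d_package`

* Make sure you have removed all intermediate files, including `__pycache__/`, `dist/`, `*.egg-info`, `build/`. If they already exist, Python will install from these files without recompiling.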

ken0406zero commented 5 years ago

Problem fixed. I think adding '-gencode', 'arch=compute_xx,code=sm_xx' to nvcc_args fixes the problem. If the problem persists after adding the line, please check the following:

* Make sure you modified all 3 `setup.py` files in the 3 packages: `channelnorm_package`, `correlation_package` and `resample2d_package`

* Make sure you have removed all intermediate files, including `__pycache__/`, `dist/`, `*.egg-info`, `build/`. If they already exist, Python will install from these files without recompiling.

Can you tell me how to modify the 3 setup.py files? Thank you.
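
The cleanup step quoted above can be scripted so that no stale artifact is missed before rebuilding. This is a minimal sketch, assuming the three packages sit under a `networks/` directory (the tracebacks in this thread confirm that only for `correlation_package`). The helper names `clean_package` and `clean_all` are hypothetical.

```python
import shutil
from pathlib import Path

PACKAGES = ["channelnorm_package", "correlation_package", "resample2d_package"]
STALE_PATTERNS = ["__pycache__", "dist", "build", "*.egg-info"]


def clean_package(package_dir):
    """Delete stale build artifacts directly inside package_dir.

    Returns the list of removed paths; a missing directory is skipped.
    """
    removed = []
    root = Path(package_dir)
    if not root.is_dir():
        return removed
    for pattern in STALE_PATTERNS:
        for path in list(root.glob(pattern)):
            if path.is_dir():
                shutil.rmtree(path)
            else:
                path.unlink()
            removed.append(path)
    return removed


def clean_all(networks_dir="networks"):
    """Clean every package directory under networks_dir (assumed layout)."""
    removed = []
    for name in PACKAGES:
        removed += clean_package(Path(networks_dir) / name)
    return removed


if __name__ == "__main__":
    for path in clean_all():
        print("removed", path)
```

Run it from the repository root, then reinstall each package. It does not touch an already installed `.egg` under `site-packages`, which may also need to be removed by hand.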

huangbiubiu commented 5 years ago

@ken0406zero Add '-gencode', 'arch=compute_xx,code=sm_xx' to nvcc_args, so that nvcc_args looks like this:

nvcc_args = [
    '-gencode', 'arch=compute_30,code=sm_30',
    '-gencode', 'arch=compute_35,code=sm_35',
    '-gencode', 'arch=compute_37,code=sm_37',
    '-gencode', 'arch=compute_50,code=sm_50',
    '-gencode', 'arch=compute_52,code=sm_52',
    '-gencode', 'arch=compute_60,code=sm_60',
    '-gencode', 'arch=compute_61,code=sm_61',
    '-gencode', 'arch=compute_70,code=sm_70',
    '-gencode', 'arch=compute_xx,code=sm_xx'
]

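The last entry, '-gencode', 'arch=compute_xx,code=sm_xx', is the line you add.

To determine what xx is, look up your GPU's compute capability at https://developer.nvidia.com/cuda-gpus and drop the dot (for example, 3.7 becomes 37).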
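
To avoid typing the pairs by hand, the same list can be generated from compute capability strings. This is only a sketch: `gencode_args` is a hypothetical helper, not part of the repository's setup.py, and the capabilities passed in are examples.

```python
def gencode_args(capabilities):
    """Build nvcc '-gencode' arguments from compute capability strings.

    capabilities: strings such as "3.7" or "6.1", as listed on the
    NVIDIA CUDA GPUs page.
    """
    args = []
    for cap in capabilities:
        major, minor = cap.split(".")
        xx = major + minor  # "3.7" -> "37"
        args += ["-gencode", "arch=compute_{0},code=sm_{0}".format(xx)]
    return args


# Example: keep two existing targets and add a 3.7 target.
nvcc_args = gencode_args(["5.2", "6.1", "3.7"])
print(nvcc_args)
```

Each capability contributes two list elements, the flag and its value, which is the shape nvcc_args has in the listing above.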

ken0406zero commented 5 years ago

Thank you so much @huangbiubiu

NazihaS commented 5 years ago

Please, I have the same problem. Where can I find setup.py and the 3 packages (channelnorm_package, correlation_package and resample2d_package)? I couldn't find nvcc_args. @huangbiubiu, can you explain more by giving all the steps, please? Thanks in advance.

NazihaS commented 5 years ago

Please, I have the same problem. Where can I find setup.py and the 3 packages (channelnorm_package, correlation_package and resample2d_package)? I couldn't find nvcc_args. @ken0406zero, can you explain more by giving all the steps, please? Thanks in advance.

Lanselott commented 5 years ago

Hi @NazihaS, did you solve this issue? The same problem happened on my Titan V.

huangbiubiu commented 5 years ago

@NazihaS I think a simple search will find nvcc_args: https://github.com/NVIDIA/flownet2-pytorch/search?q=nvcc_args&unscoped_q=nvcc_args

JMarzz commented 5 years ago

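@huangbiubiu Hello! A question: inference with FlowNet2C runs fine for me, but with FlowNet2 or FlowNet2CSS I get the "no kernel" error. I added the gencode line as you described and cleaned up those files, but neither helped. Is there any other solution? My setup is a GeForce 960M (I don't think it's a GPU performance issue), Ubuntu 18, PyTorch 0.4.1, CUDA 9. Thanks!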

maximelianos commented 5 years ago

Thank you for the solution and link! I tried launching this on Google Colab and was surprised to find that the Tesla K80, even with CUDA 10.0, has compute capability 3.7, which is rather old. I thought that higher CUDA versions corresponded to higher compute capability. I'm not a GPU architecture expert after all :) After adding the code generation line, the error disappeared.
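
The point above, that the CUDA toolkit version and the GPU's compute capability are independent, can be checked from Python. This sketch assumes PyTorch is installed and returns nothing useful otherwise. `describe_gpu` is a hypothetical helper name, and the toolkit version it reports is the one PyTorch was built with, which may differ from the nvcc used to compile the extensions.

```python
def describe_gpu():
    """Return (toolkit_version, (major, minor)) for GPU 0.

    Returns None when PyTorch is missing or no CUDA device is visible.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.version.cuda, torch.cuda.get_device_capability(0)


info = describe_gpu()
if info is None:
    print("PyTorch with a visible CUDA device is required for this check")
else:
    toolkit, (major, minor) = info
    print("CUDA toolkit PyTorch was built with:", toolkit)
    print("gencode target: arch=compute_{0}{1},code=sm_{0}{1}".format(major, minor))
```

On a machine like the one described, a recent toolkit version would appear next to a 3.7 capability, and the second printed line gives the value to add to nvcc_args.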

dilipv09 commented 4 years ago

I am getting an analogous error message when running inference using main.py. My system is somewhat different:

* CUDA V9.0.176

* cuDNN V7.3.0

* NVIDIA K80 GPU

* PyTorch 0.4.1.post2

* Python 3.6.5

* Ubuntu 16.04.4 LTS

The install and execution is dockerized and I'm happy to share the image if helpful.

# python main.py --inference --model FlowNet2 --save_flow --inference_dataset MpiSintelClean \
> --inference_dataset_root /data/MPISintel/training \
> --resume /workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar
Parsing Arguments
  [0.029s] batch_size: 8
  [0.030s] crop_size: [256, 256]
  [0.030s] fp16: False
  [0.030s] fp16_scale: 1024.0
  [0.030s] gradient_clip: None
  [0.030s] inference: True
  [0.030s] inference_batch_size: 1
  [0.030s] inference_dataset: MpiSintelClean
  [0.030s] inference_dataset_replicates: 1
  [0.030s] inference_dataset_root: /data/MPISintel/training
  [0.030s] inference_n_batches: -1
  [0.030s] inference_size: [-1, -1]
  [0.030s] log_frequency: 1
  [0.030s] loss: L1Loss
  [0.030s] model: FlowNet2
  [0.030s] model_batchNorm: False
  [0.030s] model_div_flow: 20.0
  [0.030s] name: run
  [0.030s] no_cuda: False
  [0.030s] number_gpus: 1
  [0.030s] number_workers: 8
  [0.030s] optimizer: Adam
  [0.030s] optimizer_amsgrad: False
  [0.030s] optimizer_betas: (0.9, 0.999)
  [0.030s] optimizer_eps: 1e-08
  [0.030s] optimizer_lr: 0.001
  [0.030s] optimizer_weight_decay: 0
  [0.030s] render_validation: False
  [0.030s] resume: /workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar
  [0.030s] rgb_max: 255.0
  [0.030s] save: ./work
  [0.030s] save_flow: True
  [0.030s] schedule_lr_fraction: 10
  [0.030s] schedule_lr_frequency: 0
  [0.030s] seed: 1
  [0.030s] skip_training: False
  [0.030s] skip_validation: False
  [0.030s] start_epoch: 1
  [0.030s] total_epochs: 10000
  [0.030s] train_n_batches: -1
  [0.030s] training_dataset: MpiSintelFinal
  [0.030s] training_dataset_replicates: 1
  [0.030s] training_dataset_root: ./MPI-Sintel/flow/training
  [0.030s] validation_dataset: MpiSintelClean
  [0.030s] validation_dataset_replicates: 1
  [0.030s] validation_dataset_root: ./MPI-Sintel/flow/training
  [0.030s] validation_frequency: 5
  [0.030s] validation_n_batches: -1
  [0.032s] Operation finished

Source Code
  Current Git Hash: b'532613d4fa46e544ddc309a8aa9e6b65dc91af21'

Initializing Datasets
  [0.050s] Inference Dataset: MpiSintelClean
  [0.117s] Inference Input: [3, 2, 384, 1024]
  [0.353s] Inference Targets: [2, 384, 1024]
  [0.354s] Operation finished

Building FlowNet2 model
  [5.002s] Effective Batch Size: 8
  [5.004s] Number of parameters: 162518834
  [5.004s] Initializing CUDA
  [7.209s] Parallelizing
  [7.211s] Loading checkpoint '/workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar'
  [7.670s] Loaded checkpoint '/workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar' (at epoch 0)
  [7.670s] Initializing save directory: ./work
  [7.672s] Operation finished

Initializing Adam Optimizer
  [0.001s] amsgrad = False (<class 'bool'>)
  [0.001s] weight_decay = 0 (<class 'int'>)
  [0.001s] eps = 1e-08 (<class 'float'>)
  [0.001s] betas = (0.9, 0.999) (<class 'tuple'>)
  [0.001s] lr = 0.001 (<class 'float'>)
  [0.001s] Operation finished

Overall Progress:   0%|                                                       |              0/1 [00:00<?, ?it/s]
Inferencing :   0%| ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
error in correlation_forward_cuda_kernel: no kernel image is available for execution on the device
Traceback (most recent call last):
  File "main.py", line 403, in <module>
    stats = inference(args=args, epoch=epoch - 1, data_loader=inference_loader, model=model_and_loss, offset=offset)
  File "main.py", line 367, in inference
    losses, output = model(data[0], target[0], inference=True)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "main.py", line 170, in forward
    output = self.model(data)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/workspace/flownet2-pytorch/models.py", line 118, in forward
    flownetc_flow2 = self.flownetc(x)[0]
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/workspace/flownet2-pytorch/networks/FlowNetC.py", line 86, in forward
    out_corr = self.corr(out_conv3a, out_conv3b) # False
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/workspace/flownet2-pytorch/networks/correlation_package/correlation.py", line 59, in forward
    result = CorrelationFunction(self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)(input1, input2)
  File "/workspace/flownet2-pytorch/networks/correlation_package/correlation.py", line 27, in forward
    self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)
RuntimeError: CUDA call failed (correlation_forward_cuda at correlation_cuda.cc:79)
frame #0: <unknown function> + 0x140c8 (0x7f6e498160c8 in /root/anaconda3/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #1: <unknown function> + 0x1430e (0x7f6e4981630e in /root/anaconda3/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #2: <unknown function> + 0x107b1 (0x7f6e498127b1 in /root/anaconda3/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #3: _PyCFunction_FastCallDict + 0x154 (0x55e4403a2b94 in ./work)
frame #4: <unknown function> + 0x19e67c (0x55e44043267c in ./work)
frame #5: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #6: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #7: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #8: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #9: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #10: THPFunction_do_forward(THPFunction*, _object*) + 0x2ad (0x7f6e73b61fbd in /root/anaconda3/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #11: PyCFunction_Call + 0x5f (0x55e4403a598f in ./work)
frame #12: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #13: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #14: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #15: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #16: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #17: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #18: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #19: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #20: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #21: _PyEval_EvalFrameDefault + 0x1ab0 (0x55e440456470 in ./work)
frame #22: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #23: _PyFunction_FastCallDict + 0x1bb (0x55e44042ce1b in ./work)
frame #24: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #25: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #26: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #27: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #28: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #29: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #30: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #31: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #32: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #33: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #34: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #35: _PyEval_EvalFrameDefault + 0x1ab0 (0x55e440456470 in ./work)
frame #36: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #37: _PyFunction_FastCallDict + 0x1bb (0x55e44042ce1b in ./work)
frame #38: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #39: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #40: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #41: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #42: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #43: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #44: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #45: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #46: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #47: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #48: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #49: _PyEval_EvalFrameDefault + 0x1ab0 (0x55e440456470 in ./work)
frame #50: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #51: _PyFunction_FastCallDict + 0x1bb (0x55e44042ce1b in ./work)
frame #52: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #53: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #54: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #55: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #56: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #57: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #58: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #59: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #60: _PyFunction_FastCallDict + 0x3db (0x55e44042d03b in ./work)
frame #61: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #62: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #63: PyObject_Call + 0x3e (0x55e4403a299e in ./work)

Exception ignored in: <bound method tqdm.__del__ of Overall Progress:   0%|                                                                    | 0/1 [00:01<?, ?it/s]>
Traceback (most recent call last):
  File "/root/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 889, in __del__
  File "/root/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 1095, in close
  File "/root/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 441, in _decr_instances
  File "/root/anaconda3/lib/python3.6/_weakrefset.py", line 109, in remove
KeyError: <weakref at 0x7f6e43cdc598; to 'tqdm' at 0x7f6e432fe400>
malcolmgarner-movement commented 4 years ago

Hi @huangbiubiu, thanks for your help on this issue!

I've tried what you recommended earlier, but I haven't had any luck. I am running on a Tesla K80, and from my understanding, that means the '-gencode', 'arch=compute_xx,code=sm_xx' line for me should look like '-gencode', 'arch=compute_37,code=sm_37', correct?

Let me know if I've misunderstood. Thanks again.

huangbiubiu commented 4 years ago

Looks correct.
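
For anyone who would rather derive the `xx` value than look it up, here is a minimal sketch. `gencode_flag` and `local_gencode_flag` are hypothetical helper names, not part of this repo; `torch.cuda.get_device_capability` is a standard PyTorch call. A Tesla K80 has compute capability 3.7, which gives the `compute_37` / `sm_37` pair discussed above.

```python
def gencode_flag(major, minor):
    """Format the nvcc -gencode pair for one compute capability."""
    arch = f"{major}{minor}"
    return ["-gencode", f"arch=compute_{arch},code=sm_{arch}"]


def local_gencode_flag(device=0):
    """Return the flag for the local GPU, or None if torch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability(device)
    return gencode_flag(major, minor)


if __name__ == "__main__":
    print(gencode_flag(3, 7))    # Tesla K80 is compute capability 3.7
    print(local_gencode_flag())  # flag for the GPU in this machine, if any
```

The returned two-element list can be appended to `nvcc_args` in each package's setup.py before rebuilding.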

limacv commented 4 years ago

I ran into this problem too, using Python 3.7, torch 1.4.0, and an RTX 2060 on Windows. Just for reference, I solved it by removing the stream argument from every CUDA kernel launch, so that the kernels always run on the default stream. I think my problem may be because recent PyTorch versions are moving things from at to c10.

cuuupid commented 4 years ago

@oblime out of curiosity does this fix also work on Torch 1.5? I'm still encountering the issue when working with the face model specifically

sssssyf commented 4 years ago

Problem fixed. I think just adding '-gencode', 'arch=compute_xx,code=sm_xx' to nvcc_args fixes the problem. If the problem still exists after adding the line, please check the following:

  • Make sure you modified all 3 setup.py files in the 3 packages: channelnorm_package, correlation_package and resample2d_package
  • Make sure you have removed all intermediate files, including __pycache__/, dist/, *.egg-info, build/. If these files exist, Python will install from them without recompiling.

How can I find and delete files such as __pycache__/, dist/, *.egg-info, and build/? I can't run python setup.py install again. I installed the 3 packages successfully before, but when I run python setup.py install again I get: error: command 'cl.exe' failed: No such file or directory @huangbiubiu

jih189 commented 4 years ago

@sssssyf They could be in correlation_package or wherever you ran python setup.py install. You can just use find -name to search for them.
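
On Windows, where `find` may not be available, the same cleanup can be sketched in Python. This is only an illustration under assumptions: `remove_build_artifacts` is a hypothetical helper, the three package directories are assumed to sit in the current working directory, and any directory named `build` or `dist` below them is treated as stale. It does not touch an `.egg` that was already installed into site-packages.

```python
import shutil
from pathlib import Path

# Names and glob patterns mentioned in this thread as stale build output.
STALE_PATTERNS = ("__pycache__", "dist", "build", "*.egg-info")


def remove_build_artifacts(root):
    """Delete stale build output below root and return the removed paths."""
    removed = []
    for pattern in STALE_PATTERNS:
        # Collect the matches first so that deleting does not disturb the walk.
        for path in sorted(Path(root).rglob(pattern)):
            if not path.exists():
                continue  # already gone together with a parent directory
            if path.is_dir():
                shutil.rmtree(path)
            else:
                path.unlink()
            removed.append(str(path))
    return removed


if __name__ == "__main__":
    # Assumed layout: run from the directory that contains the three packages.
    for package in ("channelnorm_package", "correlation_package", "resample2d_package"):
        if Path(package).is_dir():
            for path in remove_build_artifacts(package):
                print("removed", path)
```

After this, `python setup.py install` has to compile from scratch, so the `cl.exe` error still needs a working compiler on the PATH.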

huangruofei commented 4 years ago

Do I need to create three setup files? What should I do if there is no setup file?

Gauravv97 commented 4 years ago

I encountered the same issue. It's most likely due to incompatible torch/cuda version. I found a combination that works on Colab. Here is the link to the notebook. Hopefully it helps.
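
To check for such a mismatch locally, a small diagnostic along these lines may help. It is a sketch, not something taken from the notebook: `describe_cuda_env` is a hypothetical name, and it only reads standard PyTorch attributes (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available`, `torch.cuda.get_device_capability`).

```python
def describe_cuda_env():
    """Collect the version facts relevant to a torch/CUDA mismatch."""
    info = {"torch": None, "torch_cuda": None, "cuda_available": False, "capability": None}
    try:
        import torch
    except ImportError:
        return info  # torch is not installed, nothing more to report
    info["torch"] = torch.__version__
    # CUDA version this torch build was compiled against (None for CPU-only builds).
    info["torch_cuda"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        info["capability"] = torch.cuda.get_device_capability(0)
    return info


if __name__ == "__main__":
    for key, value in describe_cuda_env().items():
        print(f"{key}: {value}")
```

Comparing `torch_cuda` with the toolkit version printed by `nvcc --version` shows whether the extensions are being compiled with a different CUDA release than the one torch was built for.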

tyrink commented 3 years ago

It helps a lot, thanks so much!

hduxiao commented 2 years ago

@ken0406zero Add '-gencode', 'arch=compute_xx,code=sm_xx' to nvcc_args. nvcc_args like this:

nvcc_args = [
    '-gencode', 'arch=compute_30,code=sm_30',
    '-gencode', 'arch=compute_35,code=sm_35',
    '-gencode', 'arch=compute_37,code=sm_37',
    '-gencode', 'arch=compute_50,code=sm_50',
    '-gencode', 'arch=compute_52,code=sm_52',
    '-gencode', 'arch=compute_60,code=sm_60',
    '-gencode', 'arch=compute_61,code=sm_61',
    '-gencode', 'arch=compute_70,code=sm_70',
    '-gencode', 'arch=compute_xx,code=sm_xx'
]

'-gencode', 'arch=compute_xx,code=sm_xx' is what you added.

To determine what xx is, check https://developer.nvidia.com/cuda-gpus.

This solved my problem, thanks.
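
As a rough way to see why the "no kernel image" error appears, the sketch below checks whether a list like the one above contains a binary that a given GPU can run. The helper names are hypothetical. It assumes the CUDA binary-compatibility rule that code built for sm_Xy runs on devices of capability X.z with z >= y, and it ignores PTX entries (`code=compute_XX`), which the list above does not contain.

```python
import re


def compiled_archs(nvcc_args):
    """Extract (major, minor) pairs from every code=sm_XY entry."""
    archs = set()
    for arg in nvcc_args:
        match = re.search(r"code=sm_(\d{2,})", arg)
        if match:  # placeholders such as sm_xx are skipped
            digits = match.group(1)
            archs.add((int(digits[:-1]), int(digits[-1])))
    return archs


def has_kernel_image(nvcc_args, capability):
    """True if some compiled binary shares the major version and has minor <= the device's."""
    major, minor = capability
    return any(m == major and n <= minor for m, n in compiled_archs(nvcc_args))


if __name__ == "__main__":
    nvcc_args = [
        "-gencode", "arch=compute_60,code=sm_60",
        "-gencode", "arch=compute_61,code=sm_61",
        "-gencode", "arch=compute_70,code=sm_70",
    ]
    print(has_kernel_image(nvcc_args, (7, 5)))
    print(has_kernel_image(nvcc_args, (8, 6)))
```

If the check returns False for your GPU's capability, the matching `-gencode` line is missing and the packages need to be rebuilt with it.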