Open laaRaa opened 6 years ago
Also getting this, using the Flownet2 models in vid2vid.
error in correlation_forward_cuda_kernel: no kernel image is available for execution on the device
Traceback (most recent call last):
File "train.py", line 273, in <module>
train()
File "train.py", line 105, in train
flow_ref, conf_ref = flowNet(real_B, real_B_prev) # reference flows and confidences
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/vid2vid/models/flownet.py", line 33, in forward
flow, conf = self.compute_flow_and_conf(input_A, input_B)
File "/home/ubuntu/vid2vid/models/flownet.py", line 50, in compute_flow_and_conf
flow1 = self.flowNet(data1)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/vid2vid/models/flownet2_pytorch/models.py", line 126, in forward
flownetc_flow2 = self.flownetc(x)[0]
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/vid2vid/models/flownet2_pytorch/networks/FlowNetC.py", line 86, in forward
out_corr = self.corr(out_conv3a, out_conv3b) # False
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/vid2vid/models/flownet2_pytorch/networks/correlation_package/correlation.py", line 59, in forward
result = CorrelationFunction(self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)(input1, input2)
File "/home/ubuntu/vid2vid/models/flownet2_pytorch/networks/correlation_package/correlation.py", line 27, in forward
self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)
RuntimeError: CUDA call failed (correlation_forward_cuda at correlation_cuda.cc:79)
frame #0: <unknown function> + 0x140f8 (0x7f15885b40f8 in /home/ubuntu/.local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #1: <unknown function> + 0x1433e (0x7f15885b433e in /home/ubuntu/.local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #2: <unknown function> + 0x107e1 (0x7f15885b07e1 in /home/ubuntu/.local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
<omitting python frames>
frame #10: THPFunction_do_forward(THPFunction*, _object*) + 0x2ad (0x7f15c17a7f8d in /home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
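I am getting an analogous error message when running inference using `main.py`. My system is somewhat different:
* CUDA V9.0.176
* cuDNN V7.3.0
* NVIDIA K80 GPU
* PyTorch 0.4.1.post2
* Python 3.6.5
* Ubuntu 16.04.4 LTS

The install and execution is dockerized and I'm happy to share the image if helpful.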
# python main.py --inference --model FlowNet2 --save_flow --inference_dataset MpiSintelClean \
> --inference_dataset_root /data/MPISintel/training \
> --resume /workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar
Parsing Arguments
[0.029s] batch_size: 8
[0.030s] crop_size: [256, 256]
[0.030s] fp16: False
[0.030s] fp16_scale: 1024.0
[0.030s] gradient_clip: None
[0.030s] inference: True
[0.030s] inference_batch_size: 1
[0.030s] inference_dataset: MpiSintelClean
[0.030s] inference_dataset_replicates: 1
[0.030s] inference_dataset_root: /data/MPISintel/training
[0.030s] inference_n_batches: -1
[0.030s] inference_size: [-1, -1]
[0.030s] log_frequency: 1
[0.030s] loss: L1Loss
[0.030s] model: FlowNet2
[0.030s] model_batchNorm: False
[0.030s] model_div_flow: 20.0
[0.030s] name: run
[0.030s] no_cuda: False
[0.030s] number_gpus: 1
[0.030s] number_workers: 8
[0.030s] optimizer: Adam
[0.030s] optimizer_amsgrad: False
[0.030s] optimizer_betas: (0.9, 0.999)
[0.030s] optimizer_eps: 1e-08
[0.030s] optimizer_lr: 0.001
[0.030s] optimizer_weight_decay: 0
[0.030s] render_validation: False
[0.030s] resume: /workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar
[0.030s] rgb_max: 255.0
[0.030s] save: ./work
[0.030s] save_flow: True
[0.030s] schedule_lr_fraction: 10
[0.030s] schedule_lr_frequency: 0
[0.030s] seed: 1
[0.030s] skip_training: False
[0.030s] skip_validation: False
[0.030s] start_epoch: 1
[0.030s] total_epochs: 10000
[0.030s] train_n_batches: -1
[0.030s] training_dataset: MpiSintelFinal
[0.030s] training_dataset_replicates: 1
[0.030s] training_dataset_root: ./MPI-Sintel/flow/training
[0.030s] validation_dataset: MpiSintelClean
[0.030s] validation_dataset_replicates: 1
[0.030s] validation_dataset_root: ./MPI-Sintel/flow/training
[0.030s] validation_frequency: 5
[0.030s] validation_n_batches: -1
[0.032s] Operation finished
Source Code
Current Git Hash: b'532613d4fa46e544ddc309a8aa9e6b65dc91af21'
Initializing Datasets
[0.050s] Inference Dataset: MpiSintelClean
[0.117s] Inference Input: [3, 2, 384, 1024]
[0.353s] Inference Targets: [2, 384, 1024]
[0.354s] Operation finished
Building FlowNet2 model
[5.002s] Effective Batch Size: 8
[5.004s] Number of parameters: 162518834
[5.004s] Initializing CUDA
[7.209s] Parallelizing
[7.211s] Loading checkpoint '/workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar'
[7.670s] Loaded checkpoint '/workspace/flownet2-pytorch/FlowNet2_checkpoint.pth.tar' (at epoch 0)
[7.670s] Initializing save directory: ./work
[7.672s] Operation finished
Initializing Adam Optimizer
[0.001s] amsgrad = False (<class 'bool'>)
[0.001s] weight_decay = 0 (<class 'int'>)
[0.001s] eps = 1e-08 (<class 'float'>)
[0.001s] betas = (0.9, 0.999) (<class 'tuple'>)
[0.001s] lr = 0.001 (<class 'float'>)
[0.001s] Operation finished
Overall Progress: 0%| | 0/1 [00:00<?, ?it/s]
Inferencing : 0%| ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
error in correlation_forward_cuda_kernel: no kernel image is available for execution on the device
Traceback (most recent call last):
File "main.py", line 403, in <module>
stats = inference(args=args, epoch=epoch - 1, data_loader=inference_loader, model=model_and_loss, offset=offset)
File "main.py", line 367, in inference
losses, output = model(data[0], target[0], inference=True)
File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
return self.module(*inputs[0], **kwargs[0])
File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "main.py", line 170, in forward
output = self.model(data)
File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/workspace/flownet2-pytorch/models.py", line 118, in forward
flownetc_flow2 = self.flownetc(x)[0]
File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/workspace/flownet2-pytorch/networks/FlowNetC.py", line 86, in forward
out_corr = self.corr(out_conv3a, out_conv3b) # False
File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/workspace/flownet2-pytorch/networks/correlation_package/correlation.py", line 59, in forward
result = CorrelationFunction(self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)(input1, input2)
File "/workspace/flownet2-pytorch/networks/correlation_package/correlation.py", line 27, in forward
self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)
RuntimeError: CUDA call failed (correlation_forward_cuda at correlation_cuda.cc:79)
frame #0: <unknown function> + 0x140c8 (0x7f6e498160c8 in /root/anaconda3/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #1: <unknown function> + 0x1430e (0x7f6e4981630e in /root/anaconda3/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #2: <unknown function> + 0x107b1 (0x7f6e498127b1 in /root/anaconda3/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #3: _PyCFunction_FastCallDict + 0x154 (0x55e4403a2b94 in ./work)
frame #4: <unknown function> + 0x19e67c (0x55e44043267c in ./work)
frame #5: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #6: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #7: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #8: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #9: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #10: THPFunction_do_forward(THPFunction*, _object*) + 0x2ad (0x7f6e73b61fbd in /root/anaconda3/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #11: PyCFunction_Call + 0x5f (0x55e4403a598f in ./work)
frame #12: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #13: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #14: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #15: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #16: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #17: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #18: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #19: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #20: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #21: _PyEval_EvalFrameDefault + 0x1ab0 (0x55e440456470 in ./work)
frame #22: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #23: _PyFunction_FastCallDict + 0x1bb (0x55e44042ce1b in ./work)
frame #24: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #25: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #26: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #27: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #28: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #29: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #30: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #31: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #32: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #33: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #34: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #35: _PyEval_EvalFrameDefault + 0x1ab0 (0x55e440456470 in ./work)
frame #36: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #37: _PyFunction_FastCallDict + 0x1bb (0x55e44042ce1b in ./work)
frame #38: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #39: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #40: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #41: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #42: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #43: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #44: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #45: _PyFunction_FastCallDict + 0x11b (0x55e44042cd7b in ./work)
frame #46: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #47: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #48: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #49: _PyEval_EvalFrameDefault + 0x1ab0 (0x55e440456470 in ./work)
frame #50: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #51: _PyFunction_FastCallDict + 0x1bb (0x55e44042ce1b in ./work)
frame #52: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #53: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #54: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
frame #55: <unknown function> + 0x16b9b7 (0x55e4403ff9b7 in ./work)
frame #56: _PyObject_FastCallDict + 0x8b (0x55e4403a2d7b in ./work)
frame #57: <unknown function> + 0x19e7ce (0x55e4404327ce in ./work)
frame #58: _PyEval_EvalFrameDefault + 0x2fa (0x55e440454cba in ./work)
frame #59: <unknown function> + 0x197a94 (0x55e44042ba94 in ./work)
frame #60: _PyFunction_FastCallDict + 0x3db (0x55e44042d03b in ./work)
frame #61: _PyObject_FastCallDict + 0x26f (0x55e4403a2f5f in ./work)
frame #62: _PyObject_Call_Prepend + 0x63 (0x55e4403a7a03 in ./work)
frame #63: PyObject_Call + 0x3e (0x55e4403a299e in ./work)
Exception ignored in: <bound method tqdm.__del__ of Overall Progress: 0%| | 0/1 [00:01<?, ?it/s]>
Traceback (most recent call last):
File "/root/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 889, in __del__
File "/root/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 1095, in close
File "/root/anaconda3/lib/python3.6/site-packages/tqdm/_tqdm.py", line 441, in _decr_instances
File "/root/anaconda3/lib/python3.6/_weakrefset.py", line 109, in remove
KeyError: <weakref at 0x7f6e43cdc598; to 'tqdm' at 0x7f6e432fe400>
Is this possibly an issue with K80 GPUs? I also have a K80 and can't get past this issue
Hi,
I couldn't solve this error either (I don't have a K80 GPU but a Quadro K6000). Instead I installed an older version of this code (the one before the commit of the 22nd of August) and had to change the three make.sh scripts as explained in issue #33, opened on the 25th of January. In my case -arch=sm_30 worked (instead of -arch=sm_52). In the current code these make.sh files are replaced by setup.py. I also had to use PyTorch 0.4.0 and Python 3.6.5.
Hope it helps!
Hi, Any updates to this issue? I'm also getting the same error on Google Colab, and don't know how to get past it.
I haven't gotten any of the workarounds posted to work so far on K80 gpus, and as far as I know there aren't updates to this (but I would love to be proved wrong!).
I originally thought it referred to not being able to find one of the CUDA libraries, but no amount of fresh installs has fixed this. What is your environment?
Python 3, CUDA nvcc 9.2.148, PyTorch torch-0.4.1. The OS on Google Colab is Ubuntu. @pshah123 did you end up running it on another GPU?
@yuanzhou15 Did you make it to work on Google Colab? I am facing the same issue.
@pshah123 No I'm still facing the same issues
It is weird. I did not have this problem before and successfully ran the demo (Python 3.6, Ubuntu 16.04, CUDA 9.2, PyTorch 0.4.1, GTX 1080), but after I changed CUDA from 9.2 to 9.0 the problem happened. Hope this information helps.
Hi, I think I have found the cause. I can run the code again now, with
CUDA 9.2, Ubuntu 16.04, Python 3.6, PyTorch 0.4.1.
I just upgraded my GPU driver from 384 to Driver Version: 396.37.
Please try it. The problem may simply be caused by an old CUDA driver that lacks functions the code needs to call.
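To make the driver suggestion above easier to check, here is a minimal sketch that compares a driver version against the minimum a CUDA toolkit release expects. The table values are assumptions taken from NVIDIA's CUDA Toolkit release notes for Linux and should be verified there; `driver_is_new_enough` and the `384.130` driver string are hypothetical names and values used only for illustration.

```python
# Minimum Linux driver version per CUDA toolkit release.
# Assumed values (from NVIDIA's release notes); verify before relying on them.
MIN_LINUX_DRIVER = {
    "9.0": (384, 81),
    "9.1": (390, 46),
    "9.2": (396, 26),
    "10.0": (410, 48),
}


def parse_version(text):
    """Turn a string such as '396.37' into (396, 37) for numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def driver_is_new_enough(driver_version, cuda_version):
    """Return True if the driver meets the assumed minimum for this CUDA release."""
    # Only the major.minor part of the toolkit version selects the table row.
    major_minor = ".".join(cuda_version.split(".")[:2])
    return parse_version(driver_version) >= MIN_LINUX_DRIVER[major_minor]


if __name__ == "__main__":
    # Driver 396.37 with nvcc 9.2.148, the combination reported as working above.
    print(driver_is_new_enough("396.37", "9.2.148"))
    # A hypothetical 384-series driver with the same toolkit.
    print(driver_is_new_enough("384.130", "9.2.148"))
```

The installed driver version can be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader`, and the CUDA version PyTorch was built against is available as `torch.version.cuda`. Note that a later comment reports the same error on one GPU but not another in the same machine, so a passing driver check does not rule out a missing kernel architecture.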
I'm working on
The demo runs well on a GTX 1080Ti, but it raises this error on a Tesla K40c (same computer, just switching GPU with `CUDA_VISIBLE_DEVICES`). So I think maybe it's not caused by drivers?
Problem fixed.
I think adding `'-gencode', 'arch=compute_xx,code=sm_xx'` to `nvcc_args` fixes the problem. If the problem still exists after adding the line, please check the following:
* Make sure you modified all 3 `setup.py` files in the 3 packages: `channelnorm_package`, `correlation_package` and `resample2d_package`.
* Make sure you have removed all intermediate files, including `__pycache__/`, `dist/`, `*.egg-info` and `build/`. Python will install from these files without recompiling if they already exist.
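The cleanup step above can be sketched as a small script. This is only an illustration, assuming the three packages sit side by side in one directory (the tracebacks in this thread show them under `networks/`); `clean_build_artifacts` is a hypothetical helper, not part of the repository.

```python
import shutil
from pathlib import Path

# Package directories and stale build outputs named in the comment above.
PACKAGES = ["channelnorm_package", "correlation_package", "resample2d_package"]
STALE_PATTERNS = ["__pycache__", "dist", "build", "*.egg-info"]


def clean_build_artifacts(networks_dir, dry_run=True):
    """List stale build output in each package, deleting it unless dry_run is True."""
    matched = []
    for package in PACKAGES:
        package_dir = Path(networks_dir) / package
        if not package_dir.is_dir():
            continue  # skip packages that are not present in this checkout
        for pattern in STALE_PATTERNS:
            for path in package_dir.glob(pattern):
                matched.append(path)
                if dry_run:
                    continue
                if path.is_dir():
                    shutil.rmtree(path)
                else:
                    path.unlink()
    return matched


if __name__ == "__main__":
    # "networks" is an assumed relative path; adjust it to your checkout.
    for path in clean_build_artifacts("networks", dry_run=True):
        print("would remove", path)
```

Running it with `dry_run=True` first shows what would be deleted. After an actual cleanup, `python setup.py install` has to be rerun in each of the three packages so the extensions are recompiled with the new `nvcc_args`.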
Can you tell me how to modify 3 setup.py files? Thank you.
@ken0406zero Add `'-gencode', 'arch=compute_xx,code=sm_xx'` to `nvcc_args`, so that `nvcc_args` looks like this:
nvcc_args = [
'-gencode', 'arch=compute_30,code=sm_30',
'-gencode', 'arch=compute_35,code=sm_35',
'-gencode', 'arch=compute_37,code=sm_37',
'-gencode', 'arch=compute_50,code=sm_50',
'-gencode', 'arch=compute_52,code=sm_52',
'-gencode', 'arch=compute_60,code=sm_60',
'-gencode', 'arch=compute_61,code=sm_61',
'-gencode', 'arch=compute_70,code=sm_70',
'-gencode', 'arch=compute_xx,code=sm_xx'
]
`'-gencode', 'arch=compute_xx,code=sm_xx'` is the line you add.
To determine what `xx` is, look up your GPU's compute capability at https://developer.nvidia.com/cuda-gpus and drop the dot (for example, 3.7 becomes `37`).
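As an alternative to looking the value up by hand, the following sketch builds the flag pair from a compute capability and, when PyTorch and a GPU are available, asks PyTorch for it via `torch.cuda.get_device_capability`. The helper names `gencode_args` and `detect_capability` are hypothetical, and the fallback value assumes a Tesla K80 (compute capability 3.7).

```python
def gencode_args(major, minor):
    """Build the nvcc '-gencode' pair for compute capability major.minor."""
    arch = f"{major}{minor}"
    return ["-gencode", f"arch=compute_{arch},code=sm_{arch}"]


def detect_capability():
    """Return (major, minor) for GPU 0, or None without PyTorch or a visible GPU."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    # Fall back to (3, 7), the Tesla K80 value, when nothing can be detected.
    capability = detect_capability() or (3, 7)
    print(gencode_args(*capability))
```

The two returned strings are what gets appended to the `nvcc_args` list in each `setup.py`.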
Thank you so much @huangbiubiu
Please, I have the same problem. Where can I find `setup.py` and the 3 packages `channelnorm_package`, `correlation_package` and `resample2d_package`? I couldn't find `nvcc_args`. @huangbiubiu can you explain more by giving all the steps, please? Thanks in advance.
Hi @NazihaS, did you solve this issue? This problem happened on my Titan V.
@NazihaS I think a simple search will find `nvcc_args`:
https://github.com/NVIDIA/flownet2-pytorch/search?q=nvcc_args&unscoped_q=nvcc_args
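@huangbiubiu Hello! A question: running inference with FlowNet2C works fine for me, but with FlowNet2 or FlowNet2CSS I get the "no kernel" error. I added the gencode line as you described and cleaned up those files, but neither helped. Is there any other solution? My GPU is a GeForce 960M, so I don't think it is a GPU performance problem. Ubuntu 18, PyTorch 0.4.1, CUDA 9. Thanks!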
Thank you for the solution and link! I tried launching this on Google Colab and was surprised to find that the Tesla K80, even with CUDA 10.0, has compute capability 3.7, which is kind of old. I thought that higher CUDA versions corresponded to higher compute capability. I'm not a video card architecture expert after all :) After adding the code generation line the error disappeared.
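Compute capability is a property of the GPU hardware, while the CUDA version belongs to the installed toolkit, so the two are independent. The sketch below illustrates why a missing `sm_xx` entry produces "no kernel image is available". It assumes the usual binary compatibility rule (a binary built for `sm_XY` runs on devices with the same major version `X` and a minor version of at least `Y`) and ignores PTX fallback, since `code=sm_xx` embeds no PTX. `has_kernel_image` and the `hypothetical_build` list are made up for illustration and do not reflect the repository's actual defaults.

```python
def parse_sm(arch):
    """Split a name such as 'sm_37' into (major, minor) = (3, 7)."""
    digits = arch.split("_")[1]
    return int(digits[:-1]), int(digits[-1])


def has_kernel_image(device_capability, compiled_archs):
    """Return True if some compiled sm_XY binary can run on the device.

    Assumed rule: same major version, compiled minor <= device minor.
    """
    dev_major, dev_minor = device_capability
    for arch in compiled_archs:
        major, minor = parse_sm(arch)
        if major == dev_major and minor <= dev_minor:
            return True
    return False


# Compute capabilities of some GPUs mentioned in this thread.
GPUS = {
    "Tesla K80": (3, 7),
    "Tesla K40c": (3, 5),
    "GTX 1080 Ti": (6, 1),
}

if __name__ == "__main__":
    hypothetical_build = ["sm_52", "sm_60", "sm_61"]  # illustration only
    for name, capability in GPUS.items():
        print(name, has_kernel_image(capability, hypothetical_build))
```

Under the same rule, a build containing only `sm_30` would run on a Quadro K6000 (compute capability 3.5), which is consistent with the `-arch=sm_30` report earlier in the thread.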
Hi @huangbiubiu, thanks for your help on this issue!
I've tried what you recommended earlier, but I haven't had any luck. I am running on a Tesla K80, and from my understanding, that means the `'-gencode', 'arch=compute_xx,code=sm_xx'` line for me should look like `'-gencode', 'arch=compute_37,code=sm_37'`, correct?
Let me know if I've misunderstood. Thanks again.
Looks correct.
I ran into this problem too. I'm using Python 3.7, torch 1.4.0 and an RTX 2060 on Windows. Just for reference, I solved it by removing all the stream arguments when launching the CUDA kernels, so that CUDA uses the default stream all the time. I think my problem may be due to recent PyTorch versions moving from `at` to `c10`.
@oblime out of curiosity does this fix also work on Torch 1.5? I'm still encountering the issue when working with the face model specifically
How can I find and delete files such as __pycache__/, dist/, *.egg-info and build/? I can't run python setup.py install again. I installed the 3 packages successfully before, but when I run python setup.py install again I get this error: command 'cl.exe' failed: No such file or directory @huangbiubiu
@sssssyf They could be in the correlation_package directory or wherever you ran python setup.py install. You can just use find -name to search for them.
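If find is not available (for example on Windows), the same search can be sketched in Python. This is only an illustration under assumptions: the function names find_stale and remove_stale are made up here, and the pattern list simply mirrors the names mentioned earlier in the thread. Point it at a package folder such as correlation_package rather than a whole drive, since any directory called build or dist below the root will match.

```python
import shutil
from pathlib import Path

# Directory names and patterns named in the thread as stale build output.
STALE_PATTERNS = ("__pycache__", "build", "dist", "*.egg-info")


def find_stale(root):
    """Return every stale build path under root, without touching anything."""
    root = Path(root)
    found = []
    for pattern in STALE_PATTERNS:
        found.extend(root.rglob(pattern))
    return sorted(found)


def remove_stale(root, dry_run=True):
    """Delete stale build output under root; with dry_run=True only report it."""
    paths = find_stale(root)
    for path in paths:
        if dry_run or not path.exists():
            continue  # skipped: dry run, or already removed with a parent
        if path.is_dir():
            shutil.rmtree(path)
        else:
            path.unlink()
    return paths
```

Calling remove_stale("correlation_package") lists what would be deleted; passing dry_run=False actually deletes it. Repeat for channelnorm_package and resample2d_package before rebuilding.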
Does this mean creating three setup files? What should I do if there is no setup file?
I encountered the same issue. It's most likely due to incompatible torch/cuda version. I found a combination that works on Colab. Here is the link to the notebook. Hopefully it helps.
It helps a lot, thanks so much!
@ken0406zero Add '-gencode', 'arch=compute_xx,code=sm_xx' to nvcc_args, so that nvcc_args looks like this:
nvcc_args = [
    '-gencode', 'arch=compute_30,code=sm_30',
    '-gencode', 'arch=compute_35,code=sm_35',
    '-gencode', 'arch=compute_37,code=sm_37',
    '-gencode', 'arch=compute_50,code=sm_50',
    '-gencode', 'arch=compute_52,code=sm_52',
    '-gencode', 'arch=compute_60,code=sm_60',
    '-gencode', 'arch=compute_61,code=sm_61',
    '-gencode', 'arch=compute_70,code=sm_70',
    '-gencode', 'arch=compute_xx,code=sm_xx'
]
'-gencode', 'arch=compute_xx,code=sm_xx' is what you added. To determine what xx is, check https://developer.nvidia.com/cuda-gpus.
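To see why a missing entry leads to "no kernel image is available", here is a rough sketch that checks whether a given nvcc_args list can serve a device. The function is_covered is hypothetical and the example list is only illustrative, not copied from any setup.py. It assumes the usual CUDA compatibility rules as I understand them: a binary built for sm_XY runs on devices with the same major version and an equal or higher minor version, and PTX (code=compute_XY) can be JIT-compiled for equal or newer capabilities.

```python
import re


def is_covered(nvcc_args, major, minor):
    """Return True if some -gencode entry can serve a device of capability major.minor."""
    for arg in nvcc_args:
        m = re.fullmatch(r"arch=compute_(\d+),code=(sm|compute)_(\d+)", arg)
        if m is None:
            continue  # skips '-gencode' itself and any unrelated flag
        kind, code = m.group(2), m.group(3)
        code_major, code_minor = int(code[:-1]), int(code[-1])
        if kind == "sm":
            # Binary code: same major version, equal or older minor version.
            if code_major == major and code_minor <= minor:
                return True
        else:
            # PTX: assumed usable on any equal or newer capability via JIT.
            if (code_major, code_minor) <= (major, minor):
                return True
    return False


# Illustrative list only; check the nvcc_args in your own setup.py files.
example_args = [
    "-gencode", "arch=compute_50,code=sm_50",
    "-gencode", "arch=compute_60,code=sm_60",
    "-gencode", "arch=compute_70,code=sm_70",
]
```

With this example list, is_covered(example_args, 3, 7) is False for a Tesla K80, and it becomes True once '-gencode', 'arch=compute_37,code=sm_37' is appended.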
This solved my problem, thanks.
Hi,
I installed all the dependencies and followed the steps that are listed to install and run flownet but I have the following error when using "run_a_pair.py" . Any ideas?
The parameters of the system are the following:
CUDA: release 9.1, V9.1.85
PYTORCH: 0.4.1
UBUNTU: Ubuntu 18.04 LTS
PYTHON: 3.6.5
Thanks!
python3 run_a_pair.py
error in correlation_forward_cuda_kernel: no kernel image is available for execution on the device
Traceback (most recent call last):
File "run_a_pair.py", line 31, in <module>
result = net(im).squeeze()
File "/home/raad/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/raad/flownet2-pytorch_python3/models.py", line 118, in forward
flownetc_flow2 = self.flownetc(x)[0]
File "/home/raad/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/raad/flownet2-pytorch_python3/networks/FlowNetC.py", line 86, in forward
out_corr = self.corr(out_conv3a, out_conv3b) # False
File "/home/raad/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/raad/flownet2-pytorch_python3/networks/correlation_package/correlation.py", line 59, in forward
result = CorrelationFunction(self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)(input1, input2)
File "/home/raad/flownet2-pytorch_python3/networks/correlation_package/correlation.py", line 27, in forward
self.pad_size, self.kernel_size, self.max_displacement,self.stride1, self.stride2, self.corr_multiply)
RuntimeError: CUDA call failed (correlation_forward_cuda at correlation_cuda.cc:79)
frame #0: <unknown function> + 0x135a7 (0x7f64fe8845a7 in /home/raad/local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #1: <unknown function> + 0x102ef (0x7f64fe8812ef in /home/raad/local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #2: _PyCFunction_FastCallKeywords + 0x26b (0x4c549b in python3)
frame #3: python3() [0x54ffe4]
frame #4: _PyEval_EvalFrameDefault + 0x362f (0x5546cf in python3)
frame #5: python3() [0x54f0e8]
frame #6: _PyFunction_FastCallDict + 0x2a2 (0x558ef2 in python3)
frame #7: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #8: PyObject_Call + 0x3e (0x459eee in python3)
frame #9: THPFunction_do_forward(THPFunction*, _object*) + 0x2ad (0x7f652c26fc3d in /home/raad/.local/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #10: PyCFunction_Call + 0xbd (0x4c517d in python3)
frame #11: PyObject_Call + 0x3e (0x459eee in python3)
frame #12: python3() [0x4e0e9b]
frame #13: _PyObject_FastCallDict + 0xa3 (0x45a0e3 in python3)
frame #14: python3() [0x54fd37]
frame #15: _PyEval_EvalFrameDefault + 0x362f (0x5546cf in python3)
frame #16: python3() [0x54f0e8]
frame #17: _PyFunction_FastCallDict + 0x2a2 (0x558ef2 in python3)
frame #18: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #19: PyObject_Call + 0x3e (0x459eee in python3)
frame #20: _PyEval_EvalFrameDefault + 0x1ba9 (0x552c49 in python3)
frame #21: python3() [0x54fbe1]
frame #22: _PyFunction_FastCallDict + 0x1c9 (0x558e19 in python3)
frame #23: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #24: PyObject_Call + 0x3e (0x459eee in python3)
frame #25: python3() [0x4e0e9b]
frame #26: _PyObject_FastCallDict + 0xa3 (0x45a0e3 in python3)
frame #27: python3() [0x54fd37]
frame #28: _PyEval_EvalFrameDefault + 0x362f (0x5546cf in python3)
frame #29: python3() [0x54f0e8]
frame #30: _PyFunction_FastCallDict + 0x2a2 (0x558ef2 in python3)
frame #31: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #32: PyObject_Call + 0x3e (0x459eee in python3)
frame #33: _PyEval_EvalFrameDefault + 0x1ba9 (0x552c49 in python3)
frame #34: python3() [0x54fbe1]
frame #35: _PyFunction_FastCallDict + 0x1c9 (0x558e19 in python3)
frame #36: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #37: PyObject_Call + 0x3e (0x459eee in python3)
frame #38: python3() [0x4e0e9b]
frame #39: _PyObject_FastCallDict + 0xa3 (0x45a0e3 in python3)
frame #40: python3() [0x54fd37]
frame #41: _PyEval_EvalFrameDefault + 0x362f (0x5546cf in python3)
frame #42: python3() [0x54f0e8]
frame #43: _PyFunction_FastCallDict + 0x2a2 (0x558ef2 in python3)
frame #44: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #45: PyObject_Call + 0x3e (0x459eee in python3)
frame #46: _PyEval_EvalFrameDefault + 0x1ba9 (0x552c49 in python3)
frame #47: python3() [0x54fbe1]
frame #48: _PyFunction_FastCallDict + 0x1c9 (0x558e19 in python3)
frame #49: _PyObject_Call_Prepend + 0x231 (0x45a461 in python3)
frame #50: PyObject_Call + 0x3e (0x459eee in python3)
frame #51: python3() [0x4e0e9b]
frame #52: _PyObject_FastCallDict + 0xa3 (0x45a0e3 in python3)
frame #53: python3() [0x54fd37]
frame #54: _PyEval_EvalFrameDefault + 0x362f (0x5546cf in python3)
frame #55: python3() [0x54fbe1]
frame #56: PyEval_EvalCode + 0x23 (0x550b93 in python3)
frame #57: PyRun_FileExFlags + 0x169 (0x42b519 in python3)
frame #58: PyRun_SimpleFileExFlags + 0xe5 (0x42b705 in python3)
frame #59: Py_Main + 0xccb (0x441fcb in python3)
frame #60: main + 0x184 (0x421ff4 in python3)
frame #61: __libc_start_main + 0xe7 (0x7f65408fab97 in /lib/x86_64-linux-gnu/libc.so.6)
frame #62: _start + 0x2a (0x4220aa in python3)