NVlabs / few-shot-vid2vid

PyTorch implementation for few-shot photorealistic video-to-video translation.

error in correlation_forward_cuda_kernel: invalid device function (URGENT HELP REQ) #27

Closed jalees018 closed 4 years ago

jalees018 commented 4 years ago

Please help me, it's urgent. My setup is CUDA 10.1, PyTorch 1.3.1, and Python 3.6.9. Whenever I try to run the train script, it fails with the following error:

error in correlation_forward_cuda_kernel: invalid device function
Traceback (most recent call last):
  File "train.py", line 73, in <module>
    train()
  File "train.py", line 45, in train
    flow_gt, conf_gt = flowNet(data_list, epoch)
  File "/HPS/DFD_19/work/anaconda3/envs/myenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/HPS/DFD_19/work/few-shot-vid2vid/models/models.py", line 90, in forward
    outputs = self.model(*inputs, **kwargs, dummy_bs=self.pad_bs)
  File "/HPS/DFD_19/work/anaconda3/envs/myenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/HPS/DFD_19/work/anaconda3/envs/myenv/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/HPS/DFD_19/work/anaconda3/envs/myenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/HPS/DFD_19/work/few-shot-vid2vid/models/flownet.py", line 48, in forward
    flow_gt_ref, conf_gt_ref = self.flowNet_forward(image_now, image_ref.expand_as(image_now))
  File "/HPS/DFD_19/work/few-shot-vid2vid/models/flownet.py", line 60, in flowNet_forward
    flow, conf = self.compute_flow_and_conf(input_A, input_B)
  File "/HPS/DFD_19/work/few-shot-vid2vid/models/flownet.py", line 75, in compute_flow_and_conf
    flow1 = self.flowNet(data1)
  File "/HPS/DFD_19/work/anaconda3/envs/myenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/HPS/DFD_19/work/few-shot-vid2vid/models/networks/flownet2_pytorch/models.py", line 126, in forward
    flownetc_flow2 = self.flownetc(x)[0]
  File "/HPS/DFD_19/work/anaconda3/envs/myenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/HPS/DFD_19/work/few-shot-vid2vid/models/networks/flownet2_pytorch/networks/FlowNetC.py", line 86, in forward
    out_corr = self.corr(out_conv3a, out_conv3b) # False
  File "/HPS/DFD_19/work/anaconda3/envs/myenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/HPS/DFD_19/work/few-shot-vid2vid/models/networks/flownet2_pytorch/networks/correlation_package/correlation.py", line 69, in forward
    self.stride1, self.stride2, self.corr_multiply)
  File "/HPS/DFD_19/work/few-shot-vid2vid/models/networks/flownet2_pytorch/networks/correlation_package/correlation.py", line 34, in forward
    ctx.pad_size, ctx.kernel_size, ctx.max_displacement,ctx.stride1, ctx.stride2, ctx.corr_multiply)
RuntimeError: CUDA call failed (correlation_forward_cuda at correlation_cuda.cc:82)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7f7e28ef2687 in /HPS/DFD_19/work/anaconda3/envs/myenv/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: correlation_forward_cuda(at::Tensor&, at::Tensor&, at::Tensor&, at::Tensor&, at::Tensor&, int, int, int, int, int, int) + 0x58f (0x7f7d7e4c62af in /home/jnehvi/.local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #2: <unknown function> + 0x1c425 (0x7f7d7e4d4425 in /home/jnehvi/.local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #3: <unknown function> + 0x1c6ae (0x7f7d7e4d46ae in /home/jnehvi/.local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #4: <unknown function> + 0x19851 (0x7f7d7e4d1851 in /home/jnehvi/.local/lib/python3.6/site-packages/correlation_cuda-0.0.0-py3.6-linux-x86_64.egg/correlation_cuda.cpython-36m-x86_64-linux-gnu.so)

frame #11: THPFunction_apply(_object*, _object*) + 0x9ff (0x7f7e2995490f in /HPS/DFD_19/work/anaconda3/envs/myenv/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
Segmentation fault

Hakusye commented 4 years ago

Which gcc version did you use when running "scripts/download_flownet2.py"?

jalees018 commented 4 years ago

GCC version 6.3.0. Thanks!

jalees018 commented 4 years ago

The flownet installation completes without errors when I run the above script.

jalees018 commented 4 years ago

It worked! The CUDA version must be the same everywhere: the version reported by nvcc --version on the GPU machine has to match the one used by the installed packages (CUDA Toolkit, PyTorch, torchvision, and cuDNN).
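
For anyone hitting the same error, here is a minimal sketch of how the version check described above could be automated. It is not part of the repository; the helper names (parse_nvcc_release, same_major_minor, check_cuda_match) are made up for illustration. It assumes nvcc is on the PATH and compares only major.minor, using torch.version.cuda as the CUDA version PyTorch was built for.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return the 'major.minor' CUDA release from `nvcc --version` text, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return "{}.{}".format(match.group(1), match.group(2)) if match else None


def same_major_minor(a, b):
    """Compare two version strings on major.minor only (e.g. '10.1' vs '10.1.243')."""
    return a.split(".")[:2] == b.split(".")[:2]


def check_cuda_match():
    """Compare the nvcc toolkit version with the CUDA version PyTorch was built for."""
    try:
        import torch  # imported lazily so the helpers above work without PyTorch
    except ImportError:
        return "PyTorch is not installed"
    if torch.version.cuda is None:
        return "This PyTorch build has no CUDA support"
    if shutil.which("nvcc") is None:
        return "nvcc was not found on PATH"
    output = subprocess.run(
        ["nvcc", "--version"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    ).stdout
    toolkit = parse_nvcc_release(output)
    if toolkit is None:
        return "Could not parse the nvcc output"
    ok = same_major_minor(toolkit, torch.version.cuda)
    return "nvcc: {}, PyTorch built for CUDA: {}, match: {}".format(
        toolkit, torch.version.cuda, ok
    )


if __name__ == "__main__":
    print(check_cuda_match())
```

If the two versions differ, rebuilding the correlation extension after aligning them is the step that resolved the error in this thread.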