Hi, great work! I hit the error below after changing 'GPU_ID:0' to 'GPU_ID:6' in 'config.yml' and running train.py, in order to train on a different GPU.
I have searched for a while but still cannot find why some tensors are being sent to another GPU.
Traceback (most recent call last):
File "train.py", line 326, in <module>
main()
File "train.py", line 121, in main
train(train_loader, model, optimizer, epoch, writer, args)
File "train.py", line 171, in train
keypoint_map = model(images)
File "/x/x.x/x/x/x/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/x/x.x/x/x/x/deep-hough-transform/model/network.py", line 69, in forward
p1 = self.dht_detector1(p1)
File "/x/x.x/x/x/x/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/x/x.x/x/x/x/deep-hough-transform/model/dht.py", line 26, in forward
x = self.convs(x)
File "/x/x.x/x/x/x/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/x/x.x/x/x/x/lib/python3.6/site-packages/torch/nn/modules/container.py", line 139, in forward
input = module(input)
File "/x/x.x/x/x/x/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/x/x.x/x/x/x/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 443, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/x/x.x/x/x/x/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 440, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:6! (when checking arugment for argument weight in method wrapper_cudnn_convolution)
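
To narrow down which tensors sit on which GPU, something like the following might help. It is only a minimal diagnostic sketch, not part of this repository: `group_by_device` is a hypothetical helper name, and it assumes nothing beyond each tensor exposing a `.device` attribute, which PyTorch parameters and buffers do.

```python
from collections import defaultdict


def group_by_device(named_tensors):
    """Group names by the string form of each tensor's .device attribute.

    `named_tensors` is any iterable of (name, tensor) pairs, for example
    the result of model.named_parameters() or model.named_buffers().
    """
    groups = defaultdict(list)
    for name, tensor in named_tensors:
        groups[str(tensor.device)].append(name)
    return dict(groups)


# Hypothetical usage in train.py, just before `keypoint_map = model(images)`:
#   print(group_by_device(model.named_parameters()))
#   print(group_by_device(model.named_buffers()))
#   print(images.device)
```

If the printed groups show more than one device, or `images.device` differs from the device holding the weights, that would point to the place where the device is set independently of the config value.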