Hanqer / deep-hough-transform

Jittor and Pytorch code for paper "Deep Hough Transform for Semantic Line Detection" (ECCV 2020, PAMI 2021)

Training with multi GPUS #29

Closed. FURYTAIL closed this issue 2 years ago

FURYTAIL commented 2 years ago

Hi, great work! I hit the error below after changing 'GPU_ID:0' to 'GPUID:6' in 'config.yml' and running train.py to train on a different GPU. I have spent a long time looking and still cannot figure out why a tensor is being sent to another GPU.

Traceback (most recent call last):
  File "train.py", line 326, in <module>
    main()
  File "train.py", line 121, in main
    train(train_loader, model, optimizer, epoch, writer, args)
  File "train.py", line 171, in train
    keypoint_map = model(images)
  File "/x/x.x/x/x/x/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/x/x.x/x/x/x/deep-hough-transform/model/network.py", line 69, in forward
    p1 = self.dht_detector1(p1)
  File "/x/x.x/x/x/x/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/x/x.x/x/x/x/deep-hough-transform/model/dht.py", line 26, in forward
    x = self.convs(x)
  File "/x/x.x/x/x/x/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/x/x.x/x/x/x/lib/python3.6/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/x/x.x/x/x/x/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/x/x.x/x/x/x/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 443, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/x/x.x/x/x/x/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 440, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:6! (when checking arugment for argument weight in method wrapper_cudnn_convolution)

zeakey commented 2 years ago

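You'd better use CUDA_VISIBLE_DEVICES=x python xxx.py to select the GPU. Do not change the gpu_id in the config. With that variable set, the chosen GPU is the only one the process can see and it shows up as cuda:0, so the original GPU_ID of 0 keeps working.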
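
For anyone who prefers to launch training from a script rather than typing the variable on the command line, here is a minimal sketch of the same idea in Python. The helper names `build_env` and `launch` are hypothetical and not part of this repository; the only assumption about the project is that `train.py` is started as a normal Python script. The variable is set on the child process environment, so it is in place before CUDA is initialized there.

```python
import os
import subprocess
import sys


def build_env(physical_gpu_id, base_env=None):
    """Return a copy of the environment that exposes a single physical GPU.

    Hypothetical helper, not part of deep-hough-transform.
    """
    env = dict(os.environ if base_env is None else base_env)
    # The child process will only see this GPU, and it will be indexed as cuda:0.
    env["CUDA_VISIBLE_DEVICES"] = str(physical_gpu_id)
    return env


def launch(script, physical_gpu_id):
    """Run `script` with the current interpreter, restricted to one GPU."""
    completed = subprocess.run(
        [sys.executable, script],
        env=build_env(physical_gpu_id),
        check=False,
    )
    return completed.returncode


# Example (roughly equivalent to `CUDA_VISIBLE_DEVICES=6 python train.py`):
# launch("train.py", 6)
```

Inside the launched process, physical GPU 6 is addressed as device 0, which is why the config can stay at its original value.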