Multi-GPU Training - Githubissues

JKBox / YOLOv3-quadrangle

YOLOv3 detector with quadrangle in PyTorch

Multi-GPU Training #1

Open LcenArthas opened 5 years ago

LcenArthas commented 5 years ago

Hi, have you tried running training on multiple GPUs?

JKBox commented 5 years ago

Thanks for the reminder. I wrote the code for a single GPU; I will add multi-GPU support later.

LcenArthas commented 5 years ago

I tried, but failed TAT... but I found this: https://github.com/ultralytics/yolov3/pull/121. I tried to fix the code based on it, but failed. I hope it can help you :)

LcenArthas commented 5 years ago

By the way, I have now fixed the code by following that URL, and it runs on multiple GPUs, but it is very slow. So I think I have made a mistake somewhere in my changes.

longxianlei commented 5 years ago

I have 8 GPUs. I made 4 of them visible, then wrapped the model in nn.DataParallel to spread it across those GPUs:

    os.environ["CUDA_VISIBLE_DEVICES"] = "4,5,6,7"
    if torch.cuda.device_count() > 1:
        model = nn.DataParallel(model, device_ids=[0, 1, 2, 3])
    model.to(device).train()

But when I run train.py, it fails with:

    inter_area = torch.min(box1, box2).prod(2)
    RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.FloatTensor for argument #2 'other'

Does the code not support multi-GPU training yet?
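For anyone hitting the same error: the message says the second argument of torch.min is still a CPU tensor while the first is on a GPU. One common cause under nn.DataParallel is a tensor that is created on the CPU and never moved with the module. The sketch below is not the repository's code; `BoxOverlap`, the anchor values, and the input sizes are hypothetical and only illustrate one way to keep both operands on the same device, by registering the fixed tensor as a buffer. It falls back to CPU when no GPU is present. Note also that CUDA_VISIBLE_DEVICES has to be set before CUDA is initialized in the process, or it has no effect.

```python
import torch
import torch.nn as nn


class BoxOverlap(nn.Module):
    """Hypothetical layer: overlap area between predicted sizes and anchor sizes."""

    def __init__(self, anchors):
        super().__init__()
        # A buffer moves with the module on .to(device) and is copied to each
        # DataParallel replica, unlike a plain tensor attribute.
        self.register_buffer("anchors", torch.tensor(anchors, dtype=torch.float32))

    def forward(self, wh):
        # wh: (batch, 2) predicted widths and heights
        box1 = wh.unsqueeze(1)            # (batch, 1, 2)
        box2 = self.anchors.unsqueeze(0)  # (1, num_anchors, 2), same device as wh
        # Same expression as in the traceback; both operands share a device here.
        return torch.min(box1, box2).prod(2)  # (batch, num_anchors)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = BoxOverlap([[10.0, 13.0], [16.0, 30.0], [33.0, 23.0]])  # assumed anchors
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # uses all visible GPUs by default
model.to(device)

wh = torch.tensor([[12.0, 20.0], [40.0, 10.0]], device=device)
areas = model(wh)
print(areas.shape)
```

If a tensor has to be built inside forward instead, creating it with `device=wh.device` (or via `wh.new_tensor(...)`) has the same effect.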

JKBox commented 5 years ago

Yes, the code currently supports only single-GPU training. I'll fix it later.