Open one-molamola opened 1 year ago
Your code uses single-GPU training. I tried training with multiple GPUs, but the speed dropped. What is the reason? I modified the code as follows:
model = torch.nn.DataParallel(model, device_ids=device_ids).cuda()
tokens, mask, prefix = tokens.to(device=device_ids[0]), mask.to(device=device_ids[0]), prefix.to(device=device_ids[0]).to(torch.float32)
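
For reference, here is a self-contained sketch of the same modification. `TinyModel`, `wrap_model`, and the tensor shapes are hypothetical placeholders and not the repository's actual code. It only wraps the model in `DataParallel` when more than one GPU is visible, and otherwise falls back to a single device.

```python
import torch
import torch.nn as nn


# Hypothetical stand-in for the repository's model (an assumption, not the real one).
class TinyModel(nn.Module):
    def __init__(self, vocab_size=100, prefix_dim=16, hidden=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.project = nn.Linear(prefix_dim, hidden)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, prefix, mask):
        # tokens: (B, T) int64, prefix: (B, prefix_dim) float32, mask: (B, T) float32
        hidden = self.embed(tokens) + self.project(prefix).unsqueeze(1)
        return self.head(hidden * mask.unsqueeze(-1))


def wrap_model(model):
    """Wrap in DataParallel only when several GPUs are visible (hypothetical helper)."""
    n_gpus = torch.cuda.device_count()
    if n_gpus > 1:
        device_ids = list(range(n_gpus))
        device = torch.device(f"cuda:{device_ids[0]}")
        # Parameters must live on device_ids[0] before DataParallel scatters inputs.
        model = nn.DataParallel(model.to(device), device_ids=device_ids)
        return model, device
    device = torch.device("cuda:0" if n_gpus == 1 else "cpu")
    return model.to(device), device


model, device = wrap_model(TinyModel())

# Random placeholder batch; inputs go to the first device, as in the lines above.
tokens = torch.randint(0, 100, (8, 10))
mask = torch.ones(8, 10)
prefix = torch.randn(8, 16)
tokens, mask, prefix = (
    tokens.to(device),
    mask.to(device),
    prefix.to(device, dtype=torch.float32),
)

logits = model(tokens, prefix, mask)
```

With this placeholder batch, `logits` has one row of vocabulary scores per token position, regardless of how many GPUs the batch was split across.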