MrSworder closed this issue 3 years ago.
Hi, I didn't explicitly add an option for multi-GPU training. However, fastai, the framework this repo is based on, does support multi-GPU training, so you should be able to train on multiple GPUs by following the fastai documentation (BTW, this code is compatible with fastai 2.1.10).
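For example, a rough sketch of the fastai distributed route could look like the following, assuming your fastai version provides Learner.distrib_ctx() from fastai.distributed (please check the docs for your version). The names dls, model, and loss_func are placeholders for the objects pretrain.py builds, not the repo's actual variable names.

```python
# Hypothetical standalone script (e.g. pretrain_distributed.py), not part of this repo.
# Launch with:  python -m fastai.launch pretrain_distributed.py
import torch
from fastai.learner import Learner
from fastai.distributed import *  # patches Learner with distrib_ctx()

# Placeholders for the DataLoaders, ELECTRA model, and loss constructed in pretrain.py.
learn = Learner(dls, model, loss_func=loss_func)

with learn.distrib_ctx():  # wraps the model in DistributedDataParallel for the duration of training
    learn.fit(1)
```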
Additionally, according to others who have tried multi-GPU training with this repo, adding learn.model = torch.nn.DataParallel(learn.model, device_ids=[0,1,2,3]) right after the Learner is initialized (https://github.com/richarddwang/electra_pytorch/blob/ab29d03e69c6fb37df238e653c8d1a81240e3dd6/pretrain.py#L388-L396) might be worth a try, as sketched below.
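A minimal sketch of where that line would go; again, dls, model, and loss_func are placeholders for the objects pretrain.py builds:

```python
import torch
from fastai.learner import Learner

# Placeholders for the DataLoaders, ELECTRA model, and loss built in pretrain.py.
learn = Learner(dls, model, loss_func=loss_func)  # Learner initialization (cf. pretrain.py L388-L396)

if torch.cuda.device_count() > 1:
    # Replicate the model on GPUs 0-3; each batch is split across them on forward/backward.
    learn.model = torch.nn.DataParallel(learn.model, device_ids=[0, 1, 2, 3])

learn.fit(1)
```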
Please tag me to reopen this issue if there's anything else I can help with.
Hi, I am trying to train my own pretrained model with the ELECTRA method. I read elsewhere that your code implements ELECTRA's multi-GPU parallelism, but when I tried to run it, I found that only one GPU was being used. My num_workers is set to 4 and num_proc is set to 4, and the data used for the trial run is about 0.5k in size. What else do I need to change to get multi-GPU parallelism working?