Hi,
Thank you for your interest in our work! To train on multiple GPUs, run:
python3 -m torch.distributed.launch --nnodes=1 --node_rank=0 --nproc_per_node=<num-gpus> --use_env --master_port=21221 train.py ... (remaining args)
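In case it helps, here is a minimal sketch (my own assumption, not the repository's actual train.py) of the distributed initialization a script launched with --use_env typically performs, since the launcher then passes the local rank through the environment rather than a --local_rank argument:

```python
# Minimal sketch (assumption, not the repo's exact train.py): with
# torch.distributed.launch --use_env, each process reads its local rank
# from the LOCAL_RANK environment variable.
import os

import torch
import torch.distributed as dist


def init_distributed():
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    torch.cuda.set_device(local_rank)
    # MASTER_ADDR / MASTER_PORT / RANK / WORLD_SIZE are set by the launcher,
    # so the default env:// initialization works without extra arguments.
    dist.init_process_group(backend="nccl")
    return local_rank
```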
Thanks! One more question following up on this issue: if I train the model with 8 GPUs, should I change N_rand from 4096 to 512?
Yes, that's correct!
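For what it's worth, the arithmetic behind that suggestion (assuming N_rand is the per-GPU ray batch, as the exchange above implies) is simply dividing the single-GPU batch by the number of processes:

```python
# Sketch of the batch-size scaling (assumption: N_rand is per process).
single_gpu_n_rand = 4096   # N_rand used for single-GPU training
num_gpus = 8               # processes launched with --nproc_per_node=8
per_gpu_n_rand = single_gpu_n_rand // num_gpus   # 4096 // 8 == 512
print(per_gpu_n_rand)      # 512 rays per GPU, 4096 rays per step in total
```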
Hello, thanks for the great work! I'm wondering how we can apply multi-GPU training.
I used the following command
but it produced the following problems:
The distributed training code of train.py is shown below:
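For context, a rough sketch (my own illustration under stated assumptions, not the actual code from this repository) of how distributed training is typically wired up in a train.py like this, with the model wrapped in DistributedDataParallel and the data sharded by a DistributedSampler:

```python
# Rough illustration only (assumptions throughout): generic DDP wiring,
# not the repository's actual train.py.
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset


def main():
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend="nccl")

    # Placeholder model and dataset; the real script builds its own.
    model = torch.nn.Linear(16, 1).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    dataset = TensorDataset(torch.randn(1024, 16), torch.randn(1024, 1))
    sampler = DistributedSampler(dataset)  # each process sees a distinct shard
    loader = DataLoader(dataset, batch_size=64, sampler=sampler)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle the shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            loss = torch.nn.functional.mse_loss(model(x), y)
            optimizer.zero_grad()
            loss.backward()        # gradients are all-reduced by DDP
            optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```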