fido20160817 closed this 2 years ago
I'm using python -m torch.distributed.launch --nproc_per_node=4 ... to train on multiple GPUs, but it hangs in "dist_util.setup_dist()", which is the first statement in main. Does anybody know the reason?
Remember to move the model to CUDA first.
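For anyone landing here, this is a rough sketch of what "put the model on CUDA first" could look like in a multi-GPU run. The thread does not confirm that this fixes the hang in dist_util.setup_dist(). The helper names local_device and wrap_model are made up for illustration and are not part of the repo. The sketch assumes the launcher exports LOCAL_RANK (torchrun does; torch.distributed.launch only does so with --use_env, otherwise it passes a --local_rank argument) and that the process group has already been initialized before wrap_model is called.

```python
import os


def local_device(default_rank: int = 0) -> str:
    """Return the CUDA device string for this process.

    Assumption: the launcher sets LOCAL_RANK in the environment.
    Falls back to default_rank when the variable is missing.
    """
    rank = int(os.environ.get("LOCAL_RANK", default_rank))
    return f"cuda:{rank}"


def wrap_model(model):
    """Move the model to this process's GPU, then wrap it in DDP.

    Assumption: torch.distributed is already initialized
    (for example by the repo's own setup call).
    """
    # Imported lazily so local_device() can be used without torch installed.
    import torch
    from torch.nn.parallel import DistributedDataParallel as DDP

    device = torch.device(local_device())
    torch.cuda.set_device(device)  # pin this process to a single GPU
    model = model.to(device)       # put the model on CUDA before wrapping
    return DDP(model, device_ids=[device.index])
```

Each of the 4 processes would call wrap_model(model) once, right after distributed setup, so that every rank ends up with its own GPU instead of all ranks sharing cuda:0.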