Closed Eli-YiLi closed 3 years ago
The cause is torch.nn.DataParallel.
Use --ntasks=1 in srun.
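For anyone hitting the same hang: torch.nn.DataParallel runs in a single process that drives every visible GPU, so it is meant to be launched as one Slurm task. Below is a minimal sketch of a pre-flight check, assuming the job is started through srun and that Slurm exports the SLURM_NTASKS environment variable. The helper names `slurm_task_count` and `safe_for_data_parallel` are hypothetical and not part of this repository or of PyTorch.

```python
import os


def slurm_task_count(env=None):
    """Return the number of tasks Slurm launched for this job step.

    Falls back to 1 when SLURM_NTASKS is not set (e.g. not running under Slurm).
    """
    env = os.environ if env is None else env
    return int(env.get("SLURM_NTASKS", "1"))


def safe_for_data_parallel(env=None):
    """True when exactly one task was launched.

    DataParallel is single-process: one process should own all GPUs.
    """
    return slurm_task_count(env) == 1


if __name__ == "__main__":
    n = slurm_task_count()
    if safe_for_data_parallel():
        print(f"{n} task: fine to wrap the model with torch.nn.DataParallel")
    else:
        print(f"{n} tasks: consider relaunching with srun --ntasks=1")
```

Calling this before `torch.nn.DataParallel(model).cuda()` only reports how the job was launched; it does not by itself confirm that the task count is the cause of a particular hang.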
Hi,
I have the same problem when running the code on a server with 8 A100 GPUs. The process hangs on this line:
model = torch.nn.DataParallel(model).cuda()
What should I do to solve this?
I cannot run it even when using 8 GPUs (one image per GPU).