Closed: jianganghuang closed this issue 10 months ago
The DDP command looks fine. Can you share more detailed error information?
As shown in the figure, I tried to train with the DDP command, but it failed. My environment is "Python 3.8.16, 1.11.0+cu102".
OK, I have found the problem, thanks.
OK, I found the problem, thanks.
Hello, may I ask what the problem was? I ran into this issue too. How should I fix it?
I used your command "python -m torch.distributed.launch --nproc_per_node 8 tools/train.py --device 1,2 --batch 32", and it reported that I should use "torch.distributed.run" instead. I then switched to "torch.distributed.run", but it still cannot train; the error report is below. How should I use the DDP training command?
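One thing that may be worth checking in the command above: it asks for 8 processes ("--nproc_per_node 8") but lists only two GPUs ("--device 1,2"). The thread never states what the actual fix was, so this is only a guess. The sketch below is a hypothetical helper (build_ddp_command is not part of the repository) that builds a launch line with one process per listed GPU, assuming tools/train.py accepts --device and --batch as in the original command.

```python
def build_ddp_command(devices, batch, script="tools/train.py"):
    """Build a launch line with one process per listed GPU.

    Hypothetical helper for illustration only; it assumes the training
    script takes --device and --batch as in the command quoted above.
    """
    # Split the comma-separated GPU list and drop empty entries.
    ids = [d.strip() for d in devices.split(",") if d.strip()]
    if not ids:
        raise ValueError("no GPU ids given")
    parts = [
        "python", "-m", "torch.distributed.run",
        "--nproc_per_node", str(len(ids)),  # one process per GPU
        script,
        "--device", ",".join(ids),
        "--batch", str(batch),
    ]
    return " ".join(parts)


if __name__ == "__main__":
    print(build_ddp_command("1,2", 32))
```

With "--device 1,2" this produces a command that uses "--nproc_per_node 2" rather than 8. If the error persists after aligning the two values, the full traceback is still needed to diagnose it.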