Closed haoshuai714 closed 2 years ago
Hi, I'm not sure why your program is stuck, maybe you want to check if you can run pytorch distributed training with other scripts.
Hi, I have encountered the same problem. Could you tell me how I can solve it?
2. The program gets stuck, as shown below:
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
| distributed init (rank 2): env://
| distributed init (rank 6): env://
| distributed init (rank 0): env://
| distributed init (rank 3): env://
| distributed init (rank 4): env://
| distributed init (rank 5): env://
| distributed init (rank 7): env://
| distributed init (rank 1): env://
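When init with the `env://` method hangs like this, one common cause is that the rendezvous environment variables it reads (`MASTER_ADDR`, `MASTER_PORT`, `RANK`, `WORLD_SIZE`) are missing or inconsistent across processes. A minimal stdlib-only sketch to sanity-check them before calling `init_process_group` (the helper name is hypothetical, not part of PyTorch):

```python
import os

# Variables the env:// init method reads to rendezvous the processes.
REQUIRED_VARS = ["MASTER_ADDR", "MASTER_PORT", "RANK", "WORLD_SIZE"]

def missing_rendezvous_vars(env=None):
    """Return the env:// rendezvous variables that are not set."""
    if env is None:
        env = os.environ
    return [name for name in REQUIRED_VARS if name not in env]

if __name__ == "__main__":
    missing = missing_rendezvous_vars()
    if missing:
        print("env:// init will not proceed, missing:", missing)
    else:
        print("rendezvous variables present, rank", os.environ["RANK"])
```

If the variables are present on every node, the next things to check are that `MASTER_ADDR:MASTER_PORT` is reachable from all machines and that `WORLD_SIZE` matches the total number of launched processes, since `init_process_group` blocks until all ranks have joined.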