Closed: Polymorphy12 closed this issue 1 year ago.
Hello, I'm training Atlas on a single node with 2 GPUs.
However, even when I set local_rank to 0, training doesn't start. It still requires MASTER_ADDR, MASTER_PORT, etc.
Is there anything else I need to set or be aware of?
I'm afraid I can't help unless you add more context about how exactly you are trying to do this.
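For context on the variables mentioned in the question: PyTorch's `env://` rendezvous reads MASTER_ADDR, MASTER_PORT, RANK and WORLD_SIZE from the environment, so setting a local rank alone is not enough for it. Below is a minimal sketch of what a single-node, 2-GPU setup might export per process. The helper name `single_node_env` and the defaults (127.0.0.1, port 29500) are illustrative assumptions, and whether Atlas's training script reads these variables in the same way is not confirmed in this thread.

```python
import os


def single_node_env(rank, world_size, master_addr="127.0.0.1", master_port=29500):
    """Hypothetical helper: build the variables PyTorch's env:// rendezvous reads.

    Assumes all processes run on one machine, so the local rank equals the
    global rank and the master address is the loopback interface.
    """
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside [0, {world_size})")
    return {
        "MASTER_ADDR": master_addr,       # where rank 0 listens
        "MASTER_PORT": str(master_port),  # any free port, identical on every rank
        "WORLD_SIZE": str(world_size),    # total number of processes (one per GPU)
        "RANK": str(rank),                # global rank of this process
        "LOCAL_RANK": str(rank),          # single node: same as the global rank
    }


if __name__ == "__main__":
    # Example: configure the process that drives GPU 0 out of 2.
    os.environ.update(single_node_env(rank=0, world_size=2))
    for key in ("MASTER_ADDR", "MASTER_PORT", "WORLD_SIZE", "RANK", "LOCAL_RANK"):
        print(key, "=", os.environ[key])
```

Each of the two processes would need its own RANK but the same MASTER_ADDR, MASTER_PORT and WORLD_SIZE. Alternatively, launching through `torchrun --nproc_per_node=2 <script>` sets these variables automatically; the script name is left as a placeholder here because the exact launch command was not given.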