Closed echo840 closed 2 months ago
You do not need to modify the code. If you are using Slurm, you should use this script:
Otherwise, use this:
In this case, you need to set `MASTER_ADDR`, `PORT`, `NODE_RANK`, `NNODES`, and `GPUS` yourself.
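To make the role of these variables concrete, here is a minimal sketch of a pre-launch check. It is not part of the repository's scripts. The helper name `read_launch_config` is hypothetical, and it assumes that `GPUS` means the number of GPUs per node and that `NODE_RANK` counts from 0; check both against the actual launch script before relying on it.

```python
import os

# Variable names taken from the reply above.
REQUIRED = ("MASTER_ADDR", "PORT", "NODE_RANK", "NNODES", "GPUS")


def read_launch_config(env=None):
    """Validate the multi-node variables and derive a few useful values.

    Hypothetical helper. Assumes GPUS is the per-node GPU count and
    NODE_RANK is zero-based.
    """
    env = os.environ if env is None else env

    # Report every unset (or empty) variable at once.
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("missing variables: " + ", ".join(missing))

    nnodes = int(env["NNODES"])
    gpus = int(env["GPUS"])
    node_rank = int(env["NODE_RANK"])
    if not 0 <= node_rank < nnodes:
        raise ValueError(f"NODE_RANK={node_rank} is outside 0..{nnodes - 1}")

    return {
        "master_addr": env["MASTER_ADDR"],
        "port": int(env["PORT"]),
        "node_rank": node_rank,
        "nnodes": nnodes,
        "gpus_per_node": gpus,
        # Total number of processes across all machines.
        "world_size": nnodes * gpus,
        # Global rank of the first process on this machine.
        "first_global_rank": node_rank * gpus,
    }


if __name__ == "__main__":
    # Example values for the second of two 8-GPU machines (illustrative only).
    example = {
        "MASTER_ADDR": "10.0.0.1",
        "PORT": "29500",
        "NODE_RANK": "1",
        "NNODES": "2",
        "GPUS": "8",
    }
    print(read_launch_config(example))
```

Under these assumptions, every machine uses the same `MASTER_ADDR`, `PORT`, `NNODES`, and `GPUS`, and only `NODE_RANK` differs from one machine to the next.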
Does this code support multi-machine training, and how should I modify the training command?