XuLongjia closed this issue 2 years ago
It is used for DDP (DistributedDataParallel) distributed training, which gives a large speedup. I'll update the README.
Please refer to the script. Hope it helps. https://github.com/namisan/mt-dnn/blob/master/experiments/glue/run_glue_finetuning.sh#L112
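For anyone landing here later, here is a minimal sketch of how a `--local_rank` argument is typically consumed in PyTorch DDP training. It is not taken from the mt-dnn code. The helper names `parse_args` and `wrap_model` are hypothetical, and the sketch assumes one process per GPU with the NCCL backend. The launcher starts one process per GPU and tells each process its GPU index on that machine: `torch.distributed.launch` passes it as `--local_rank`, while `torchrun` exports it as the `LOCAL_RANK` environment variable. The default of -1 means the script was started normally, without a distributed launcher.

```python
import argparse
import os


def parse_args(argv=None):
    """Hypothetical helper: read the local rank from the CLI or the environment."""
    parser = argparse.ArgumentParser()
    # -1 means "not launched in distributed mode".
    parser.add_argument("--local_rank", type=int, default=-1,
                        help="For distributed training: local_rank")
    args = parser.parse_args(argv)
    # torchrun exports LOCAL_RANK instead of passing --local_rank.
    if args.local_rank == -1 and "LOCAL_RANK" in os.environ:
        args.local_rank = int(os.environ["LOCAL_RANK"])
    return args


def wrap_model(model, local_rank):
    """Hypothetical helper: move the model to this process's device, wrapping it in DDP if distributed."""
    # Imported lazily so parse_args can be used without PyTorch installed.
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    if local_rank == -1:
        # Single-process run: use a GPU if one is available, otherwise the CPU.
        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        return model.to(device)

    # Distributed run: bind this process to its own GPU, then join the process group.
    # The launcher is assumed to have set MASTER_ADDR, MASTER_PORT, RANK and WORLD_SIZE.
    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend="nccl")
    model = model.to(torch.device("cuda", local_rank))
    return DDP(model, device_ids=[local_rank], output_device=local_rank)
```

With a script built this way (called `train.py` here purely for illustration), `python train.py` runs in a single process, while `torchrun --nproc_per_node=4 train.py` or `python -m torch.distributed.launch --nproc_per_node=4 train.py` starts four processes with local ranks 0 to 3, each training on its own GPU with gradients synchronized by DDP.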
parser.add_argument("--local_rank", type=int, default=-1, help="For distributed training: local_rank")
I don't know what it means or what role it plays in model training. Looking forward to your answer! Thanks!