Open zhongzee opened 6 months ago
chmod +x tools/dist_train.sh
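# sample command for pre-training, use AMP for mixed-precision training
./tools/dist_train.sh configs/pretrain/yolo_world_l_t2i_bn_2e-4_100e_4x8gpus_obj365v1_goldg_train_lvis_minival.py 8 --amp
This command only works on a single server. How can it be changed to run across multiple servers? How should nodes and node_rank be set?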
Set nnodes to the number of machines, and set node_rank to the rank of each machine. For details, see: https://pytorch.org/tutorials/intermediate/ddp_series_multinode.html
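As a rough sketch of what that could look like: the helper below only prints the command to run on each machine, it does not launch anything. It assumes tools/dist_train.sh reads NNODES, NODE_RANK, MASTER_ADDR and PORT from the environment, as MMDetection-style launch scripts usually do; please check the script in your checkout first. The function name multinode_cmd, the address 10.0.0.1 and the port 29500 are placeholders, not something from this repository.

```shell
# Hypothetical helper: prints (does not run) the launch command for one node.
# Assumption: tools/dist_train.sh picks up NNODES, NODE_RANK, MASTER_ADDR and
# PORT from the environment. Verify this in your copy of the script.
multinode_cmd() {
  nnodes=$1       # total number of machines
  node_rank=$2    # rank of this machine, from 0 to nnodes-1
  master_addr=$3  # address of the rank-0 machine, reachable from every node
  port=$4         # free TCP port on the rank-0 machine
  config=configs/pretrain/yolo_world_l_t2i_bn_2e-4_100e_4x8gpus_obj365v1_goldg_train_lvis_minival.py
  # The trailing 8 is the number of GPUs per machine, as in the original command.
  echo "NNODES=$nnodes NODE_RANK=$node_rank MASTER_ADDR=$master_addr PORT=$port ./tools/dist_train.sh $config 8 --amp"
}

# Example: 4 machines with 8 GPUs each; 10.0.0.1 is a placeholder address.
multinode_cmd 4 0 10.0.0.1 29500   # command for the first machine (rank 0)
multinode_cmd 4 1 10.0.0.1 29500   # command for the second machine (rank 1)
```

Every machine uses the same NNODES, MASTER_ADDR and PORT; only NODE_RANK differs between machines.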