jia-zhuang / pytorch-multi-gpu-training

Notes on single-machine multi-GPU training methods and principles in PyTorch

Suggesting a small improvement #2

Open pengzhangzhi opened 2 years ago

pengzhangzhi commented 2 years ago

When parallelizing the model, you can add one extra step:

# Convert BatchNorm to SyncBatchNorm. 
net = nn.SyncBatchNorm.convert_sync_batchnorm(net)

This ensures that batch norm statistics are synced across all processes.

Reference: https://theaisummer.com/distributed-training-pytorch/#step-1-initialize-the-distributed-learning-processes
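A minimal sketch of where the conversion fits, assuming a hypothetical small model with BatchNorm layers; the conversion itself needs no initialized process group, but in a real script it should happen before wrapping the model in DistributedDataParallel:

```python
import torch.nn as nn

# Hypothetical example model containing a BatchNorm layer.
net = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
)

# Convert every BatchNorm layer to SyncBatchNorm so that batch
# statistics are synchronized across all processes during training.
# Do this BEFORE wrapping the model in DistributedDataParallel.
net = nn.SyncBatchNorm.convert_sync_batchnorm(net)

# In a real DDP script you would then wrap the converted model, e.g.:
# net = nn.parallel.DistributedDataParallel(net.to(device), device_ids=[local_rank])

print(type(net[1]).__name__)  # → SyncBatchNorm
```

Note that SyncBatchNorm's forward pass only runs under an initialized distributed process group; the conversion step shown here is safe to call anywhere.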

jia-zhuang commented 11 months ago

OK, thanks, I'll look into it.