Closed by kuang23 2 years ago
Hi, those settings aren't necessarily the best or the definitive configuration. Specifically, these adjustments were needed when I was experimenting with multi-GPU training (two GPUs). It's a bit of historical cruft; the training code is not particularly streamlined or cleaned up in this respect. Take it as just one possible example training configuration.
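For illustration only, here is a minimal sketch of what the two adjustments could look like in isolation. It is not the code from main.py. The helper names `per_replica_batch_size` and `scale_lr` are hypothetical, and reading the division by 2 as splitting a global batch across the two GPUs is an assumption, not something the repository confirms.

```python
import math


def per_replica_batch_size(global_batch_size: int, n_replicas: int) -> int:
    """Hypothetical helper: split a global batch evenly across replicas.

    Assumption: the "divide by 2" in main.py corresponds to two GPUs.
    """
    if n_replicas < 1:
        raise ValueError("n_replicas must be at least 1")
    if global_batch_size % n_replicas != 0:
        raise ValueError("global batch size must be divisible by n_replicas")
    return global_batch_size // n_replicas


def scale_lr(base_lr: float, n_replicas: int) -> float:
    """Hypothetical helper: divide the learning rate by sqrt(n_replicas)."""
    if n_replicas < 1:
        raise ValueError("n_replicas must be at least 1")
    return base_lr / math.sqrt(n_replicas)


if __name__ == "__main__":
    n_replicas = 2  # two GPUs, as in the experiment described above
    print(per_replica_batch_size(64, n_replicas))
    print(scale_lr(1e-3, n_replicas))
```

With a single replica both helpers return their inputs unchanged, so a single-GPU run would be unaffected under these assumptions.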
Ok, I see. Thanks for your reply!
Hi,
Firstly, thanks for your great work! After studying the code, I am wondering about the batch size and learning rate settings in main.py: (1) Why is batch_size divided by 2 in line 70? (2) Why is lr_schedule divided by sqrt(n_replicas) in line 174?