youngwanLEE closed this issue 3 years ago
Hi @youngwanLEE, thank you for your interest in our work!
Regarding the batch size: 256 is the batch size per GPU. The default training command uses 8 GPUs, so the total batch size is 256 × 8 = 2048. You should be able to reproduce accuracy close to our reported results using the command provided in this repo.
Yes, AMP is enabled by default.
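As a quick check of the arithmetic above, here is a minimal sketch of how the per-GPU and total batch sizes relate. The function names are hypothetical and are not part of this repo. The 4-GPU case is only an illustration of keeping the total at 2048, not a tested configuration.

```python
def total_batch_size(per_gpu_batch_size: int, num_gpus: int) -> int:
    """Total (global) batch size when every GPU processes its own mini-batch."""
    return per_gpu_batch_size * num_gpus


def per_gpu_batch_size_for(total: int, num_gpus: int) -> int:
    """Per-GPU batch size needed to reach a given total batch size."""
    if total % num_gpus != 0:
        raise ValueError("total batch size must be divisible by the number of GPUs")
    return total // num_gpus


# Default setting described above: 256 per GPU on 8 GPUs.
print(total_batch_size(256, 8))

# Hypothetical 4-GPU run that keeps the same total batch size.
print(per_gpu_batch_size_for(2048, 4))
```

With fewer GPUs and the same per-GPU value, the total batch size shrinks accordingly, so results may differ from the reported ones.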
@xwjabc Thanks for your quick reply :)
Hi, I'm very impressed by your excellent work! Thanks for sharing your code.
I have questions about the training protocol.
In your paper,
but the training script sets the batch size to 256 instead of 2048.
I have two questions about this.
1) Can I reproduce the reported accuracy with this repo using this command (batch size = 256 instead of 2048)?
2) Does this repo use AMP (automatic mixed precision)?
Thanks in advance :)