Closed 2609780105 closed 9 months ago
I trained on a single GPU and my results are five or six points lower than yours. Why is that? Is there any way to improve them?
A larger batch size is probably all you need; otherwise training will take an extremely long time.
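If a larger batch does not fit in one GPU's memory, one common workaround is gradient accumulation: run several small micro-batches and average their gradients before each optimizer step. This is not something the thread itself proposes. The toy model, data, and function names below are hypothetical and only show that, for a plain mean loss, the accumulated gradient matches the large-batch gradient. Whether it closes the gap here is an assumption, since losses that rely on in-batch negatives or batch statistics do not behave identically under accumulation.

```python
# Minimal sketch (hypothetical toy example, not this repository's training code):
# gradient accumulation for a 1-D linear model y = w * x with mean squared error.

def grad_mse(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Average the gradients of equal-sized micro-batches."""
    assert len(xs) % micro_batch_size == 0, "batch must split evenly"
    steps = len(xs) // micro_batch_size
    total = 0.0
    for i in range(steps):
        lo = i * micro_batch_size
        hi = lo + micro_batch_size
        # Scale each micro-batch gradient by 1/steps so the sum is a mean.
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / steps
    return total


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
w = 0.5

full = grad_mse(w, xs, ys)                             # one batch of 8
acc = accumulated_grad(w, xs, ys, micro_batch_size=2)  # four micro-batches of 2
print(abs(full - acc) < 1e-9)
```

The trade-off is wall-clock time: each optimizer step now costs several forward and backward passes, so training is slower per update than on a multi-GPU setup with the same effective batch size.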