keskarnitish / large-batch-training

Code to reproduce some of the figures in the paper "On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima"
MIT License
138 stars 23 forks

pytorch gpu #5

Open wenwei202 opened 6 years ago

wenwei202 commented 6 years ago

Is there anything that makes a GPU version of the implementation difficult? I tried to compute a linear combination of the SB (small-batch) weights and the LB (large-batch) weights in GPU mode and ran into weird issues. Did you run into similar problems before?
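For reference, the kind of linear combination described above might look roughly like the sketch below. It is not the repository's code. The helper name `interpolate_state_dicts`, the `nn.Linear` stand-in models, and the choice of `alpha = 0.5` are assumptions made for illustration. It follows the convention `alpha * LB + (1 - alpha) * SB`, moves both weight sets to the same device before mixing, and copies non-floating-point buffers instead of interpolating them.

```python
import torch


def interpolate_state_dicts(sd_sb, sd_lb, alpha, device):
    """Hypothetical helper: return alpha * LB + (1 - alpha) * SB per entry."""
    mixed = {}
    for name, w_sb in sd_sb.items():
        # Put both operands on the same device; mixing CPU and GPU tensors
        # in one arithmetic expression raises a RuntimeError.
        w_sb = w_sb.to(device)
        w_lb = sd_lb[name].to(device)
        if torch.is_floating_point(w_sb):
            mixed[name] = alpha * w_lb + (1.0 - alpha) * w_sb
        else:
            # Integer buffers (e.g. BatchNorm's num_batches_tracked) are
            # copied from the SB model rather than interpolated.
            mixed[name] = w_sb.clone()
    return mixed


# Stand-in models (assumption): any two networks with identical architecture.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
torch.manual_seed(0)
model_sb = torch.nn.Linear(4, 2)
model_lb = torch.nn.Linear(4, 2)

# Load the interpolated weights into a third model living on `device`.
model = torch.nn.Linear(4, 2).to(device)
mixed = interpolate_state_dicts(
    model_sb.state_dict(), model_lb.state_dict(), alpha=0.5, device=device
)
model.load_state_dict(mixed)
```

With this convention, `alpha = 0` reproduces the SB weights and `alpha = 1` reproduces the LB weights. Whether a device mismatch is behind the issues I saw is only a guess.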