keskarnitish / large-batch-training

Code to reproduce some of the figures in the paper "On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima"
MIT License
138 stars 23 forks

Is there a Caffe implementation? #2

Open fastalgo opened 7 years ago

fastalgo commented 7 years ago

Is there a Caffe implementation?

Thanks!

keskarnitish commented 7 years ago

Unfortunately, we don't have one. However, the process of building one is the same in any framework: all you need are the two solutions and a function that computes the loss/accuracy at intermediate points between them. If anyone builds the example, we'd be happy to take a look at it.
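For anyone attempting a port, the recipe described above can be sketched in a framework-agnostic way. This is only a minimal illustration, not the repository's code: the names `interpolate_loss`, `evaluate`, `x_small`, and `x_large` are hypothetical, the two solutions are assumed to be flattened into 1-D parameter vectors, and a toy quadratic stands in for a real network's loss. The range of `alpha` from -1 to 2 is likewise an assumption chosen to extend slightly past both solutions.

```python
import numpy as np

def interpolate_loss(x_small, x_large, evaluate, alphas):
    """Evaluate `evaluate` at points on the line through two flattened solutions.

    alpha = 0 corresponds to x_small, alpha = 1 to x_large; values outside
    [0, 1] extrapolate beyond either solution.
    """
    x_small = np.asarray(x_small, dtype=float)
    x_large = np.asarray(x_large, dtype=float)
    results = []
    for alpha in alphas:
        # Intermediate point between (or beyond) the two solutions.
        point = alpha * x_large + (1.0 - alpha) * x_small
        results.append(evaluate(point))
    return np.array(results)

# Toy stand-in for a network's loss (assumption): a simple quadratic.
def toy_loss(x):
    return float(np.sum(x ** 2))

alphas = np.linspace(-1.0, 2.0, 25)
x_small = np.array([1.0, 0.0])   # placeholder for the small-batch solution
x_large = np.array([0.0, 1.0])   # placeholder for the large-batch solution
losses = interpolate_loss(x_small, x_large, toy_loss, alphas)
```

In a Caffe port, `evaluate` would presumably copy the vector back into the network's parameters and run a forward pass over the dataset to obtain loss or accuracy; that framework-specific part is deliberately left out here.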