Closed: marcociccone closed this issue 6 years ago.
Hi, can you please clarify why you average the gradients every n iterations? Is it a way to increase the minibatch size when the batch does not fit in memory? Thanks!

Hi. Exactly! The gradients are also smoother when averaged across many training examples.

Ok, thanks! This is a smart idea, but I think it doesn't work in combination with batchnorm or other normalization methods, where you want to compute the statistics over the whole batch.

Yes, that's true. We did not include batchnorm in our code, though. I guess it is a good compromise when GPU memory gets full.
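For reference, here is a minimal sketch of the gradient-accumulation trick being discussed, written as a generic PyTorch loop. This is not this repo's actual code; `model`, `loader`, and `accumulation_steps` are placeholder names, and the toy data is only there to make the snippet self-contained.

```python
import torch
import torch.nn as nn

# Toy setup; in practice these come from your own project.
model = nn.Linear(10, 2)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Fake data loader: 8 mini-batches of 16 examples each.
loader = [(torch.randn(16, 10), torch.randint(0, 2, (16,)))
          for _ in range(8)]

# Average gradients over n iterations before each update, so the
# effective batch size here is 4 * 16 = 64 examples.
accumulation_steps = 4

optimizer.zero_grad()
for i, (inputs, targets) in enumerate(loader):
    outputs = model(inputs)
    # Scale the loss so the summed gradients equal the average
    # over the whole accumulation window.
    loss = criterion(outputs, targets) / accumulation_steps
    loss.backward()  # gradients accumulate (sum) into .grad buffers

    if (i + 1) % accumulation_steps == 0:
        optimizer.step()       # one update with the averaged gradient
        optimizer.zero_grad()  # clear buffers for the next window
```

As noted above, this only reproduces a large batch for the gradient averaging: a BatchNorm layer would still compute its statistics over each small mini-batch, not over the full accumulated batch.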