facebookresearch / ClassyVision

An end-to-end PyTorch framework for image and video classification

https://classyvision.ai

MIT License

1.59k stars, 278 forks

Gradient accumulation #644

Closed by vreis 3 years ago

vreis commented 3 years ago

Summary: Add support for gradient accumulation. In the task config, you can specify a "simulated_batch_size" argument; gradients are then accumulated over several steps to simulate the specified global batch size. I had to be careful to make sure this works well with grad clipping, but there are tests covering that.
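For readers unfamiliar with the technique, here is a minimal sketch of gradient accumulation combined with gradient clipping. It is not ClassyVision's implementation: it uses plain Python with a one-parameter model, and every name in it (`grad_mse`, `clip`, `accumulated_step`, `full_batch_step`, `micro_batch_size`, `max_norm`) is hypothetical. The idea it illustrates is that each micro-batch gradient is scaled by the number of micro-batches, and clipping is applied once to the accumulated gradient, so the update matches a single step on the full simulated batch.

```python
def grad_mse(w, batch):
    """Gradient of mean squared error for the 1-D model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def clip(grad, max_norm):
    """Clip a scalar gradient to the range [-max_norm, max_norm]."""
    return max(-max_norm, min(max_norm, grad))


def full_batch_step(w, batch, lr, max_norm):
    """Reference: one SGD step computed on the whole batch at once."""
    return w - lr * clip(grad_mse(w, batch), max_norm)


def accumulated_step(w, batch, micro_batch_size, lr, max_norm):
    """One SGD step built from several micro-batches (hypothetical helper)."""
    assert len(batch) % micro_batch_size == 0
    num_micro = len(batch) // micro_batch_size
    accumulated = 0.0
    for i in range(num_micro):
        micro = batch[i * micro_batch_size:(i + 1) * micro_batch_size]
        # Scale each micro-batch gradient so the running sum equals
        # the mean gradient over the full simulated batch.
        accumulated += grad_mse(w, micro) / num_micro
    # Clip once, on the fully accumulated gradient. Clipping each
    # micro-batch gradient separately would generally give a different update.
    return w - lr * clip(accumulated, max_norm)


# Toy data for y = 3x; the simulated batch holds 8 samples.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w_full = full_batch_step(0.0, data, lr=0.01, max_norm=100.0)
w_accum = accumulated_step(0.0, data, micro_batch_size=2, lr=0.01, max_norm=100.0)
```

In this sketch `w_full` and `w_accum` agree up to floating-point error, which is the property the tests mentioned above would presumably need to check when clipping is enabled.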

Differential Revision: D24740329