vreis closed 3 years ago
This pull request was exported from Phabricator. Differential Revision: D24740329
This pull request has been merged in facebookresearch/ClassyVision@1ba6b0354e87c41b5eb26934888d66a55e198b37.
Summary: Add support for gradient accumulation. In the task config, you can specify a "simulated_batch_size" argument; gradients are then accumulated across batches to simulate the specified global batch size. Had to be careful to make sure this works well with gradient clipping, and there are tests covering that case.
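To illustrate the idea, here is a minimal pure-Python sketch of gradient accumulation combined with clipping. It is not the ClassyVision implementation. The helper names (`accumulation_steps`, `clip_by_norm`, `train`), the divisibility check, and the choice to clip once per optimizer step on the accumulated gradient are all assumptions made for illustration.

```python
import math


def accumulation_steps(simulated_batch_size, global_batch_size):
    # Hypothetical helper: number of real batches accumulated per optimizer step.
    # Requiring an exact multiple is an assumption of this sketch.
    if simulated_batch_size % global_batch_size != 0:
        raise ValueError("simulated_batch_size must be a multiple of the global batch size")
    return simulated_batch_size // global_batch_size


def clip_by_norm(grad, max_norm):
    # Rescale the gradient so its L2 norm does not exceed max_norm.
    norm = math.sqrt(sum(g * g for g in grad))
    if norm > max_norm:
        scale = max_norm / norm
        return [g * scale for g in grad]
    return list(grad)


def train(weights, batch_grads, lr, steps, max_norm=None):
    # batch_grads: one gradient list per real batch, each already averaged
    # over the samples of that batch.
    accumulated = [0.0] * len(weights)
    updates = 0
    for i, grad in enumerate(batch_grads, start=1):
        # Divide by `steps` so the accumulated value is the mean gradient
        # over the simulated batch rather than a sum.
        accumulated = [a + g / steps for a, g in zip(accumulated, grad)]
        if i % steps == 0:
            # Assumption: clip the accumulated gradient once, right before
            # the update, not each per-batch gradient separately.
            step_grad = accumulated if max_norm is None else clip_by_norm(accumulated, max_norm)
            weights = [w - lr * g for w, g in zip(weights, step_grad)]
            accumulated = [0.0] * len(weights)
            updates += 1
    return weights, updates
```

For example, with a real global batch size of 64 and `simulated_batch_size` set to 256, this sketch would apply one optimizer update for every 4 batches. Clipping after accumulation keeps the clipping threshold tied to the gradient of the simulated batch, which is why the two features need to be tested together.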
Differential Revision: D24740329