Closed: ShriyaPalsamudram closed this pull request 2 years ago
@mlcommons/wg-training Can folks please review this ahead of this week's working group meeting? Sorry for the late change but we just noticed this.
Tagging some folks explicitly as well: @petermattson @emizan76 (can't find eric's handle)
You mean 0.02 * (1/K), right? Has the convergence behavior for low batch sizes been studied?
OK, hearing no objection, going to let this one in.
We noticed that, given the way the reference works, the hyperparameters for mask-rcnn only make sense for batch sizes of 16 and above. To allow submissions with batch size < 16, we propose to also allow the learning rate to scale as 0.2 * (1/K).
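To illustrate the proposed rule, here is a minimal sketch of linear learning-rate scaling. It assumes the reference base learning rate corresponds to global batch size 16 and that K denotes the factor by which the batch size falls below 16 (i.e. K = 16 / batch_size); the PR does not define K explicitly here, so both the constants and the meaning of K are assumptions for illustration.

```python
# Sketch of linear LR scaling for batch sizes below the reference batch size.
# ASSUMPTIONS (not confirmed by the PR text): the reference hyperparameters
# target global batch size 16, and K = BASE_BATCH / batch_size.
BASE_LR = 0.02    # assumed reference learning rate at batch size 16
BASE_BATCH = 16   # batch size the reference hyperparameters target

def scaled_lr(batch_size: int) -> float:
    """Scale the learning rate linearly with global batch size."""
    if batch_size >= BASE_BATCH:
        # existing convention: scale up linearly for larger batches
        return BASE_LR * batch_size / BASE_BATCH
    # proposed extension: scale down by 1/K for batch sizes below 16
    k = BASE_BATCH / batch_size
    return BASE_LR * (1.0 / k)   # equivalent to BASE_LR * batch_size / 16
```

For example, under these assumptions a submission running at global batch size 8 (K = 2) would use half the reference learning rate.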