mlcommons / training_policies

Issues related to MLPerf™ training policies, including rules and suggested changes
https://mlcommons.org/en/groups/training
Apache License 2.0

[MRCNN] Update 2.0 rules to accommodate small batch sizes #491

Closed ShriyaPalsamudram closed 2 years ago

ShriyaPalsamudram commented 2 years ago

We noticed that the hyperparameter rules for Mask R-CNN only make sense for batch sizes of 16 and above, given how the reference works. To allow submissions with batch size < 16, we propose also allowing the learning rate to scale as 0.2 * 1/K.
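For illustration, here is a minimal sketch of the proposed scaling, assuming it means standard linear learning-rate scaling: the reference learning rate applies at the reference global batch size of 16, and K is the factor by which a submission's batch size falls below 16 (so the learning rate shrinks by the same factor). The function name and parameter names are hypothetical, not from the reference implementation.

```python
def scaled_lr(batch_size, ref_batch_size=16, ref_lr=0.02):
    """Linearly scale the learning rate for batch sizes below the reference.

    Assumes the proposal's K is ref_batch_size / batch_size, i.e. the
    factor by which the batch is shrunk, so lr = ref_lr * (1 / K).
    (The exact constant, 0.2 vs 0.02, is discussed below in the thread.)
    """
    k = ref_batch_size / batch_size
    return ref_lr / k

# Example: a submission running at global batch size 8 (K = 2)
# would use half the reference learning rate.
print(scaled_lr(8))   # 0.01
print(scaled_lr(16))  # 0.02 (reference batch size: unchanged)
```

Whether convergence actually holds under this scaling at small batch sizes is exactly the question raised later in the thread.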

github-actions[bot] commented 2 years ago

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

ShriyaPalsamudram commented 2 years ago

recheck

johntran-nv commented 2 years ago

@mlcommons/wg-training Can folks please review this ahead of this week's working group meeting? Sorry for the late change, but we just noticed this.

Tagging some folks explicitly as well: @petermattson @emizan76 (can't find Eric's handle)

emizan76 commented 2 years ago

You mean 0.02 * (1/K), right? Has the convergence behavior for low batch sizes been studied?

petermattson commented 2 years ago

OK, hearing no objection, going to let this one in.
