Closed by mwawrzos 3 years ago
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅
Hi @petermattson, could someone from Google also approve? I want to make sure all submitters are good with these PRs, since we don't have time to discuss in the SWG. I couldn't assign to Elias; he seems to be missing from the project.
+Elias Mizan @.***> Could you please take a quick look at this? A GitHub issue prevents me from adding you as a reviewer. Thanks much! :-)
I approved, but it says that you have to approve.
`apex.optimizers.FusedLAMB` includes gradient clipping, with the max gradient norm set to `1` (documentation, code). The reference implementation contains the parameter `clip_norm`:
https://github.com/mlcommons/training/blob/8f7f74f88874ae85a58ddedd778c320739b37444/rnn_speech_recognition/pytorch/train.py#L86-L87
This parameter relates to gradient clipping done outside of the optimizer:
https://github.com/mlcommons/training/blob/8f7f74f88874ae85a58ddedd778c320739b37444/rnn_speech_recognition/pytorch/train.py#L466-L470
This parameter is frozen to `None` (training_policies, compliance checker). Such constants may mislead submitters, suggesting that the reference doesn't clip gradients. This PR avoids that confusion. Behavior stays unchanged; the change only exposes the default value already applied inside the optimizer. To minimize the impact of this late change, submitters are allowed to use a parameter value equal to either `1` or `inf` (see https://github.com/mlcommons/training_policies/pull/433).
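For illustration, here is a minimal pure-Python sketch (not the reference code) of clip-by-global-norm, showing why allowing either value preserves the two behaviors: `1` matches the clipping FusedLAMB applies internally by default, while `inf` makes clipping a no-op. The function name and the epsilon term are illustrative assumptions, not taken from the reference implementation.

```python
import math

def clip_grad_norm(grads, max_norm):
    """Rescale gradients so their global L2 norm is at most max_norm.

    Sketch only: a small epsilon guards against division by zero,
    mirroring the usual clip-by-global-norm recipe.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    scale = max_norm / (total_norm + 1e-6)
    if scale < 1.0:
        grads = [g * scale for g in grads]
    return grads, total_norm

# With max_norm=1, a gradient vector of norm 5 is rescaled to norm ~1.
clipped, norm = clip_grad_norm([3.0, 4.0], max_norm=1.0)

# With max_norm=inf, the scale never drops below 1, so gradients
# pass through unchanged -- clipping is effectively disabled.
unclipped, _ = clip_grad_norm([3.0, 4.0], max_norm=float("inf"))
```

So a submitter choosing `inf` runs with no clipping at all, while `1` reproduces the optimizer's built-in behavior; either way the math is explicit rather than hidden behind a frozen `None`.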