ShiqiYu / OpenGait

A flexible and extensible framework for gait recognition. You can focus on designing your own models and comparing with state-of-the-arts easily with the help of OpenGait.

Some confusion when using GradScaler with multiple optimizers #147

Closed: enemy1205 closed this issue 8 months ago

enemy1205 commented 11 months ago

I want to use multiple optimizers together with AMP, so I inherited and rewrote the relevant code, but I ran into a problem with the gradient backward step here. Original:

        if self.engine_cfg['enable_float16']:
            self.Scaler.scale(loss_sum).backward()
            self.Scaler.step(self.optimizer)
            scale = self.Scaler.get_scale()
            self.Scaler.update()
            # Warning caused by optimizer skip when NaN
            # https://discuss.pytorch.org/t/optimizer-step-before-lr-scheduler-step-error-using-gradscaler/92930/5
            if scale != self.Scaler.get_scale():
                self.msg_mgr.log_debug("Training step skip. Expected the former scale equals to the present, got {} and {}".format(
                    scale, self.Scaler.get_scale()))
                return False

Rewritten:

        if self.engine_cfg['enable_float16']:
            self.Scaler.scale(loss_sum).backward()
            # self.optimizer is now a list containing the different optimizers
            for optimizer in self.optimizer:
                self.Scaler.step(optimizer)
            scale = self.Scaler.get_scale()
            self.Scaler.update()
            # Warning caused by optimizer skip when NaN
            # https://discuss.pytorch.org/t/optimizer-step-before-lr-scheduler-step-error-using-gradscaler/92930/5
            if scale != self.Scaler.get_scale():
                self.msg_mgr.log_debug("Training step skip. Expected the former scale equals to the present, got {} and {}".format(
                    scale, self.Scaler.get_scale()))
                return False

Then, in `if scale != self.Scaler.get_scale()`, `scale` always turns out to be half of `self.Scaler.get_scale()`. I know this is a problem in my code, but I don't know how to use `self.Scaler` correctly. I would appreciate an example of the correct modification, thanks!
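For reference, the pattern documented in PyTorch's AMP examples ("Working with multiple models, losses, and optimizers") matches the rewrite above: one `backward()` on the scaled loss, one `Scaler.step()` per optimizer, and exactly one `Scaler.update()` per iteration. A minimal self-contained sketch, standalone PyTorch rather than OpenGait code; the model, optimizers, and data here are made up for illustration, and a CUDA device is assumed:

    import torch

    # Hypothetical setup: two sub-modules trained by separate optimizers.
    device = "cuda"
    model = torch.nn.Sequential(
        torch.nn.Linear(8, 8), torch.nn.Linear(8, 2)).to(device)
    opt_a = torch.optim.SGD(model[0].parameters(), lr=1e-2)
    opt_b = torch.optim.Adam(model[1].parameters(), lr=1e-3)
    scaler = torch.cuda.amp.GradScaler()

    inputs = torch.randn(4, 8, device=device)
    targets = torch.randint(0, 2, (4,), device=device)

    opt_a.zero_grad()
    opt_b.zero_grad()
    with torch.cuda.amp.autocast():
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)

    scaler.scale(loss).backward()  # one backward pass on the scaled loss
    scaler.step(opt_a)             # step() unscales this optimizer's grads and
    scaler.step(opt_b)             # skips the step if any grad is inf/NaN
    scale_before = scaler.get_scale()
    scaler.update()                # exactly one update() per iteration
    # update() lowers the scale (backoff) only when a step was skipped;
    # it *raises* it after `growth_interval` consecutive successful steps.
    if scaler.get_scale() < scale_before:
        print("optimizer step skipped; loss scale reduced")

Note that `update()` changes the scale in two cases: it multiplies by `backoff_factor` (default 0.5) after a skipped step, and by `growth_factor` (default 2.0) after `growth_interval` consecutive successful steps. The `!=` comparison therefore also fires on normal growth, where the old scale is exactly half of the new one; that matches the symptom described, since the branch only runs when the scale changed at all. Comparing with `<`, as in the sketch, distinguishes a real skip from growth.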

github-actions[bot] commented 9 months ago

Stale issue message