cindyxinyiwang / multiDDS

Code for the paper "Balancing Training for Multilingual Neural Machine Translation" (ACL 2020)
MIT License

Trainer Gradient Update For Scorer in both train_step and update_language_sampler #3

Closed · steventan0110 closed this issue 3 years ago

steventan0110 commented 3 years ago

Hi Cindy,

I was studying your code in trainer.py, and it seems that you update the RL scorer (the data actor) in both the update_language_sampler function and the train_step function. Initially I thought the RL scorer was only updated in update_language_sampler(), where you compute the cosine similarity of two gradients, but then I saw the block of code below (which seems to update only the ave_emb actor), so I wonder whether this block is actually used:

# optimize data actor
for k in cached_loss.keys():
    # REINFORCE-style reward: scaled change in loss relative to the cached value
    reward = 1. / eta * (cur_loss[k] - cached_loss[k])
    if self.args.out_score_type == 'sigmoid':
        #loss = -(torch.log(1e-20 + data_actor_out[k]) * reward.data)
        loss = -(data_actor_out[k] * reward.data)
    elif self.args.out_score_type == 'exp':
        loss = -(torch.log(1e-20 + data_actor_out[k]) * reward.data)
    # normalize by batch size before accumulating gradients
    if cur_loss[k].size(0) > 0:
        loss.div_(cur_loss[k].size(0))
    loss.sum().backward()
# the optimizer only steps for the 'ave_emb' data actor
if self.args.data_actor == 'ave_emb':
    self.data_optimizer.step()
    self.data_optimizer.zero_grad()

Thank you for your help and clarification!

cindyxinyiwang commented 3 years ago

Hi,

Sorry I didn't see the question earlier. No, I think the data sampling distribution is only updated in update_language_sampler in the trainer. I'm not sure where the code you saw came from, but it should probably not be used. Sorry if it's some code I didn't clean up!
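
For readers who find this thread later: the update that is actually used, per the reply above, is the one in update_language_sampler, which follows the MultiDDS recipe of rewarding each training language by how well its gradient aligns with the dev-set gradient. Below is a minimal PyTorch sketch of that idea, not the repo's actual code; the names train_grads, dev_grad, and psi are hypothetical, and the shapes are toy-sized.

import torch
import torch.nn.functional as F

# Hypothetical flattened gradients: one per training language, plus a dev-set gradient.
num_langs, dim = 4, 1000
train_grads = [torch.randn(dim) for _ in range(num_langs)]
dev_grad = torch.randn(dim)

# Sampler logits over the training languages (illustrative stand-in for the scorer).
psi = torch.zeros(num_langs, requires_grad=True)

# Reward each language by the cosine similarity of its gradient with the dev gradient.
rewards = torch.stack([F.cosine_similarity(g, dev_grad, dim=0) for g in train_grads])

# REINFORCE-style objective: raise the sampling probability of well-aligned languages.
probs = torch.softmax(psi, dim=0)
loss = -(torch.log(probs + 1e-20) * rewards).sum()
loss.backward()

with torch.no_grad():
    psi -= 0.1 * psi.grad  # plain gradient-descent step on the logits
    psi.grad.zero_()

print(torch.softmax(psi, dim=0))  # aligned languages get upweighted

The block quoted in the question is a different update, a REINFORCE step with a loss-difference reward, which per the reply above is leftover code rather than part of the method.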