akakakakakaa opened this issue 4 years ago · Status: Open
In the pretrain `get_loss` function, `loss_lm` is averaged with `.mean()`. Because of this, all the zero values at unmasked positions in `loss_lm` are treated as correct answers, which dilutes the loss. So I think we need to change the mean to an explicit numerator / denominator, like the TensorFlow implementation:

```python
loss_lm = (loss_lm * masked_weights.float()).mean()
```

to

```python
loss_lm_numerator = (loss_lm * masked_weights.float()).sum()
loss_lm_denominator = masked_weights.sum() + 1e-5
loss_lm = loss_lm_numerator / loss_lm_denominator
```

Is this correct?
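For illustration, here is a minimal, self-contained sketch of the difference (the tensor shapes and values are hypothetical, not taken from the repo). It shows how `.mean()` over all token positions shrinks the loss relative to normalizing by the number of masked positions only:

```python
import torch

# Hypothetical per-token LM losses for one sequence of 6 positions.
# Only the first 2 positions were actually masked for prediction.
loss_lm = torch.tensor([2.0, 3.0, 0.0, 0.0, 0.0, 0.0])
masked_weights = torch.tensor([1, 1, 0, 0, 0, 0])

# Current behavior: averaging over ALL positions, so the zeros at
# unmasked positions pull the loss down (here: 5.0 / 6 ~= 0.833).
loss_mean = (loss_lm * masked_weights.float()).mean()

# Proposed fix: normalize by the number of masked positions only,
# as in the TensorFlow reference (here: 5.0 / 2 = 2.5).
numerator = (loss_lm * masked_weights.float()).sum()
denominator = masked_weights.sum().float() + 1e-5  # epsilon avoids division by zero
loss_masked = numerator / denominator

print(loss_mean.item())    # ~0.8333 -- diluted by unmasked zeros
print(loss_masked.item())  # ~2.5    -- average over masked tokens only
```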