Closed by Eric8932 1 year ago
When tie_weights is enabled and the model has an lm_target, the lm_target should first be made consistent with tgt_embedding.
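A minimal sketch of one reading of this change, in plain Python so it runs without any framework. The Layer and Model classes and the tie_lm_weights helper are hypothetical stand-ins, not the project's actual code, and the fallback to the source-side embedding when no tgt_embedding exists is an assumption. It only illustrates the order of the checks: tgt_embedding is considered before anything else.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Layer:
    # Hypothetical stand-in for a layer that owns a weight matrix.
    weight: List[List[float]] = field(
        default_factory=lambda: [[0.0, 0.0], [0.0, 0.0]]
    )


@dataclass
class Model:
    embedding: Layer                       # source-side embedding
    lm_target: Optional[Layer] = None      # language-model output layer
    tgt_embedding: Optional[Layer] = None  # target-side embedding, if any


def tie_lm_weights(model: Model, tie_weights: bool) -> None:
    """Share the lm_target weight with an embedding, checking tgt_embedding first."""
    if not tie_weights or model.lm_target is None:
        return  # nothing to tie
    if model.tgt_embedding is not None:
        # Preferred case: the output layer shares the target-side weight.
        model.lm_target.weight = model.tgt_embedding.weight
    else:
        # Assumed fallback: share the source-side embedding weight instead.
        model.lm_target.weight = model.embedding.weight


# Model with both embeddings: lm_target ends up tied to tgt_embedding.
seq2seq = Model(embedding=Layer(), lm_target=Layer(), tgt_embedding=Layer())
tie_lm_weights(seq2seq, tie_weights=True)

# Model without tgt_embedding: lm_target falls back to the source embedding.
decoder_only = Model(embedding=Layer(), lm_target=Layer())
tie_lm_weights(decoder_only, tie_weights=True)
```

Tying here means sharing the same object rather than copying values, so a later update to the embedding weight is seen by lm_target as well.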