proposes the MIXER training scheme to resolve the exposure bias issue
Details
exposure bias
at train time, the model is trained to predict the next token given the previous ground-truth tokens, but at test time it must predict the next token given its own previously generated tokens
the discrepancy that arises because the model is only ever exposed to the training data distribution, never to its own predictions, is called exposure bias (see the sketch below)
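To make the train/test mismatch concrete, here is a minimal sketch contrasting teacher-forced training with free-running decoding. The single-layer LSTM decoder, its sizes, and the helper names are illustrative assumptions for this note, not the paper's exact model.

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    # Illustrative single-layer LSTM decoder; structure and sizes are assumptions.
    def __init__(self, vocab_size, hidden_size=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.cell = nn.LSTMCell(hidden_size, hidden_size)
        self.out = nn.Linear(hidden_size, vocab_size)

    def step(self, token, state):
        # One decoding step: previous token in, next-token logits out.
        h, c = self.cell(self.embed(token), state)
        return self.out(h), (h, c)

def xent_training_step(decoder, target, state=None):
    # Train time (teacher forcing): every step is conditioned on the ground-truth previous token.
    loss = 0.0
    token = target[:, 0]  # <bos>
    for t in range(1, target.size(1)):
        logit, state = decoder.step(token, state)
        loss = loss + nn.functional.cross_entropy(logit, target[:, t])
        token = target[:, t]  # ground truth is fed back, regardless of the model's prediction
    return loss

def greedy_decode(decoder, bos, state=None, max_len=50):
    # Test time (free running): every step is conditioned on the model's own previous prediction,
    # a distribution the model never saw during training; this mismatch is the exposure bias.
    token, output = bos, []
    for _ in range(max_len):
        logit, state = decoder.step(token, state)
        token = logit.argmax(dim=-1)
        output.append(token)
    return output
```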
MIXER
trains with XENT only for the first N_xent epochs, then mixes the two losses within each sequence: XENT (teacher forcing) on the earlier time steps and REINFORCE on the final Δ steps, with Δ gradually increased until the whole sequence is trained with REINFORCE
Mixed Incremental Cross-Entropy Reinforce : combines cross-entropy with REINFORCE in an incremental (i.e. curriculum learning) fashion to mitigate exposure bias in cross-entropy training and to tame the exponentially large search space of REINFORCE training (see the sketch after the pseudo-code note below)
pseudo code
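The paper's pseudo code is not reproduced in this note. Below is a rough sketch, reusing the illustrative Decoder above, of one annealing-phase MIXER update as I understand it: cross-entropy on the first T - Δ steps (teacher forcing) and REINFORCE on the last Δ steps using a sequence-level reward. The `reward_fn` helper, the scalar baseline, and any constants are simplified assumptions; the paper uses a learned baseline.

```python
import torch
import torch.nn.functional as F

def mixer_update(decoder, target, delta, reward_fn, state=None, baseline=0.0):
    # One annealing-phase update on a batch of sequences (sketch).
    # Steps t < T - delta: teacher forcing + cross-entropy (XENT region).
    # Steps t >= T - delta: sample from the model + REINFORCE (REINFORCE region).
    T = target.size(1)
    xent_loss, log_probs, sampled = 0.0, [], []

    token = target[:, 0]  # <bos>
    for t in range(1, T):
        logit, state = decoder.step(token, state)
        if t < T - delta:
            # XENT region: loss against the ground-truth token, which is also fed back.
            xent_loss = xent_loss + F.cross_entropy(logit, target[:, t])
            token = target[:, t]
        else:
            # REINFORCE region: the model conditions on its own sampled token.
            dist = torch.distributions.Categorical(logits=logit)
            token = dist.sample()
            log_probs.append(dist.log_prob(token))
            sampled.append(token)

    reinforce_loss = 0.0
    if sampled:
        # Sequence-level reward (e.g. sentence BLEU); reward_fn is a hypothetical helper.
        reward = reward_fn(torch.stack(sampled, dim=1), target)
        advantage = reward - baseline  # the paper learns the baseline; a constant is used here
        reinforce_loss = (-advantage * torch.stack(log_probs).sum(dim=0)).mean()

    return xent_loss + reinforce_loss
```

The curriculum then amounts to calling this update with delta = 0 (pure XENT) for the first N_xent epochs and increasing delta every few epochs until it covers the whole sequence; the exact epoch counts and increment are hyperparameters from the paper that are not reproduced here.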
Experiment
Machine Translation task : IWSLT14 De-En, ~153K training sentence pairs, single-layer LSTM with 256 hidden units
MIXER does improve BLEU, by about +3.0 points
Personal Thoughts
exposure bias is an important issue in MT
there are other biases, including the greedy/beam-search discrepancy between train and test phases
let's look into more papers regarding the above issues
MIXER seems to be a useful, model-agnostic trick to improve MT results, but it did not see wide usage, perhaps due to the instability of REINFORCE
Link : https://arxiv.org/pdf/1511.06732.pdf
Authors : Ranzato et al. 2016