zomux opened this issue 4 years ago
Training
abcirun python lanmt/lm.py --root $HOME/data/wmt14_ende_fair --opt_dtok wmt14_fair_ende --opt_batchtokens 8192 --opt_distill --train
Problem
After refining the vectors many times, the resulting sentence is still not meaningful:
<s> Gut@@ ach : Noch ach Sicherheit . . ger . .
Variations
Problem
loss = copy loss + noise correction loss
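The combined objective can be sketched as below. This is a toy pure-Python illustration, not the lanmt implementation: the names `copy_loss` and `noise_correction_loss`, the MSE form of both terms, and the Gaussian corruption are all assumptions about what "copy loss" and "noise correction loss" mean here.

```python
import random

def mse(a, b):
    # Mean squared error between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def copy_loss(refine, z_target):
    # Assumed: the refiner applied to clean latent vectors should return
    # them unchanged (identity / copy behaviour).
    return mse(refine(z_target), z_target)

def noise_correction_loss(refine, z_target, noise_std=0.1, seed=0):
    # Assumed: corrupt the latent vectors with Gaussian noise and ask the
    # refiner to map them back (a denoising objective).
    rng = random.Random(seed)
    z_noisy = [z + rng.gauss(0.0, noise_std) for z in z_target]
    return mse(refine(z_noisy), z_target)

def total_loss(refine, z_target):
    # loss = copy loss + noise correction loss
    return copy_loss(refine, z_target) + noise_correction_loss(refine, z_target)

# An identity refiner has zero copy loss but a small positive denoising loss.
identity = lambda z: list(z)
print(total_loss(identity, [1.0] * 8))
```

Under this reading, the copy term keeps the refiner stable on already-good vectors while the noise term teaches it to correct perturbed ones.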
qrsh -g gcb50249 -l rt_F=2 $HOME/research/abcirun.sh python lanmt/lm.py --root $HOME/data/wmt14_ende_fair --opt_dtok wmt14_fair_ende --opt_batchtokens 4096 --opt_distill --opt_modeltype realgrad --opt_nrefine 1 --train
Idea:
directly optimize the cross-entropy of token prediction based on the refined vectors (i.e., maximize the likelihood of the target tokens given the refined vectors)
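A minimal sketch of that idea, with the gradient-flow intent written as comments. Everything here is illustrative: `refine`, `decode_logits`, and the toy decoder are hypothetical stand-ins, not functions from the lanmt code.

```python
import math

def softmax_cross_entropy(logits, target_idx):
    # Cross-entropy of one token prediction: -log softmax(logits)[target],
    # computed with the max-subtraction trick for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target_idx]

def token_ce_after_refinement(z, refine, decode_logits, targets, n_refine=3):
    # Run the latent vectors through several refinement steps, then score
    # the token predictions made from the *refined* vectors. In a real
    # autodiff framework the gradient of this loss would flow back through
    # the whole refinement chain, which is the point of the idea above.
    for _ in range(n_refine):
        z = refine(z)
    logits_per_pos = decode_logits(z)
    return sum(softmax_cross_entropy(lg, t)
               for lg, t in zip(logits_per_pos, targets)) / len(targets)

# Toy example: identity refiner, fixed decoder that favours token 0.
identity = lambda z: z
decoder = lambda z: [[2.0, 0.0, 0.0] for _ in z]
loss = token_ce_after_refinement([0.0, 0.0], identity, decoder, [0, 0])
print(loss)
```

Training on this end-to-end loss would directly reward refinement steps that make the final token predictions more accurate, rather than only matching intermediate vectors.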