wangqiangneu / MT-PaperReading

Record my paper reading about Machine Translation and other related works.

20-ACL-Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation #65

Open wangqiangneu opened 4 years ago

wangqiangneu commented 4 years ago

Introduction

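An improvement on mask-predict. The authors first run a few experiments to verify that the encoder matters more in NAT (in practice the same conclusion as in earlier NMT work: the encoder has the larger effect on performance). They then apply masking on both sides. Concretely, the encoder input is masked and predicted BERT-style, the decoder is trained with an n-gram loss (n=2), and these are finally interpolated with the AT loss for training. Decoding is similar to mask-predict, i.e. iterative refinement, except that what gets masked is not a single token but the n-grams used during training.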
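To make the n-gram masking concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function names, the `<mask>` symbol, the uniform random choice of spans at training time, and the "lowest mean confidence" rule for re-masking during refinement are all assumptions made for illustration. The note above only says that whole n-grams (n=2), rather than single tokens, are masked.

```python
import random

MASK = "<mask>"  # hypothetical mask symbol, chosen for this sketch


def mask_random_ngrams(tokens, num_spans, n=2, seed=None):
    """Training-time sketch: mask up to `num_spans` non-overlapping n-token spans.

    Span starts are sampled uniformly at random (an assumption of this sketch).
    Returns the masked sequence and the sorted list of masked positions.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    starts = list(range(len(tokens) - n + 1))
    rng.shuffle(starts)
    covered = set()
    chosen = 0
    for s in starts:
        if chosen == num_spans:
            break
        span = range(s, s + n)
        if covered.intersection(span):
            continue  # skip spans that overlap an already masked span
        covered.update(span)
        for i in span:
            masked[i] = MASK
        chosen += 1
    return masked, sorted(covered)


def remask_low_confidence_ngrams(tokens, confidences, num_spans, n=2):
    """Decoding-time sketch: re-mask the n-grams with the lowest mean confidence.

    Mirrors mask-predict's "re-mask what the model is least sure about" idea,
    but over n-token windows instead of single tokens (assumed scoring rule).
    """
    # Rank every window start by the mean confidence of its n tokens.
    windows = sorted(
        range(len(tokens) - n + 1),
        key=lambda s: sum(confidences[s:s + n]) / n,
    )
    masked = list(tokens)
    covered = set()
    chosen = 0
    for s in windows:
        if chosen == num_spans:
            break
        span = range(s, s + n)
        if covered.intersection(span):
            continue
        covered.update(span)
        for i in span:
            masked[i] = MASK
        chosen += 1
    return masked
```

For example, `remask_low_confidence_ngrams(["the", "cat", "cat", "sat", "down"], [0.9, 0.4, 0.3, 0.8, 0.95], num_spans=1)` masks the repeated "cat cat" bigram as a unit, which is the kind of local error that masking one token at a time can leave half-fixed.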

Interesting points

Paper information

Summary