flrngel / understanding-ai

personal repository

Non-Autoregressive Neural Machine Translation #6

Open flrngel opened 6 years ago

flrngel commented 6 years ago

https://arxiv.org/abs/1711.02281

Abstract

Features

How

1. Introduction

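CNN and SAN (Transformer) models already avoid recurrence at training time, but they still decode autoregressively. The paper's model builds on the Transformer to avoid autoregressive decoding as well.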

2. Background

2.1. Autoregressive Neural Machine Translation

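For non-autoregressive decoding, they treat the output length T as a random variable with its own distribution, instead of letting an end-of-sentence token decide the length.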
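To make the role of T concrete, here are the two factorizations side by side. The notation (p_L for the length distribution, T' for the source length, y_0 for the start token) is my reading of the paper's conventions, so treat the exact indices as an assumption.

```latex
% Autoregressive: each token conditions on all previous tokens
p_{AR}(Y \mid X; \theta) = \prod_{t=1}^{T+1} p(y_t \mid y_{0:t-1}, x_{1:T'}; \theta)

% Naive non-autoregressive: length is predicted first, then tokens independently
p_{NA}(Y \mid X; \theta) = p_L(T \mid x_{1:T'}; \theta) \cdot \prod_{t=1}^{T} p(y_t \mid x_{1:T'}; \theta)
```

The second product has no dependence on earlier outputs, which is what allows all positions to be decoded in parallel.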

2.3. The multimodality problem

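The multimodality problem is that the distribution over target translations is highly multimodal: one source sentence has several valid translations, and a model that predicts every token independently can mix words from different ones.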
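A toy sketch of the problem, using the "Thank you" to German example. The 0.5/0.5 probabilities and the helper names `position_marginals` and `factorized_dist` are made up for illustration; this is not the paper's model, only the independence assumption in isolation.

```python
from itertools import product

# Two equally likely valid translations (probabilities are invented).
true_dist = {
    ("Danke", "schön"): 0.5,
    ("Vielen", "Dank"): 0.5,
}


def position_marginals(dist):
    """Per-position token marginals of a distribution over equal-length sequences."""
    length = len(next(iter(dist)))
    marginals = [{} for _ in range(length)]
    for seq, prob in dist.items():
        for pos, tok in enumerate(seq):
            marginals[pos][tok] = marginals[pos].get(tok, 0.0) + prob
    return marginals


def factorized_dist(marginals):
    """Joint distribution implied by treating every position as independent."""
    joint = {}
    for combo in product(*(m.items() for m in marginals)):
        seq = tuple(tok for tok, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        joint[seq] = prob
    return joint


marginals = position_marginals(true_dist)
independent = factorized_dist(marginals)

# Probability mass that lands on sequences that are not valid translations,
# e.g. ("Danke", "Dank") or ("Vielen", "schön").
invalid_mass = sum(p for seq, p in independent.items() if seq not in true_dist)
print(independent)
print("mass on invalid mixes:", invalid_mass)
```

Even though the per-position marginals are exactly right, half of the probability mass ends up on mixed sequences that never occur in the true distribution.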

3. The non-autoregressive transformer

image

3.3. Modeling fertility to tackle the multimodality problem

Used IBM Model 2 (word alignment) to obtain fertilities.

Definition of fertilities (how many target words each source word aligns to) and its benefit
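A minimal sketch of how fertilities can turn a source sentence into decoder inputs: each source token is copied as many times as its fertility, so the target length is the sum of the fertilities. The function name `copy_by_fertility` and the fertility values below are hypothetical, chosen only to show the mechanics.

```python
def copy_by_fertility(source_tokens, fertilities):
    """Repeat each source token `fertility` times; fertility 0 drops the token."""
    if len(source_tokens) != len(fertilities):
        raise ValueError("need exactly one fertility per source token")
    decoder_inputs = []
    for token, fertility in zip(source_tokens, fertilities):
        decoder_inputs.extend([token] * fertility)
    return decoder_inputs


source = ["We", "totally", "accept", "it", "."]
fertilities = [1, 0, 2, 1, 1]  # invented values, not taken from the paper

decoder_inputs = copy_by_fertility(source, fertilities)
target_length = sum(fertilities)  # the output length follows from the fertilities

print(decoder_inputs)
print(target_length)
```

Once the fertilities are fixed, the output length and a rough source-to-target alignment are fixed too, which narrows down which of the valid translations the decoder is aiming at.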

3.4. Translation predictor and the decoding process

4. Training

I didn't like this section

image

4.2. Fine-Tuning

Uses KL divergence against the autoregressive teacher, RL (REINFORCE), and ordinary backpropagation

Word-level knowledge distillation (Teacher) image

External fertility inference model image
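As a reminder of what the KL term measures, here is a generic sketch of KL divergence between a teacher and a student distribution over one output position. The distributions are invented, and this is the textbook definition rather than the paper's exact fine-tuning loss. Reading "reverse" as KL(student || teacher) is my assumption about the direction used.

```python
import math


def kl_divergence(q, p, eps=1e-12):
    """KL(q || p) for two discrete distributions given as lists of probabilities."""
    # Terms with q_i == 0 contribute nothing; eps guards against log of a zero p_i.
    return sum(
        qi * math.log(qi / max(pi, eps))
        for qi, pi in zip(q, p)
        if qi > 0.0
    )


teacher = [0.7, 0.2, 0.1]  # e.g. autoregressive teacher, made-up numbers
student = [0.5, 0.3, 0.2]  # e.g. non-autoregressive student, made-up numbers

forward_kl = kl_divergence(teacher, student)  # KL(teacher || student)
reverse_kl = kl_divergence(student, teacher)  # KL(student || teacher)

print("forward:", forward_kl)
print("reverse:", reverse_kl)
```

The two directions give different values. The reverse direction penalizes the student for putting mass where the teacher has little, so it favors committing to one mode over spreading across several.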

Todo