0. Paper
@inproceedings{wang-etal-2018-alibaba,
title = "{A}libaba Submission for {WMT}18 Quality Estimation Task",
author = "Wang, Jiayi and
Fan, Kai and
Li, Bo and
Zhou, Fengming and
Chen, Boxing and
Shi, Yangbin and
Si, Luo",
booktitle = "Proceedings of the Third Conference on Machine Translation: Shared Task Papers",
month = oct,
year = "2018",
address = "Belgium, Brussels",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/W18-6465",
doi = "10.18653/v1/W18-6465",
pages = "809--815",
}
1. What is it?
They proposed a strong QE model, QE-Brain, which achieved the No. 1 results in both the word- and sentence-level QE tasks at WMT18.
2. What is amazing compared to previous studies?
They used a bi-directional Transformer LM together with a Bi-LSTM.
Moreover, they proposed an objective function that makes use of the extracted features.
3. Where is the key to technologies and techniques?
Architectures
Bi-directional Transformer LM: extracts latent semantic features between the source and the MT output.
Bi-LSTM: calculates QE scores from the extracted features (a minimal sketch of the two components follows below).
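A minimal sketch of this two-stage design, assuming PyTorch; the class name, feature and hidden dimensions, and the pooling of the Bi-LSTM states are illustrative assumptions, not the paper's exact configuration.

    import torch
    import torch.nn as nn

    class SentenceQE(nn.Module):
        # Sketch: features from a pre-trained bi-directional Transformer LM
        # are fed to a Bi-LSTM, whose final hidden states are mapped to a
        # sentence-level HTER prediction in [0, 1].
        def __init__(self, feature_dim=512, hidden_dim=256):
            super().__init__()
            self.bilstm = nn.LSTM(feature_dim, hidden_dim,
                                  batch_first=True, bidirectional=True)
            self.w = nn.Linear(2 * hidden_dim, 1)  # maps [h_fwd; h_bwd] to a score

        def forward(self, expert_features):
            # expert_features: (batch, mt_len, feature_dim), extracted by the
            # bi-directional Transformer LM from the source / MT-output pair
            _, (h_n, _) = self.bilstm(expert_features)
            h_cat = torch.cat([h_n[0], h_n[1]], dim=-1)      # [h_fwd; h_bwd]
            return torch.sigmoid(self.w(h_cat)).squeeze(-1)  # predicted HTER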
Strategies
Objective function:
In general, the objective function is a regression loss between the predicted quality score and the reference HTER score (a sketch follows this item),
where h is the reference HTER score, w is a weight vector, and h(→) and h(←) are the final hidden states of the bi-directional LSTM.
However, they proposed a new objective function that additionally uses the 17 baseline features f from QuEst++.
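A LaTeX sketch of both objectives, consistent with the definitions above; the squared-error form and the sigmoid link are assumptions, and the paper may use a different regression loss.

    % Sketch of the general (baseline) objective, in the notation above:
    \min_{w} \sum_i \Big( h_i - \mathrm{sigmoid}\big( w^{\top} [\overrightarrow{h}_i ; \overleftarrow{h}_i] \big) \Big)^2

    % Sketch of the proposed extension, which also feeds the 17 QuEst++ features f_i:
    \min_{w_1, w_2} \sum_i \Big( h_i - \mathrm{sigmoid}\big( w_1^{\top} [\overrightarrow{h}_i ; \overleftarrow{h}_i] + w_2^{\top} f_i \big) \Big)^2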
Data augmentation: artificial data created by round-trip translation, as in Automatic Post-Editing (a sketch follows below).
First, train on the artificial QE data; then fine-tune on the pure (real) QE data.
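A rough sketch of generating artificial QE triplets by round-trip translation; the function names here (make_artificial_qe_data, translate_to_source, translate_to_target) are hypothetical placeholders, and the pseudo-HTER labeling step is only indicated in a comment.

    def make_artificial_qe_data(monolingual_targets, translate_to_source, translate_to_target):
        # Round-trip translation: treat the original target sentence as a
        # pseudo post-edit (PE) and the re-translated sentence as the MT
        # output; pseudo HTER labels can then be computed between MT and PE
        # (e.g. with a TER tool).
        triplets = []
        for pe in monolingual_targets:
            src = translate_to_source(pe)   # target -> source (backward MT)
            mt = translate_to_target(src)   # source -> target (forward MT)
            triplets.append((src, mt, pe))  # (source, MT output, pseudo post-edit)
        return triplets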
Ensemble: by a greedy selection method (sketch below).
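A minimal sketch of greedy ensemble selection, assuming simple prediction averaging and Pearson correlation against dev-set HTER references as the selection criterion (both assumptions, not details stated in the note).

    import numpy as np

    def pearson(a, b):
        return float(np.corrcoef(a, b)[0, 1])

    def greedy_ensemble(model_preds, dev_refs):
        # model_preds: list of per-model prediction arrays on the dev set.
        # Greedily add the model whose inclusion (by averaging predictions)
        # most improves Pearson correlation with the references; stop when
        # no remaining candidate improves the score.
        selected, best = [], -1.0
        remaining = list(range(len(model_preds)))
        while remaining:
            gains = [(pearson(np.mean([model_preds[j] for j in selected + [i]], axis=0),
                              dev_refs), i)
                     for i in remaining]
            score, idx = max(gains)
            if score <= best:
                break
            best = score
            selected.append(idx)
            remaining.remove(idx)
        return selected, best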
4. How did they validate it?
They evaluated on both the word- and sentence-level QE tasks.
They achieved SoTA results in both tasks.
5. Is there a discussion?
6. Which paper should I read next?
OpenKiwi
Automatic Post-Editing