Open thangk opened 5 months ago
@thangk thanks for the summary. But I'm not sure I understood the problem and how they address it at a high level. An example would clarify.
Also,
- venue on the title,
- codebase
btw, for our work, we need to show the effect of the translator we use for our task. So, it would be good to check whether the proposed translators can be run within openmt. Otherwise, we have to use their code, which is difficult.
Sure, I'll make an update to this.
Link: ACL Anthology
Main problem
Existing test sets such as IATE, Wiktionary, and TICO use oversimplified constraint settings, which leaves room for improvement in both accuracy and translation quality.
Proposed method
The paper proposes the Robust Terminology Translation (RTT) model, which combines two existing methods, Place-Holder (PH) and Code-Switch (CS), so that their advantages come together: the output is high in accuracy and high in translation quality at the same time.
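To give a concrete example of the two building blocks, here is a minimal Python sketch of how placeholder-style and code-switch-style inputs are commonly prepared for a terminology-constrained translator. It is an illustration only, not the paper's RTT implementation: the function names, the `<T0>` tag format, and the English-German example sentence are all assumptions made for this sketch.

```python
import re


def apply_placeholder(source, term_dict):
    """Placeholder (PH) style: swap each source term for an indexed tag.

    Returns the tagged source and a tag -> target-term mapping.
    The <T{i}> tag format is a hypothetical choice for this sketch.
    """
    mapping = {}
    tagged = source
    for i, (src_term, tgt_term) in enumerate(term_dict.items()):
        tag = f"<T{i}>"
        pattern = r"\b" + re.escape(src_term) + r"\b"
        if re.search(pattern, tagged):
            tagged = re.sub(pattern, lambda m, t=tag: t, tagged)
            mapping[tag] = tgt_term
    return tagged, mapping


def restore_placeholder(translation, mapping):
    """Copy each target term verbatim into the translated output."""
    for tag, tgt_term in mapping.items():
        translation = translation.replace(tag, tgt_term)
    return translation


def apply_code_switch(source, term_dict):
    """Code-switch (CS) style: put the target term directly into the source."""
    switched = source
    for src_term, tgt_term in term_dict.items():
        pattern = r"\b" + re.escape(src_term) + r"\b"
        switched = re.sub(pattern, lambda m, t=tgt_term: t, switched)
    return switched


# Hypothetical example sentence and terminology entry.
source = "The patient received a vaccine yesterday."
terms = {"vaccine": "Impfstoff"}

ph_input, ph_map = apply_placeholder(source, terms)
cs_input = apply_code_switch(source, terms)

# Stand-in for what a translator might return for the PH input.
mock_translation = "Der Patient erhielt gestern einen <T0>."
ph_output = restore_placeholder(mock_translation, ph_map)

print(ph_input)
print(cs_input)
print(ph_output)
```

In the PH path the target term is pasted into the output exactly as it appears in the dictionary, so the constraint is always satisfied but the term cannot be inflected to fit the sentence. In the CS path the model sees the target term in context and is free to adapt it, or to ignore it.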
My Summary
The proposed method (RTT) outperforms PH and CS, each of which is strong on only one of BLEU/COMET or SCA, by achieving high translation quality and high constraint accuracy at the same time. RTT achieves an average BLEU score of 40.2 and a COMET score of 0.4866. On the SCA test, RTT falls slightly behind PH but leads CS by about 20%. In the ablation study, the two components term embedding (TermE) and loss masking (LM) yielded the best results. One major drawback of the method comes from its hard-copy mechanism, which may perform poorly in more complex languages such as Arabic, where phrases or terms often contain conjunctions or prepositions; this area is open to further study.
Datasets
- Custom test set derived from English-German (to make a challenging test set)
- WMT16 English-German
- WMT18 English-German (Europarl, News Commentary)
- IATE
- Wiktionary