The goal of this project is to build a neural machine translation system and experience how recent advances have made their way into such systems. We will build the following sequence of neural translation systems for two language pairs, Vietnamese (Vi)→English (En) and Chinese (Zh)→En (prepared corpora will be provided):
1. Recurrent neural network based encoder-decoder without attention.
2. Recurrent neural network based encoder-decoder with attention.
3. Replace the recurrent encoder with either a convolutional or a self-attention based encoder.
4. [Optional] Build a fully self-attention translation system, a multilingual translation system, or both.
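To make the difference between the systems with and without attention concrete, here is a minimal sketch of a single attention step. It assumes dot-product scoring, which the project description does not prescribe (additive scoring is an equally valid choice), and it uses plain Python lists instead of a deep learning framework. The names `softmax` and `dot_product_attention` are illustrative, not part of any provided code.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot_product_attention(decoder_state, encoder_states):
    """Return (weights, context) for one decoder step.

    decoder_state:  list of d floats (current decoder hidden state)
    encoder_states: list of d-float lists, one per source token
    Assumption: scores are plain dot products.
    """
    # One score per source position.
    scores = [sum(q * k for q, k in zip(decoder_state, h))
              for h in encoder_states]
    # Normalize scores into a distribution over source positions.
    weights = softmax(scores)
    # Context vector: attention-weighted sum of encoder states.
    dim = len(decoder_state)
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Toy example with three source tokens and hidden size 2.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = dot_product_attention(decoder_state, encoder_states)
```

In the system without attention, the decoder sees only a fixed summary of the source (typically the final encoder state). With attention, a fresh `context` vector like the one above is computed at every decoding step.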
We are expected to implement these on our own (if necessary), experiment with them on both language pairs, report their performance (measured in terms of automatic evaluation metrics), and analyze their behaviours and properties.
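The description does not name a specific automatic evaluation metric. As an illustration only, the sketch below assumes BLEU and implements a simplified sentence-level variant (single reference, no smoothing, pre-tokenized input). The function names are hypothetical, and for reported results an established tool such as sacreBLEU, which computes corpus-level scores, would be the safer choice.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n])
                   for i in range(len(tokens) - n + 1))

def sentence_bleu(hypothesis, reference, max_n=4):
    """Simplified BLEU for one hypothesis and one reference.

    Assumptions: token lists as input, a single reference,
    uniform n-gram weights, and no smoothing (any zero
    n-gram precision yields a score of 0.0).
    """
    if not hypothesis:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hypothesis, n)
        ref_counts = ngrams(reference, n)
        total = sum(hyp_counts.values())
        # Clip each hypothesis n-gram count by its reference count.
        clipped = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    if len(hypothesis) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(hypothesis))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)

reference = "the cat sat on the mat".split()
exact = sentence_bleu(reference, reference)
shorter = sentence_bleu("the cat sat on the".split(), reference)
unrelated = sentence_bleu("a dog ran in a park".split(), reference)
```

Here `exact` scores highest, `shorter` is reduced only by the brevity penalty, and `unrelated` scores zero because it shares no n-grams with the reference.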
The final report should include a description of the task, models, experiments, and conclusion, and be at most 6 pages long, excluding references, for which there is no page limit. It must be prepared in LaTeX in conformance with the NAACL style (a.k.a. ACL style for 8.5x11” paper). Our paper should include a link to a public GitHub repository containing the code used in our experiments.

Contribution Statements

The final report must state (in one or two sentences) the contributions of each team member. Team projects which fail to include this will receive a 50% grade deduction.