la-serene / English-German-Translation-System

An English - German Neural Machine Translation System informed by Bahdanau et al. (2014).
https://la-serene.github.io/deploy-tf-model-on-aws/
Apache License 2.0

Introduction

Google Colab

An English-German neural machine translator based on the attention mechanism introduced by Bahdanau et al. (2014) [1].

Note: Since this project was carried out for educational purposes, the overall performance may not be satisfactory.


Installation

After cloning the repository, install the required packages by running the following command in the terminal.

pip install -r requirements.txt

Usage

Pass the desired model name (see weights.jsonl for the available models) to the CLI, then enter the source English sentence.

python predict.py

Here is a description of the command-line arguments:

--model_name: Name of the saved model. Default: "v9".

Currently, there are no additional configurable command-line arguments. In the future, an optional temperature argument may be added.
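For illustration, here is a minimal sketch of how such a temperature parameter typically works during decoding: the output logits are divided by the temperature before softmax sampling, so lower values make the most likely token dominate. The function below is a hypothetical helper for illustration, not code from this repository:

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0, rng=random):
    """Sample a token index from logits scaled by a temperature.

    temperature < 1 sharpens the distribution; temperature > 1 flattens it.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)                                # subtract max for stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r <= cum:
            return i
    return len(probs) - 1

# With a low temperature, the highest-logit token is sampled almost always.
rng = random.Random(0)
counts = [0, 0, 0]
for _ in range(1000):
    counts[sample_with_temperature([2.0, 1.0, 0.1], temperature=0.5, rng=rng)] += 1
print(counts.index(max(counts)))  # 0
```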

Data

The dataset is taken from the Kaggle dataset English To German. It consists of around 277,891 English-German sentence pairs, randomly split into training, evaluation, and test sets in a 0.8/0.1/0.1 ratio.
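A split like that can be sketched in a few lines of Python. split_dataset below is a hypothetical helper for illustration, not code from this repository:

```python
import random

def split_dataset(pairs, train_frac=0.8, eval_frac=0.1, seed=42):
    """Shuffle sentence pairs and split them into train/eval/test sets."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)            # deterministic shuffle
    n = len(pairs)
    n_train = int(n * train_frac)
    n_eval = int(n * eval_frac)
    train = pairs[:n_train]
    evaluation = pairs[n_train:n_train + n_eval]
    test = pairs[n_train + n_eval:]               # remainder goes to test
    return train, evaluation, test

# Toy example with 10 pairs; the real file holds ~277,891.
pairs = [(f"en {i}", f"de {i}") for i in range(10)]
train, evaluation, test = split_dataset(pairs)
print(len(train), len(evaluation), len(test))  # 8 1 1
```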

Training

This project can now be trained on any English-German dataset. Note that a custom dataset must follow the same format as the files in the dataset folder.

Here is a description of the command-line arguments:

--data_path: Path to the dataset file. Default: "./dataset/en_de.txt".
--embedding_size: Size of the embedding layer. Default: 128.
--hidden_units: Number of hidden units in the model. Default: 128.
--epochs: Number of epochs for training. Default: 5.
--learning_rate: Learning rate for the optimizer. Default: 0.001.
--save_path: Path to save the trained model weights. Default: "./weights/custom_v1.keras".
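As a sketch, these options could be parsed with argparse as follows. The flags and defaults are the ones listed above, but parse_train_args is a hypothetical name and the actual training script may be organized differently:

```python
import argparse

def parse_train_args(argv=None):
    """Build a parser matching the training options listed above."""
    p = argparse.ArgumentParser(description="Train the translation model")
    p.add_argument("--data_path", default="./dataset/en_de.txt",
                   help="Path to the dataset file.")
    p.add_argument("--embedding_size", type=int, default=128,
                   help="Size of the embedding layer.")
    p.add_argument("--hidden_units", type=int, default=128,
                   help="Number of hidden units in the model.")
    p.add_argument("--epochs", type=int, default=5,
                   help="Number of training epochs.")
    p.add_argument("--learning_rate", type=float, default=0.001,
                   help="Learning rate for the optimizer.")
    p.add_argument("--save_path", default="./weights/custom_v1.keras",
                   help="Where to save the trained weights.")
    return p.parse_args(argv)

args = parse_train_args(["--epochs", "10"])
print(args.epochs, args.learning_rate)  # 10 0.001
```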

Results

Since the previous version, integrating Bahdanau attention into the seq2seq model has yielded a significant improvement in overall performance; translations of long sentences in particular have become more satisfactory. Repeatedly sampling new translations is a simple way both to find a better translation and to probe the model's quality.
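For readers unfamiliar with Bahdanau (additive) attention, here is a minimal dependency-free sketch of the score computation, score_j = v · tanh(W_q q + W_k k_j), followed by a softmax over the encoder states. All names, matrices, and toy values below are illustrative, not taken from this repository:

```python
import math

def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(row[t] * x[t] for t in range(len(x))) for row in m]

def bahdanau_scores(query, keys, w_q, w_k, v):
    """Additive attention scores: score_j = v . tanh(W_q q + W_k k_j)."""
    q_proj = matvec(w_q, query)
    scores = []
    for k in keys:
        k_proj = matvec(w_k, k)
        hidden = [math.tanh(a + b) for a, b in zip(q_proj, k_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    return scores

def softmax(xs):
    m = max(xs)                                  # subtract max for stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Toy example: one 2-d decoder state, three 2-d encoder states.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
w_q = [[1.0, 0.0], [0.0, 1.0]]                   # identity projections
w_k = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]
weights = softmax(bahdanau_scores(query, keys, w_q, w_k, v))
print(round(sum(weights), 6))  # 1.0
```

The context vector fed to the decoder is then the weights-weighted sum of the encoder states.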

Here are some translated examples. Note that the German output has been lightly edited to follow German grammar. As noted above, translations should not be expected to fully convey the meaning of the original sentence.

| English | German |
| --- | --- |
| This is the first translation. | Danke bitte wahrend das sonst fischen. |
| I didn't go to his birthday yesterday. | Er arzt gleich den originals uber in den weisen. |
| He tries to read a book every week. | Er schwierig eine, banane wir gestern zu einem. |

Deployment

A workshop on how to deploy this model can be found in the deploy-tf-model-on-aws repo.

References

[1] Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio (2014). Neural Machine Translation by Jointly Learning to Align and Translate. ICLR 2015.

[2] Ilya Sutskever, Oriol Vinyals, Quoc V. Le (2014). Sequence to Sequence Learning with Neural Networks. NIPS 2014.