Seq2Seq models
This is a project for learning to implement different seq2seq models in TensorFlow.
This project is intended for learning only, which means it may contain many bugs. I suggest using the nmt project to run experiments and train seq2seq models; you can find it in the Reference section.
Experiments
I am experimenting with CopyNet and pointer-generator models on the LCSTS dataset; you can find the code in the lcsts branch.
Issues and suggestions are welcome.
Models
The models I have implemented are as follows:
- Basic seq2seq model
- A model with a bidirectional RNN encoder and an attention mechanism
- Seq2seq model
- Same as the basic model, but using the tf.data pipeline to process input data
- GNMT model
- Residual connections and attention, same as the GNMT model, to speed up training
- refer to GNMT for more details
- Pointer-Generator model
- CopyNet model
- A model that also supports the copy mechanism
- refer to CopyNet for more details.
For implementation details, refer to the ReadMe in each model folder.
Structure
A typical sequence-to-sequence (seq2seq) model contains an encoder, a decoder, and an attention structure. TensorFlow provides many useful APIs for implementing a seq2seq model; usually you will need the following:
- tf.contrib.rnn
- tf.contrib.seq2seq
- Provides different attention mechanisms and also a good implementation of beam search
- tf.data
- data preprocessing pipeline APIs
- Other APIs you need to build and train a model
Encoder
Use either:
- Multi-layer RNN
- use the last state of the last RNN layer as the initial decoder state
- Bidirectional RNN
- use a Dense layer to convert the forward and backward states to the initial decoder state
- GNMT encoder
- a bidirectional RNN + several RNN layers with residual connections
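The bidirectional case above needs one extra step, since the forward and backward states together are twice the decoder's hidden size. A minimal numpy sketch of that merge (the Dense weights `W` and `b` here are random stand-ins for parameters a real model would learn):

```python
import numpy as np

rng = np.random.default_rng(0)

hidden = 4                                 # illustrative RNN hidden size
fw_state = rng.normal(size=(1, hidden))    # last state of the forward RNN
bw_state = rng.normal(size=(1, hidden))    # last state of the backward RNN

# Concatenate forward and backward states: shape (1, 2 * hidden)
merged = np.concatenate([fw_state, bw_state], axis=-1)

# Dense projection back down to the decoder's hidden size,
# standing in for a learned tf.layers.dense call.
W = rng.normal(size=(2 * hidden, hidden))
b = np.zeros(hidden)
init_decoder_state = np.tanh(merged @ W + b)

print(init_decoder_state.shape)  # (1, 4)
```

The same projection idea applies whenever encoder and decoder state sizes differ.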
Decoder
- Use a multi-layer RNN, and set the initial state of each layer to the initial decoder state
- GNMT decoder
- only apply attention to the bottom layer of the decoder, so we can utilize multiple GPUs during training
Attention
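At each decoding step, attention scores the encoder outputs against the current decoder state, normalizes the scores with a softmax, and returns a weighted sum (the context vector). A numpy sketch of the Luong-style multiplicative variant (the weight matrix `W` is a random stand-in for a learned parameter; in the actual code this is handled by tf.contrib.seq2seq's attention mechanisms):

```python
import numpy as np

rng = np.random.default_rng(1)

src_len, hidden = 5, 4
enc_outputs = rng.normal(size=(src_len, hidden))  # encoder output per source step
dec_state = rng.normal(size=(hidden,))            # current decoder state

# Multiplicative score: s_i = dec_state . W . enc_outputs[i]
W = rng.normal(size=(hidden, hidden))             # learned in a real model
scores = enc_outputs @ (W @ dec_state)            # shape (src_len,)

# Softmax over source positions gives the attention weights
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# Context vector: attention-weighted sum of encoder outputs
context = weights @ enc_outputs                   # shape (hidden,)

print(weights.round(3), context.shape)
```

The context vector is then combined with the decoder state to produce the next output distribution.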
Metrics
Right now I only have cross-entropy loss. I will add the following metrics:
- BLEU
- ROUGE
- for summarization problems
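To make the planned metrics concrete, here is a simplified sketch of ROUGE-1 F1 (clipped unigram overlap between candidate and reference); real evaluation should use an established ROUGE implementation, since the official metric includes details this sketch omits:

```python
from collections import Counter

def rouge_1_f(candidate, reference):
    """Unigram ROUGE-1 F1 between two token lists (simplified sketch)."""
    cand, ref = Counter(candidate), Counter(reference)
    overlap = sum((cand & ref).values())   # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)

cand = "the cat sat on the mat".split()
ref = "the cat is on the mat".split()
print(round(rouge_1_f(cand, ref), 3))  # → 0.833
```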
Dependency
Run
Run the model on a toy dataset, i.e. reversing a sequence.
train:
python -m bin.toy_train
inference:
python -m bin.toy_inference
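The toy task itself is simple: the target sequence is the source sequence reversed. A plain-Python sketch of generating one such training pair (the vocabulary and length range here are illustrative, not the repo's actual settings):

```python
import random

random.seed(0)
vocab = list("abcdefghij")  # illustrative toy vocabulary

def make_example(min_len=3, max_len=8):
    """One toy training pair: the target is the source reversed."""
    length = random.randint(min_len, max_len)
    src = [random.choice(vocab) for _ in range(length)]
    tgt = src[::-1]
    return src, tgt

src, tgt = make_example()
print(" ".join(src), "->", " ".join(tgt))
```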
You can also run on the en-vi dataset; refer to en_vietnam_train.py in bin for more details.
You can find more training scripts in bin directory.
Reference
Thanks to the following resources: