This is the PyTorch implementation of the paper: Incremental Transformer with Deliberation Decoder for Document Grounded Conversations. Zekang Li, Cheng Niu, Fandong Meng, Yang Feng, Qian Li, Jie Zhou. ACL 2019.
This code is based on OpenNMT-py. If you use any source code included in this repo in your work, please cite the following papers:
@inproceedings{zekangli2019incremental,
  author = {Zekang Li and Cheng Niu and Fandong Meng and Yang Feng and Qian Li and Jie Zhou},
  title = {Incremental Transformer with Deliberation Decoder for Document Grounded Conversations},
  booktitle = {Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics},
  year = {2019}
}
and
@inproceedings{opennmt,
  author = {Guillaume Klein and Yoon Kim and Yuntian Deng and Jean Senellart and Alexander M. Rush},
  title = {Open{NMT}: Open-Source Toolkit for Neural Machine Translation},
  booktitle = {Proc. ACL},
  year = {2017},
  url = {https://doi.org/10.18653/v1/P17-4012},
  doi = {10.18653/v1/P17-4012}
}
Document Grounded Conversations is a task to generate dialogue responses when chatting about the content of a given document. Document knowledge plays a critical role in this task, yet existing dialogue models do not exploit such knowledge effectively. In this paper, we propose a novel Transformer-based architecture for multi-turn document grounded conversations. In particular, we devise an Incremental Transformer to encode multi-turn utterances along with knowledge in related documents. Motivated by the human cognitive process, we design a two-pass decoder (Deliberation Decoder) to improve context coherence and knowledge correctness. Our empirical study on a real-world Document Grounded Dataset shows that the responses generated by our model significantly outperform those of competitive baselines on both context coherence and knowledge relevance.
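For intuition only, below is a minimal PyTorch sketch of the two-pass decoding idea; the shapes, layer counts, and names are made up, and the repo's actual Incremental Transformer and Deliberation Decoder are implemented in its OpenNMT-py modules, not like this.

import torch
import torch.nn as nn

# Sketch of deliberation (attention masks omitted for brevity):
# pass 1 drafts a response from the dialogue context; pass 2 rewrites
# the draft while also attending to the document knowledge.
d_model, nhead, layers = 512, 8, 3
first_pass = nn.TransformerDecoder(nn.TransformerDecoderLayer(d_model, nhead), layers)
second_pass = nn.TransformerDecoder(nn.TransformerDecoderLayer(d_model, nhead), layers)

tgt = torch.randn(20, 2, d_model)         # embedded target, (tgt_len, batch, d_model)
context = torch.randn(60, 2, d_model)     # encoded multi-turn utterances + knowledge
knowledge = torch.randn(200, 2, d_model)  # encoded document knowledge

draft = first_pass(tgt, context)               # pass 1: context-coherent draft
memory = torch.cat([knowledge, draft], dim=0)  # pass 2 sees knowledge and the draft
final = second_pass(tgt, memory)
print(final.shape)  # torch.Size([20, 2, 512])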
We use the dataset proposed in A Dataset for Document Grounded Conversations. Because there is some overlap between the training and test sets in the original dataset, we removed the duplicates and reformatted the data for our model. Please download the processed data here.
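OpenNMT-py expects the src/tgt/knl files of each split to be parallel, one example per line. After downloading, a quick sanity check along these lines (file names taken from the commands below) can catch misaligned files:

# Check that the src/tgt/knl files of each split are line-aligned.
for split in ("train", "valid", "test"):
    counts = {
        name: sum(1 for _ in open(f"data/{name}-{split}-tokenized.txt", encoding="utf-8"))
        for name in ("src", "tgt", "knl")
    }
    assert len(set(counts.values())) == 1, f"{split} files are misaligned: {counts}"
    print(split, counts)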
Requirements
pip install -r requirements.txt
Preprocess
python preprocess.py \
--train_src data/src-train-tokenized.txt \
--valid_src data/src-valid-tokenized.txt \
--train_knl data/knl-train-tokenized.txt \
--valid_knl data/knl-valid-tokenized.txt \
--train_tgt data/tgt-train-tokenized.txt \
--valid_tgt data/tgt-valid-tokenized.txt \
--save_data data/cmu_movie \
-dynamic_dict \
-share_vocab \
-src_seq_length_trunc 50 \
-tgt_seq_length_trunc 50 \
-knl_seq_length_trunc 200 \
-src_seq_length 150 \
-knl_seq_length 800
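In OpenNMT-py terms, -dynamic_dict builds the source-target copy dictionaries used by copy attention, -share_vocab merges the source and target vocabularies, the *_seq_length_trunc options truncate each sequence to the given number of tokens, and -src_seq_length / -knl_seq_length filter out examples longer than those limits; the knl variants are this repo's additions for the knowledge input. Following OpenNMT-py naming conventions, preprocessing should write binarized shards and a vocabulary next to the --save_data prefix (e.g. data/cmu_movie.train.0.pt, data/cmu_movie.valid.0.pt, data/cmu_movie.vocab.pt).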
Train
python train.py -config config/config-transformer-base-1GPU.yml
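The YAML config follows OpenNMT-py's option names (data prefix, transformer sizes, optimizer, training steps, GPU settings); see config/config-transformer-base-1GPU.yml for the exact values used here. Per OpenNMT-py conventions, checkpoints are saved as <save_model>_step_<N>.pt, which is where the models/base_model_step_20000.pt used below comes from.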
Generate
python translate.py \
--src data/src-test-tokenized.txt \
--tgt data/tgt-test-tokenized.txt \
--knl data/knl-test-tokenized.txt \
--model models/base_model_step_20000.pt \
--output pred.txt \
-replace_unk \
-report_bleu \
-dynamic_dict \
-gpu 1 \
-batch_size 32
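Besides -report_bleu, you can score pred.txt yourself. As one illustration (sacrebleu is not a dependency of this repo; install it separately with pip install sacrebleu):

# Optional scoring outside translate.py.
import sacrebleu

hyps = [line.strip() for line in open("pred.txt", encoding="utf-8")]
refs = [line.strip() for line in open("data/tgt-test-tokenized.txt", encoding="utf-8")]
print(sacrebleu.corpus_bleu(hyps, [refs]).score)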
We add knowledge input support to OpenNMT-py. For further usage, please refer to the OpenNMT-py documentation.