Our code will be made available upon acceptance. Stay tuned for updates!
The official PyTorch implementation of Encode Once and Decode in Parallel: Efficient Transformer Decoding. Please refer to our paper for details.

Encode Once and Decode in Parallel: Efficient Transformer Decoding. Bo-Ru Lu, Nikita Haduong, Chien-Yu Lin, Hao Cheng, Noah A. Smith, Mari Ostendorf. Preprint. 2024. [paper]
If you use any source code from this repository in your work, please cite the following paper. The BibTeX entry is listed below:
@misc{lu2024encode,
title={Encode Once and Decode in Parallel: Efficient Transformer Decoding},
author={Bo-Ru Lu and Nikita Haduong and Chien-Yu Lin and Hao Cheng and Noah A. Smith and Mari Ostendorf},
year={2024},
eprint={2403.13112},
archivePrefix={arXiv},
primaryClass={cs.CL}
}