santhoshkolloju opened this issue 5 years ago
The transformer beam search is adapted from the official implementation (tensor2tensor). Not sure how it can be sped up.
A possible way would be to use a more efficient variant of the transformer decoder (e.g., Transformer-XL). We don't have the bandwidth for this at the moment, though. Any contributions are welcome.
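For context, much of the speedup from more efficient decoder variants comes from caching each layer's key/value tensors so that every new step only attends over the cached past instead of re-running the whole prefix. A minimal NumPy sketch of that idea; all names and shapes here are hypothetical and not Texar's actual API:

```python
import numpy as np

def attention(q, k, v):
    """Single-head scaled dot-product attention of the newest query over the cached prefix."""
    scores = q @ k.T / np.sqrt(q.shape[-1])        # (1, t)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ v                              # (1, d)

def decode_with_cache(num_steps, d=64, seed=0):
    """Greedy-style loop: each step projects only the newest token and appends
    its key/value to a growing cache, rather than recomputing the whole prefix."""
    rng = np.random.default_rng(seed)
    wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
    x = rng.standard_normal((1, d))                 # stand-in embedding of the current token
    k_cache = np.empty((0, d))
    v_cache = np.empty((0, d))
    for _ in range(num_steps):
        k_cache = np.vstack([k_cache, x @ wk])      # O(1) new projection work per step
        v_cache = np.vstack([v_cache, x @ wv])
        out = attention(x @ wq, k_cache, v_cache)
        x = out                                     # stand-in for "embed the predicted token"
    return x

decode_with_cache(250)
```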
Same question.
I have been using a beam size of 3 and alpha 1.0 for beam search decoding, and it looks like it is very slow. Greedy search takes around 30-40 seconds to generate a sequence of 250 words, but beam search takes around 2 minutes.
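For what it's worth, in tensor2tensor-style beam search `alpha` is the exponent of the GNMT length penalty, and the decoder work grows roughly linearly with the beam width, which is consistent with beam size 3 being about three times slower than greedy here. A small illustrative sketch (the numbers are just the figures quoted above):

```python
def length_penalty(length, alpha):
    """GNMT-style length penalty used in tensor2tensor-style beam search:
    finished hypotheses are scored as log_prob / length_penalty."""
    return ((5.0 + length) / 6.0) ** alpha

# With alpha = 1.0 a 250-token hypothesis is divided by ~42.5, strongly
# favouring long outputs; a smaller alpha (e.g. 0.6) penalises length less.
print(length_penalty(250, 1.0))   # ~42.5
print(length_penalty(250, 0.6))   # ~9.5

# Rough cost model (hypothetical): each decoding step runs the decoder for
# every beam, so beam_width 3 is roughly 3x the per-step work of greedy.
greedy_seconds = 35               # observed 30-40 s for 250 tokens
beam_width = 3
print(greedy_seconds * beam_width)  # ~105 s, close to the ~2 minutes observed
```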
Can you help me improve the inference speed? I tried quantising the model to 8 bits; it decreased the size of the model, but the inference time remains the same.
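One possible reason the 8-bit quantisation did not change latency: if it is weight-only quantisation, or the runtime has no int8 kernels, the weights are dequantised back to float32 before each matmul, so the compute per decoding step is unchanged even though the stored model is smaller. A small self-contained illustration of that effect (not tied to this repo):

```python
import time
import numpy as np

rng = np.random.default_rng(0)
w_fp32 = rng.standard_normal((1024, 1024)).astype(np.float32)
x = rng.standard_normal((1, 1024)).astype(np.float32)

# Weight-only int8 quantisation: store weights as int8 plus one scale factor.
scale = np.abs(w_fp32).max() / 127.0
w_int8 = np.round(w_fp32 / scale).astype(np.int8)   # ~4x smaller to store

def step_fp32():
    return x @ w_fp32

def step_dequant():
    # Without int8 kernels, the weights are dequantised to float32 first,
    # so the matmul cost is the same as (or worse than) the float32 path.
    return x @ (w_int8.astype(np.float32) * scale)

for fn in (step_fp32, step_dequant):
    t0 = time.perf_counter()
    for _ in range(100):
        fn()
    print(fn.__name__, time.perf_counter() - t0)
```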
Any help is appreciated.
Thanks