Kyubyong / transformer

A TensorFlow Implementation of the Transformer: Attention Is All You Need
Apache License 2.0
4.27k stars · 1.29k forks

How to improve the result? (beam search already added) #110

Open trx14 opened 5 years ago

trx14 commented 5 years ago

I want to improve the Transformer model. I think tensor2tensor is too big to modify, so I chose this code. First, I have to reproduce the previously reported result. I added beam search myself (before adding beam search, the BLEU score was 28.34) and trained the model on the IWSLT14 de-en dataset (many papers use this dataset, not IWSLT16 de-en). I set all hyperparameters the same as the base model in the original paper, but I only get 30.34 BLEU, while many papers report 32.86. Does anyone know how to improve the result? Or are there other public Transformer projects that are easy to modify?
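For anyone asking how the beam search part might look: below is a minimal, framework-agnostic sketch of the standard algorithm, not the exact code used above. The `step_fn` callback (which should return next-token log-probabilities given a prefix) and the token IDs are assumptions for illustration; in this repo you would back `step_fn` with a session run over the decoder logits.

```python
import numpy as np

def beam_search(step_fn, sos_id, eos_id, beam_size=4, max_len=50):
    """Minimal beam search sketch (hypothetical helper, not this repo's API).

    step_fn(prefix) -> 1-D array of log-probabilities over the vocabulary
    for the next token. Returns the highest-scoring token sequence,
    including the SOS and (if reached) EOS tokens."""
    # Each hypothesis is (tokens, cumulative log-prob, finished flag).
    beams = [([sos_id], 0.0, False)]
    for _ in range(max_len):
        candidates = []
        for tokens, score, done in beams:
            if done:
                # Finished hypotheses carry over unchanged.
                candidates.append((tokens, score, True))
                continue
            log_probs = step_fn(tokens)
            # Expand only the top beam_size continuations of this beam.
            for tok in np.argsort(log_probs)[-beam_size:]:
                candidates.append((tokens + [int(tok)],
                                   score + float(log_probs[tok]),
                                   int(tok) == eos_id))
        # Prune to the globally best beam_size hypotheses.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
        if all(done for _, _, done in beams):
            break
    return beams[0][0]
```

Note that this sketch ranks hypotheses by raw cumulative log-probability; the original paper additionally applies a length penalty (alpha = 0.6 in the base setup), which can noticeably affect BLEU and may account for part of the gap reported above.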

yaoyiran commented 5 years ago

Hi, I am also trying to add the beam search function. Would you mind sharing how you implemented the beam search part?

ZhichaoOuyang commented 4 years ago

Hello! Would you mind sharing how you implemented the beam search part? I want to learn it.
