Closed dguo98 closed 4 years ago
Thank you for your interest. I recently uploaded the training and evaluation logs for IWSLT14 De-En so you can check your reproduction against them. The expected valid ppl is 4.6+. The latest version of this GitHub repository gets 4.6+ valid ppl, but the BLEU score is not always the same; we will list the original environment settings later.
Thanks!! I'll try it out! @zhaoguangxiang
For WMT14, how exactly did you use "compound splitting"?
Yes, we use compound splitting for WMT14 En-De.
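For readers unfamiliar with the term: the thread does not say which script was used, so the following is only a sketch of the convention commonly applied before scoring WMT14 En-De BLEU, where a hyphen between two non-space characters is rewritten as a separate `##AT##-##AT##` token. The function name `split_compounds` is hypothetical, and the assumption is that both the hypotheses and the references are already tokenized and are processed the same way.

```python
import re

# A hyphen with a non-space character on each side, e.g. "well-known".
_HYPHEN = re.compile(r"(\S)-(\S)")


def split_compounds(line: str) -> str:
    """Rewrite intra-word hyphens as a standalone ##AT##-##AT## token.

    Assumption: this mirrors the common WMT14 En-De evaluation convention;
    it is not confirmed to be the exact script used in this repository.
    """
    return _HYPHEN.sub(r"\1 ##AT##-##AT## \2", line)


if __name__ == "__main__":
    print(split_compounds("a well-known result"))
```

Applying the same transformation to both the system output and the reference matters, because BLEU is computed over tokens and a mismatch in hyphen handling alone can shift the score.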
Hi there, thanks so much for the great work! I'm currently trying to reproduce the IWSLT14 De-En (Prime model) results on a single P100 GPU. I followed the exact script at https://github.com/lancopku/Prime/blob/master/examples/parallel_intersected_multi-scale_attention(Prime)/README.md. However, I'm unable to reproduce the results. It gave me 100+ perplexity after training finished, and the BLEU score is below 30.
Do you have any suggestions? What is the expected perplexity / curve?
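To put the two perplexity figures in this thread on the same scale as a training log, here is a minimal sketch of the relation between perplexity and the average per-token loss. It assumes the logged loss is a mean negative log-likelihood in base 2 (as fairseq-style logs typically report it); the helper names are hypothetical.

```python
import math


def perplexity(mean_nll: float, base: float = 2.0) -> float:
    """Perplexity from a mean per-token negative log-likelihood.

    Assumption: `mean_nll` is measured in the given log base
    (2.0 for bits, math.e for nats).
    """
    return base ** mean_nll


def loss_for_perplexity(ppl: float, base: float = 2.0) -> float:
    """Inverse of `perplexity`: the mean loss that yields a given perplexity."""
    return math.log(ppl, base)


if __name__ == "__main__":
    # Compare the expected value (4.6) with the one reported above (100).
    for ppl in (4.6, 100.0):
        print(f"ppl {ppl:>6} <-> loss {loss_for_perplexity(ppl):.2f} bits")
```

Under this assumption, a valid ppl of 4.6 corresponds to a loss of roughly 2.2 bits per token, while 100 corresponds to roughly 6.6, so comparing the loss column of your log against the uploaded one should show early on whether a run is diverging from the reference.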