marian-nmt / marian-examples

Examples, tutorials and use cases for Marian, including our WMT-2017/18 baselines.

Can transformer model reproduce WMT14 English-German BLEU score? #9

Closed. SkyAndCloud closed this issue 5 years ago.

SkyAndCloud commented 6 years ago

Hi, thank you for the great work and excellent documentation. After reading your Transformer example, which uses the WMT-2017 English-German corpus, I have a question: have you tested Marian's performance on the WMT-2014 English-German corpus, and does it achieve a BLEU score equivalent to the one reported in the original Transformer paper? I think this point is very important, because only this would prove that your Transformer implementation is correct, and it also matters for research use. Thanks!

emjotde commented 6 years ago

Sorry for missing this issue. Surely there are other ways we can prove that our implementation is correct, for instance by winning the WMT2018 shared task on news translation for English-German:

https://arxiv.org/abs/1809.00196

SkyAndCloud commented 6 years ago

Well, I have read your paper and here are my questions:

  1. I think the Transformer example you provided corresponds to the Transformer base model, while you used the big model for WMT-2018. Could you share the training config you used for the big model?
  2. I'm new to neural language modeling and was confused after reading the FAQ about training an lm-transformer on a monolingual corpus. Could you provide an example of training a Transformer-style language model? Just the config is fine. Thanks for your great work!

snukky commented 5 years ago

  1. A config file for training Transformer Big has been mentioned here: https://github.com/marian-nmt/marian-dev/issues/298#issuecomment-420231508
  2. A toy example for training Transformer-based LM is in our regression tests: https://github.com/marian-nmt/marian-regression-tests/blob/master/tests/training/lm/test_lm-transformer.sh
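For readers landing here later, the two answers above can be sketched as Marian command lines. This is a hedged sketch, not the authors' exact configuration: the Transformer Big dimensions below follow the values commonly reported for that architecture (model dim 1024, FFN dim 4096, 16 heads), the `lm-transformer` model type is the one used in the linked regression test, and all paths, filenames, and data sets are placeholders.

```shell
# Hypothetical Transformer Big training invocation (sketch only; see the
# linked marian-dev issue for the actual config). Paths are placeholders.
/path/to/marian \
    --type transformer \
    --model model/model.npz \
    --train-sets corpus.bpe.en corpus.bpe.de \
    --vocabs vocab.ende.yml vocab.ende.yml \
    --enc-depth 6 --dec-depth 6 \
    --dim-emb 1024 \
    --transformer-dim-ffn 4096 \
    --transformer-heads 16 \
    --tied-embeddings-all \
    --label-smoothing 0.1

# Hypothetical Transformer LM training: same binary, a single monolingual
# training set and vocabulary, and the lm-transformer model type.
/path/to/marian \
    --type lm-transformer \
    --model model-lm/model.npz \
    --train-sets corpus.bpe.de \
    --vocabs vocab.de.yml
```

Consult `marian --help` for the authoritative list of options and their defaults; the regression test linked above shows a complete, working LM setup end to end.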

SkyAndCloud commented 5 years ago

Thanks a lot.