facebookresearch / mlqe

We release a dataset based on Wikipedia sentences and the corresponding translations in 6 different languages, along with scores (scale 1 to 100) generated through human evaluations that represent the quality of the translations. Paper: Unsupervised Quality Estimation for Neural Machine Translation
Creative Commons Attribution Share Alike 4.0 International

Questions about the training details of given NMT models #9

Closed Qbop981001 closed 2 years ago

Qbop981001 commented 2 years ago

Hello, thank you for providing the NMT models!

Could you tell me which specific datasets the NMT models were trained on?

The official findings paper only says "Translations were produced with state-of-the-art transformer-based NMT models trained using publicly available data", but it does not name the specific datasets.

For example, is the en-de model trained on WMT14 en-de or WMT19 en-de? It would be appreciated if you could offer more details about the training data, thanks!

Qbop981001 commented 2 years ago

Ah... Silly me. I found the links to the specific datasets in the README.md. Sorry for the bother!