facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MIT License

Suggestions on training on English to Vietnamese translation #458

Closed sugeeth14 closed 5 years ago

sugeeth14 commented 5 years ago

Hi, I am working on English-to-Vietnamese translation using the IWSLT data from Stanford NLP, which has 133K sentence pairs. I want to replicate the results presented here (https://github.com/tensorflow/tensor2tensor/pull/611), where the Transformer base architecture is used. I have a few quick questions.

I know the questions are naive, but I think the answers would also help new users who are trying Vietnamese. I am working on a single-GPU setup. Please give some suggestions or leads for this kind of task and dataset. Thanks.

edunov commented 5 years ago

Hi @Raghava14

Unfortunately I can't give exact answers to your questions without proper experimentation, but here are some thoughts:

Hope that helps...

sugeeth14 commented 5 years ago

Hi @edunov, first of all, thanks a lot for the suggestions. I tried the following architectures for English to Vietnamese and got the following results:

Architecture BLEU
IWSLT DE_EN 31
Transformer_base 26

I think this is better than the result claimed here (https://github.com/tensorflow/tensor2tensor/pull/611), so I am closing the issue now. As a next step, I want to see whether I can improve the BLEU score using the approach from Understanding Back-Translation at Scale. However, unlike the monolingual corpus you obtained from WMT18, there are not many monolingual resources available for Vietnamese. I am trying to use a Wikipedia dump, but I am not sure how well it will work. If you have any suggestions or thoughts, please share them so that I can see whether BLEU can be improved. Thanks again for the help.
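
For readers new to the idea, the data flow of back-translation for this direction can be sketched as follows: monolingual Vietnamese (target-side) sentences are translated into English by a reverse model, and the resulting synthetic pairs are added to the real parallel data. This is only a minimal sketch under stated assumptions. `back_translate` and `toy_vi_to_en` are hypothetical names, the toy "translator" is a placeholder rather than a trained model, and none of this is fairseq's own implementation.

```python
from typing import Callable, List, Tuple


def back_translate(
    mono_target: List[str],
    reverse_translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Pair each monolingual target sentence with a synthetic source sentence."""
    pairs = []
    for tgt in mono_target:
        tgt = tgt.strip()
        if not tgt:
            # Skip empty lines, which are common in raw dumps.
            continue
        synthetic_src = reverse_translate(tgt)
        pairs.append((synthetic_src, tgt))
    return pairs


def toy_vi_to_en(sentence: str) -> str:
    # Hypothetical stand-in for a trained Vietnamese-to-English model.
    # It only tags the input so the data flow is visible; it does NOT translate.
    return "<en> " + sentence


# Toy data, made up for illustration only.
real_pairs = [("hello", "xin chào")]
mono_vi = ["xin chào", "", "cảm ơn"]

synthetic_pairs = back_translate(mono_vi, toy_vi_to_en)
augmented = real_pairs + synthetic_pairs  # real bitext plus synthetic pairs
```

In practice the placeholder would be replaced by a Vietnamese-to-English model trained on the same parallel data, and how much synthetic data to mix in is something to determine by experiment.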

jiachangliu commented 5 years ago

Hi @Raghava14 ,

I tried the IWSLT_DE_EN architecture for English-to-Vietnamese translation and was able to reach a BLEU score of 26.86. Did you use a joined dictionary or separate dictionaries?

sugeeth14 commented 5 years ago

Hi @jiachangliu, I used --joined-dictionary for English-to-Vietnamese translation.
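
To make the distinction concrete: with --joined-dictionary, preprocessing builds one dictionary shared by the source and target languages instead of one per language. The sketch below illustrates the idea on toy data with simple whitespace tokenization. `build_vocab` is a hypothetical helper, the sentences are made up, and this is not how fairseq builds its dictionaries internally.

```python
from collections import Counter


def build_vocab(sentences):
    """Map each whitespace token to an integer id, most frequent first."""
    counts = Counter(tok for s in sentences for tok in s.split())
    # Sort by descending frequency, then alphabetically, so ids are deterministic.
    ordered = sorted(counts, key=lambda tok: (-counts[tok], tok))
    return {tok: i for i, tok in enumerate(ordered)}


# Toy sentences, made up for illustration only.
en = ["i live in hanoi", "hanoi is large"]
vi = ["tôi sống ở hanoi", "hanoi rất lớn"]

# Separate dictionaries: one per language.
src_vocab = build_vocab(en)
tgt_vocab = build_vocab(vi)

# Joined dictionary: a single vocabulary over both languages.
joined_vocab = build_vocab(en + vi)

# Tokens that occur on both sides get one shared id in the joined dictionary.
shared = set(src_vocab) & set(tgt_vocab)
```

A joined dictionary gives tokens that appear in both languages (names, numbers, shared subwords) a single id, at the cost of one larger vocabulary. Whether that helps for English and Vietnamese is an empirical question.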

jiachangliu commented 5 years ago

@Raghava14 Thank you very much. I used separate dictionaries. I will try --joined-dictionary to see if I can get better results.