soumendrak / MTEnglish2Odia

Machine Translation from English to Odia language.
https://mte2o.com
GNU General Public License v3.0

Finalize the input corpus pairs #9

Closed: soumendrak closed this issue 4 years ago

soumendrak commented 4 years ago

The input English-Odia pairs need to be finalized.
We need to decide how many pairs to start with.

We should go ahead with 5k curated high-quality pairs.
It's fine if the pairs are sentences, phrases or words.

The pairs need to be retrieved from all the Individual files and the Combined file.

The final dataset will be added into this repository.
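
As a rough illustration of the retrieval step, a minimal Python sketch might look like the following. It assumes each source file is UTF-8 text with one pair per line, English and Odia separated by a tab; the actual file layout in this repository may differ. The names `load_pairs` and `merge_pairs` are hypothetical and are not taken from the project's code.

```python
import csv


def load_pairs(path):
    """Read English/Odia pairs from one tab-separated file (assumed format)."""
    pairs = []
    with open(path, encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            # Skip blank or malformed lines that lack both columns.
            if len(row) < 2:
                continue
            english, odia = row[0].strip(), row[1].strip()
            if english and odia:
                pairs.append((english, odia))
    return pairs


def merge_pairs(sources):
    """Merge several pair lists, dropping exact duplicates.

    The first occurrence of each pair is kept, so the original order
    of the sources is preserved.
    """
    seen = set()
    merged = []
    for pairs in sources:
        for pair in pairs:
            if pair not in seen:
                seen.add(pair)
                merged.append(pair)
    return merged
```

Calling `merge_pairs` on the lists loaded from the individual files and the combined file would give one deduplicated list, from which the 5k high-quality pairs could then be curated by hand.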

soumendrak commented 4 years ago

The Mozilla Pontoon English-Odia pairs have been added to the consolidated file.

The parallel pair count has increased to 11,805.

a-parida12 commented 4 years ago

Is anyone working on this? Do you think I can work on this?

soumendrak commented 4 years ago

@a-parida12 this one is completed. You can see this folder for details: https://github.com/soumendrak/MTEnglish2Odia/tree/master/data/output/organised

Thanks for reaching out. You may look into the other issues or help prepare step-by-step instructions for possible projects. Thank you. CC: @subhadarship