SwordYork / DCNMT

Deep Character-Level Neural Machine Translation
GNU General Public License v3.0

FR-EN Datasets #12

Closed by kadir-gunel 7 years ago

kadir-gunel commented 7 years ago

Hello @SwordYork,

Could you please share the data for training the EN-FR translation system?

Thanks in advance, Kadir

SwordYork commented 7 years ago

You could download it from WMT. Many language pairs are available there.

kadir-gunel commented 7 years ago

I checked the site. However, your GitHub page references another paper for the dataset, and the authors of that paper did something unusual: they first combined multiple corpora and then took a subset of the combined set, so I cannot replicate the same data.

SwordYork commented 7 years ago

I see, it was preprocessed by Schwenk. You could check it.

SwordYork commented 7 years ago

By the way, you could email me for convenience.

kadir-gunel commented 7 years ago

Ok. Thanks.