OpenNMT / OpenNMT-tf

Neural machine translation and sequence learning using TensorFlow
https://opennmt.net/
MIT License
1.45k stars 392 forks

Specific preparation for Arabic corpus #354

Closed M-HaDyS closed 5 years ago

M-HaDyS commented 5 years ago

Dear all,

Is there any specific preparation I should perform before training on and translating an Arabic corpus?

Thanks in advance.

guillaumekln commented 5 years ago

You could try tokenizing your corpus with SentencePiece, which is language-independent. All instructions are on the GitHub page:

https://github.com/google/sentencepiece

mohammedayub44 commented 5 years ago

@M-HaDyS Did you manage to run Arabic <-> English experiments with this repo? I'm planning to do the same. Do you have any suggestions to share?

Thanks!

Raamyy commented 4 years ago

@mohammedayub44 I also need to use OpenNMT to translate into Arabic. Did you try it? :D