MastafaF / multilingual_similarity_compare

Other
5 stars 0 forks source link

Arabic and Urdu #9

Closed ghost closed 4 years ago

ghost commented 4 years ago

@MastafaF which machine translation code do you recommend for translating between arabic and urdu

MastafaF commented 4 years ago

Hey @deepseek ,

For pure machine translation, those multilingual architectures are not really useful as in a machine translation scenario, we would need the decoder part as well. In our use case, we focus on the encoding part in a way that a sentence in arabic and its translation in urdu would have the same encoding in the semantic space. The quality of such encoding is what is evaluated in this repo. Additional efforts have been done recently around languages such as Urdu with XLM-R.

What is your exact scenario? :)

N.B: Sorry I am currently having some time off so I would answer with some delay in the future.

ghost commented 4 years ago

Take this example, i have multiple books of the same author in Arabic and Urdu. A new book is issued in Arabic, I want to translate it to Urdu. i am sure that there must be a github repository that can translate such task.

MastafaF commented 4 years ago

Hey @deepseek

Yes there are but you should look at different architectures than BERT-like architectures which essentially just implements the encoding part.

Did you check with GPT-3 for example?