22-hours / cabrita

Finetuning InstructLLaMA with Portuguese data
Apache License 2.0

Translation from English + Finetuning vs. original LLaMA quality #1

Closed by C00reNUT 1 year ago

C00reNUT commented 1 year ago

Hello,

Really cool idea. Sadly, I do not speak Portuguese, so I cannot evaluate the quality. Do you feel that it is more consistent than using the pure LLaMA model?

May I ask which translation service you used? DeepL or something else?

bui-thanh-lam commented 1 year ago

It is said that they used a paid plan of OpenAI's ChatGPT for the translation.

pedrogengo commented 1 year ago

We used ChatGPT to translate in this first version. We didn't do a deeper evaluation, but after some tests we found the results okay. When we compared it with LLaMA, we noticed that LLaMA sometimes makes typos and, in these terms, our model is more consistent at returning Portuguese words. But again, we need to perform more tests :)

C00reNUT commented 1 year ago

OK, thank you for letting me know. I need to check whether there are public benchmarks on DeepL translation; from what I know, it seems to be superior among the free services. At the very least, it would be interesting to compare https://github.com/OpenNMT/CTranslate2 with GPT-3/ChatGPT translation quality.