Closed FrankChu0229 closed 1 year ago
According to this article, no:
https://aibusiness.com/meta/meta-s-llama-language-model-outperforms-openai-s-gpt-3
And LLaMA was not just built using solely English text. Meta trained its model using 20 languages that use Latin or Cyrillic scripts. However, most of the training data is in English so model performance for it is better.
I checked the tokenizer, and there are only about 700 Chinese characters in the vocabulary. So support is limited indeed.
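For anyone who wants to repeat that check, here is a minimal sketch of one way to count such entries. It is not necessarily how the count above was obtained. It assumes the tokenizer is a SentencePiece model file, that the third-party `sentencepiece` package is installed, and that "Chinese character" means a single code point in the CJK Unified Ideographs block (U+4E00 to U+9FFF), which Japanese kanji share. The helper names and the `tokenizer.model` path are hypothetical.

```python
from typing import Iterable, List


def is_cjk_ideograph(text: str) -> bool:
    """True if text is exactly one code point in U+4E00..U+9FFF."""
    return len(text) == 1 and 0x4E00 <= ord(text) <= 0x9FFF


def count_cjk_pieces(pieces: Iterable[str]) -> int:
    """Count vocabulary pieces that are a single CJK ideograph."""
    count = 0
    for piece in pieces:
        # SentencePiece marks word boundaries with U+2581; drop it first.
        text = piece.replace("\u2581", "")
        if is_cjk_ideograph(text):
            count += 1
    return count


def load_pieces(model_path: str) -> List[str]:
    """Read every piece from a SentencePiece model (path is an assumption)."""
    import sentencepiece as spm  # third-party, imported only when needed

    sp = spm.SentencePieceProcessor(model_file=model_path)
    return [sp.id_to_piece(i) for i in range(sp.get_piece_size())]


# Hypothetical usage, assuming the model file sits next to the script:
# print(count_cjk_pieces(load_pieces("tokenizer.model")))
```

Characters that are not in the vocabulary are typically split into raw byte pieces rather than rejected, so a low count suggests Chinese text is encoded inefficiently, not that it cannot be encoded at all.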
7B model
Prompt: 这首歌使我想起了我年轻的时候。
Output: “牙齿” 很有趣的名字。 The teeth have got you. The teeth have got you in their jaws. The teeth have got you in their teeth. The teeth have got you in their mouths. The teeth have got you in their claws. The teeth have got you in their nails. The teeth have got you in their fingernails. The teeth have got you in their tongues. The teeth have got you in their eyes. The teeth have got you in their ears. The teeth have got you in their noses. The teeth have got you in their nostrils. The teeth have got you in their lips. The teeth have got you in their mouths. The teeth have got you in their claws. The teeth have got you in their fingers. The teeth have got you in their fingernails. The teeth have got you in their toes. The teeth have got you in their soles. The teeth have got you in their heels. The teeth have got you in their feet. The teeth have got you in their kne
==================================
Summary: Doesn't support Chinese. The input prompt was "This song reminds me of my youth" and the output was "'Teeth' is a very interesting name", followed by repetitive English statements about teeth.
Also tried Japanese for the heck of it; it doesn't support Japanese either.
Prompt: 私の記憶は広告写真みたいになかしく通り過ぎてゆく。 (roughly: "My memories pass by nostalgically, like advertising photographs.")
Output: The Oatmeal | 30 Nov 2016
Hey, we've elaborated on this in the FAQ: https://github.com/facebookresearch/llama/blob/main/FAQ.md#2-generations-are-bad
I want to know whether LLaMA supports Chinese. I can't run the model on my machine right now; does anybody know?