meta-llama / llama

Inference code for Llama models

I want to know if LLaMA supports Chinese #58

Closed FrankChu0229 closed 1 year ago

FrankChu0229 commented 1 year ago

I want to know whether LLaMA supports Chinese. I can't run the model on my machine right now; does anybody know?

wupgop commented 1 year ago

According to this article, no:

https://aibusiness.com/meta/meta-s-llama-language-model-outperforms-openai-s-gpt-3

And LLaMA was not just built using solely English text. Meta trained its model using 20 languages that use Latin or Cyrillic scripts. However, most of the training data is in English so model performance for it is better.

garyfanhku commented 1 year ago

I checked the tokenizer, and there are roughly 700 Chinese characters in the vocabulary. So support is indeed limited.
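For anyone who wants to reproduce a count like this, here is a minimal sketch of one way to do it. It is not necessarily how the number above was obtained. It assumes the vocabulary has already been dumped to a list of piece strings (for example with the `sentencepiece` Python package, which is an assumption about your setup), and it treats a piece as "Chinese" if it contains a code point in the CJK Unified Ideographs block (U+4E00 to U+9FFF). That block is shared with Japanese kanji, so the result is really a count of Han characters. The helper names and the toy vocabulary are hypothetical.

```python
def is_cjk_ideograph(ch: str) -> bool:
    """Return True if ch is in the CJK Unified Ideographs block (U+4E00..U+9FFF)."""
    return "\u4e00" <= ch <= "\u9fff"


def count_cjk_pieces(vocab) -> int:
    """Count vocabulary pieces that contain at least one CJK ideograph."""
    return sum(
        1 for piece in vocab
        if any(is_cjk_ideograph(ch) for ch in piece)
    )


# Toy vocabulary for illustration only; this is NOT the real LLaMA vocabulary.
toy_vocab = ["▁the", "▁teeth", "中", "文", "▁歌", "<0xE4>", "ing"]
print(count_cjk_pieces(toy_vocab))  # prints 3
```

To run it on the real tokenizer, replace `toy_vocab` with the list of pieces exported from the model file.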

wupgop commented 1 year ago

7B model

Prompt: 这首歌使我想起了我年轻的时候。

Output: “牙齿” 很有趣的名字。 The teeth have got you. The teeth have got you in their jaws. The teeth have got you in their teeth. The teeth have got you in their mouths. The teeth have got you in their claws. The teeth have got you in their nails. The teeth have got you in their fingernails. The teeth have got you in their tongues. The teeth have got you in their eyes. The teeth have got you in their ears. The teeth have got you in their noses. The teeth have got you in their nostrils. The teeth have got you in their lips. The teeth have got you in their mouths. The teeth have got you in their claws. The teeth have got you in their fingers. The teeth have got you in their fingernails. The teeth have got you in their toes. The teeth have got you in their soles. The teeth have got you in their heels. The teeth have got you in their feet. The teeth have got you in their kne

==================================

Summary: It doesn't support Chinese. The input prompt was "This song reminds me of my youth" and the output was "Teeth is an interesting name", followed by English statements about teeth.

wupgop commented 1 year ago

I also tried out Japanese for the heck of it. It doesn't support Japanese either.

Prompt: 私の記憶は広告写真みたいになかしく通り過ぎてゆく。 (roughly: "My memories pass by nostalgically, like advertising photographs.")

Output: The Oatmeal | 30 Nov 2016

  1. Your memory isn’t actually a perfect recording device. Your brain is great at filling in the gaps of what you don’t remember. You know, like when you remember a dream where you’re on a boat with a bunch of aliens who are going to eat you but then they turn out to be friendly and you’re best friends with them and that’s why you don’t remember the parts where they were going to eat you. But you also have a memory of what you ate for breakfast that morning, and that’s because your brain is filling in the gaps.
  2. We have a hard time remembering bad things that happened to us. And this is another way your memory fills in the gaps: If something bad happened to you, you might not remember the exact details of it. You might remember that it was bad, but you might not be able to remember what happened, which might lead you to forget it happened altogether.
  3. We tend to remember events in our lives in chronological order, but we don’t remember them that way when we’re recalling them. When you

timlacroix commented 1 year ago

Hey, we've elaborated on this in the FAQ: https://github.com/facebookresearch/llama/blob/main/FAQ.md#2-generations-are-bad