BaohaoLiao opened this issue 1 year ago
Hi, @BaohaoLiao. Thanks for your question!
LLaMA uses Wikipedia in its pre-training data, covering 20 languages: bg, ca, cs, da, de, en, es, fr, hr, hu, it, nl, pl, pt, ro, ru, sl, sr, sv, uk. This information can be found in the Wikipedia section of the LLaMA paper.
Hope it can help you!
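For quick reference, here is a small sketch expanding those 20 codes into language names. The names are my reading of the ISO 639-1 codes, not quoted from the paper, and `is_llama_wiki_lang` is just a hypothetical helper for filtering data by these codes:

```python
# The 20 Wikipedia language codes cited from the LLaMA paper,
# mapped to language names (ISO 639-1 readings; an assumption, not from the paper).
LLAMA_WIKI_LANGS = {
    "bg": "Bulgarian", "ca": "Catalan", "cs": "Czech", "da": "Danish",
    "de": "German", "en": "English", "es": "Spanish", "fr": "French",
    "hr": "Croatian", "hu": "Hungarian", "it": "Italian", "nl": "Dutch",
    "pl": "Polish", "pt": "Portuguese", "ro": "Romanian", "ru": "Russian",
    "sl": "Slovenian", "sr": "Serbian", "sv": "Swedish", "uk": "Ukrainian",
}

def is_llama_wiki_lang(code: str) -> bool:
    """Return True if `code` is one of the 20 Wikipedia languages listed above."""
    return code.lower() in LLAMA_WIKI_LANGS

print(len(LLAMA_WIKI_LANGS))   # 20
print(is_llama_wiki_lang("de"))  # True
print(is_llama_wiki_lang("ja"))  # False
```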
Hi,
I have a question about the paper. In the abstract, you state that "LLaMA ... covers only 20 languages". May I ask where this number comes from, and which 20 languages they are?