openlm-research / open_llama

OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
Apache License 2.0

LLaMA 3B configuration #49

Closed: LamOne1 closed this issue 1 year ago

LamOne1 commented 1 year ago

Hello,

First, I'd like to thank you for providing these weights! Amazing work!

I'd like to know the configuration of the LLaMA 3B model, specifically the number of (1) layers, (2) attention heads, and (3) the hidden dimension.

young-geng commented 1 year ago

https://huggingface.co/openlm-research/open_llama_3b/blob/main/config.json
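As a convenience, here is a minimal sketch (not from this thread) of reading those fields with the Hugging Face transformers library; the field names follow the standard LlamaConfig and the values come from the linked config.json:

```python
# Sketch: print the OpenLLaMA 3B configuration fields asked about above.
# Assumes the `transformers` package is installed and the model id is
# openlm-research/open_llama_3b (the repo behind the linked config.json).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("openlm-research/open_llama_3b")

print("layers:        ", config.num_hidden_layers)    # number of transformer blocks
print("heads:         ", config.num_attention_heads)  # attention heads per layer
print("hidden dim:    ", config.hidden_size)          # model (embedding) dimension
print("mlp hidden dim:", config.intermediate_size)    # feed-forward inner dimension
```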

LamOne1 commented 1 year ago

Hi @young-geng, I created the architecture using lit-llama by Lightning, but there was a problem: the dimension of the produced layer "mlp.gate_proj.weight" (or "mlp.c_fc1.weight") is 8704, while in the checkpoint it's 8640.

I appreciate your help.

young-geng commented 1 year ago

The 3B size is not a standard LLaMA size, so different libraries have the freedom to define their own architectures.
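As an illustration of where the 8704 vs. 8640 mismatch can come from (my own reading, not an official explanation): the original LLaMA code derives the feed-forward hidden size from the model dimension by taking roughly 8/3 of it and rounding up to a multiple of 256, and lit-llama appears to follow the same recipe, whereas the OpenLLaMA 3B config.json specifies intermediate_size directly. A small sketch of that formula:

```python
# Sketch (assumption): feed-forward hidden size as derived in the original
# LLaMA code, which libraries like lit-llama recompute from the model dim
# instead of reading it from the checkpoint's config.

def llama_ffn_hidden_size(dim: int, multiple_of: int = 256) -> int:
    """Roughly 8/3 * dim, rounded up to a multiple of `multiple_of`."""
    hidden = int(2 * (4 * dim) / 3)                                   # ~8/3 of dim
    return multiple_of * ((hidden + multiple_of - 1) // multiple_of)  # round up

print(llama_ffn_hidden_size(3200))  # 8704 for a 3200-wide model
# The OpenLLaMA 3B checkpoint instead uses intermediate_size = 8640 from its
# config.json, so the recomputed gate/up projection shapes do not match it.
```

If that is the cause, the likely fix on the lit-llama side is to set the MLP hidden size explicitly to the checkpoint's intermediate_size (8640) when instantiating the 3B architecture, rather than letting it be recomputed from the model dimension.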