openlm-research / open_llama

OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
Apache License 2.0
7.27k stars 370 forks

Question on weights #56

Closed siddhsql closed 1 year ago

siddhsql commented 1 year ago

Hello - thanks for this library. Great work!

I am trying to understand how to interpret the OpenLLaMA weights. As I understand it, LLaMA is two things:

1. the neural network architecture (and its implementation), and
2. the trained weights.

What the OpenLLaMA project did was take the original LLaMA paper, develop its own implementation of the neural network (#1 above), and then train that network on the RedPajama dataset. Is this correct?

Why not train the network on the same dataset that was used to train the original LLaMA? Then, theoretically speaking, you would get the same weights as the original LLaMA.
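As a side note, the architecture-versus-weights distinction above can be illustrated with a tiny, purely hypothetical sketch (this is not OpenLLaMA code): the same model definition, initialized or trained differently, yields different parameter values.

```python
import random

def init_weights(seed: int, n: int = 8) -> list[float]:
    """Build a weight vector for a fixed 'architecture' of size n.

    The architecture (the shape and the code) is identical on every call;
    only the seed differs, standing in for different initialization or
    different training data.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.02) for _ in range(n)]

w_a = init_weights(0)  # "original" weights
w_b = init_weights(1)  # weights from a different seed / dataset

assert len(w_a) == len(w_b)  # same architecture: identical shape
assert w_a != w_b            # different weights: different values
```

The same logic applies at LLaMA scale: reusing the architecture does not reproduce the weights unless the data (and, in practice, the whole training setup) is reproduced too.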

snichols commented 1 year ago

Replicating the exact dataset from the description in the LLaMA paper isn't possible unless you can reproduce precisely what they did. See section 2 of the paper for details: https://arxiv.org/pdf/2302.13971.pdf

young-geng commented 1 year ago

The original LLaMA dataset was not released, so we cannot use it. We chose the RedPajama dataset, which is a reproduction of the LLaMA dataset based on the details in the paper.