openlm-research / open_llama

OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
Apache License 2.0
7.27k stars 370 forks

Question on weights #56

Closed siddhsql closed 1 year ago

siddhsql commented 1 year ago

Hello - thanks for this library. Great work!

I am trying to understand how to interpret the OpenLLaMA weights. As I understand it, LLaMA is two things:

1. the neural network architecture (and its implementation), and
2. the trained weights.

What the OpenLLaMA project did was take the original LLaMA paper, develop its own implementation of the neural network (#1 above), and then train that network on the RedPajama dataset. Is this correct?

Why not train the network on the same dataset that was used to train the original LLaMA? Then, theoretically speaking, you would get the same weights as the original LLaMA.
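As a side note, the architecture-versus-weights distinction above can be illustrated with a tiny, purely hypothetical sketch (this is not OpenLLaMA code): the same model definition, initialized or trained differently, yields different parameter values.

```python
import random

def init_weights(seed: int, n: int = 8) -> list[float]:
    """Build a weight vector for a fixed 'architecture' of size n.

    The architecture (the shape and the code) is identical on every call;
    only the seed differs, standing in for different initialization or
    different training data.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.02) for _ in range(n)]

w_a = init_weights(0)  # "original" weights
w_b = init_weights(1)  # weights from a different seed / dataset

assert len(w_a) == len(w_b)  # same architecture: identical shape
assert w_a != w_b            # different weights: different values
```

The same logic applies at LLaMA scale: reusing the architecture does not reproduce the weights unless the data (and, in practice, the whole training setup) is reproduced too.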

snichols commented 1 year ago

Replicating the exact dataset from the description in the LLaMA paper isn't possible unless you can reproduce precisely what they did. See section 2 of the paper for details: https://arxiv.org/pdf/2302.13971.pdf

young-geng commented 1 year ago

The original LLaMA dataset was not released, so we cannot use it. We chose the RedPajama dataset, which is a reproduction of the LLaMA dataset based on the details in the paper.