Closed by siddhsql 1 year ago
Replicating the exact dataset from the description in the LLaMA paper isn't possible unless you can reproduce every data collection and filtering step exactly as the authors performed it. See Section 2 of the paper for details: https://arxiv.org/pdf/2302.13971.pdf
The original LLaMA dataset was not released, so we cannot use it. We chose the RedPajama dataset, which is a reproduction of the LLaMA dataset that follows the details given in the paper.
Hello - thanks for this library. Great work!
I am trying to understand how to interpret the Open LLAMA weights. As I understand it, LLAMA is two things:
What the Open LLAMA project did was take the original LLAMA paper, develop its own implementation of the NN (#1 above), and then train this NN on the RedPajama dataset. Is this correct?
Why not train the NN on the same dataset that was used to train the original LLAMA? Then you would get the same weights as the original LLAMA (theoretically speaking).
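To make the "theoretically speaking" part concrete, here is a toy sketch (not the Open LLAMA training code) of why identical weights would require identical data as well as identical initialization and sampling order. The `train` function, the linear model, and both toy datasets are hypothetical stand-ins chosen only for illustration.

```python
import random

def train(data, seed, steps=200, lr=0.05):
    """Fit y = w*x + b with plain SGD; the seed fixes both init and sampling order."""
    rng = random.Random(seed)
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    for _ in range(steps):
        x, y = rng.choice(data)   # sample one training example
        err = (w * x + b) - y     # prediction error on that example
        w -= lr * err * x         # gradient step for w
        b -= lr * err             # gradient step for b
    return w, b

# Hypothetical "original" data, and a "reproduction" that follows the same
# recipe but differs slightly in content.
original = [(x / 10, 2 * (x / 10) + 1) for x in range(10)]
reproduction = [(x, y + 0.01) for x, y in original]

same_run = train(original, seed=0) == train(original, seed=0)
other_seed = train(original, seed=0) == train(original, seed=1)
other_data = train(original, seed=0) == train(reproduction, seed=0)
print(same_run, other_seed, other_data)
```

In this sketch only the first comparison should hold: changing either the seed or the data gives different weights, even though both datasets describe nearly the same underlying relationship. A reproduced dataset can therefore be expected to yield a comparable model, not bit-identical weights.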