Open rick2047 opened 1 year ago
I was going through the README and noticed here that this model performs better than the 7B LLaMA on many benchmarks, even though it's trained on a fifth of the tokens (200B vs 1T). Does anyone understand how this happened?
Probably GIGO (garbage in, garbage out): the two models are trained on different datasets, so token count alone isn't the whole story.