Closed: xiaoyunwu closed this issue 11 months ago
What is the commonsense score with 1T tokens?
Model | Pretrain Tokens | HellaSwag | Obqa | WinoGrande | ARC_c | ARC_e | boolq | piqa | avg |
---|---|---|---|---|---|---|---|---|---|
Pythia-1.0B | 300B | 47.16 | 31.40 | 53.43 | 27.05 | 48.99 | 60.83 | 69.21 | 48.30 |
TinyLlama-1.1B-intermediate-step-50K-104b | 103B | 43.50 | 29.80 | 53.28 | 24.32 | 44.91 | 59.66 | 67.30 | 46.11 |
TinyLlama-1.1B-intermediate-step-240k-503b | 503B | 49.56 | 31.40 | 55.80 | 26.54 | 48.32 | 56.91 | 69.42 | 48.28 |
TinyLlama-1.1B-Chat-v0.1 | 503B | 53.81 | 32.20 | 55.01 | 28.67 | 49.62 | 58.04 | 69.64 | 49.57 |
TinyLlama-1.1B-intermediate-step-480k-1007B | 1007B | 52.54 | 33.40 | 55.96 | 27.82 | 52.36 | 59.54 | 69.91 | 50.22 |
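As a sanity check, the avg column is just the plain mean of the seven task scores. For the 1007B-token row:

```python
# Mean of the seven benchmark scores for the 1007B checkpoint row.
scores = [52.54, 33.40, 55.96, 27.82, 52.36, 59.54, 69.91]
avg = sum(scores) / len(scores)
print(round(avg, 2))  # 50.22, matching the avg column
```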
I will update the repo later
Wow, so Chinchilla scaling needs help! Other than that though, every benchmark went up, congrats!
Also, could you remove the chat model row? It makes the table look like the benchmarks went down, until you realize it's the chat model.
@VatsaDev Sure. Thanks for the advice.
@jzhang38 I am curious why you did not have an instruction-tuning phase, and instead went directly to chat? Is supporting chat in video games the primary drive for you to work on this? I think with a small model, instruction tuning is more important, since the amount of the world it can model is limited anyway.
By the way, instead of pouring compute into large models, this project is what we really need: figuring out what we can get by throwing more compute at small models. If we can get RAG to work well with small models, that will have huge implications, and I think it is very possible.
@jzhang38 Thanks for the change, it clearly shows the progress now. @xiaoyunwu we talk about RAG/Toolformer in #10.
Thank you for your inquiry regarding the 1T checkpoint. You can already explore and test our existing checkpoints (including the 1T checkpoint) on Hugging Face through the following link: tinyLlama-intermediate-checkpoints. We are in the process of training our TinyLlama chat model, which will be available in the near future. We will also create a separate repo for the TinyLlama-1.1B 1T-token checkpoint later.