sdtblck opened this issue 1 year ago (status: Open)
Thanks for your interest! We plan to update the arXiv paper with the full evaluations soon.
For now, here is the perplexity (PPL) of the 2.7B model compared with GPT-Neo-2.7B on the Pile:
| Model | Pile PPL |
|---|---|
| GPT-Neo-2.7B | 5.7 |
| H3 + 3 attn (2.7B) | 5.4 |
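
For readers comparing these numbers: perplexity is commonly computed as the exponential of the mean per-token negative log-likelihood, so lower is better. Below is a minimal Python sketch of that definition, assuming losses in nats. The function name `perplexity` and the sample losses are illustrative assumptions, not taken from the H3 codebase or from the evaluation above, and the exact tokenization and normalization behind the table are not stated in this thread.

```python
import math


def perplexity(token_nlls):
    """Return exp(mean NLL) for a list of per-token negative log-likelihoods (nats)."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Made-up per-token losses, for illustration only (not real model outputs).
example_nlls = [1.2, 2.0, 1.6, 1.9]
print(perplexity(example_nlls))
```

Because the tokenizer and normalization affect the result, perplexities are only directly comparable when both models are scored the same way on the same data.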
We'll be updating with evaluations of everything else soon (after this week's ICML deadline).
The arXiv paper has now been updated with the evaluations: https://arxiv.org/abs/2212.14052
Hi, great work!
Very excited to try out the models.
Curious if you have a more detailed evaluation for the 2.7B model, as I can't find one in the H3 paper.