Hey,
the benchmarks in Appendix C are publicly available (see the Open LLM Leaderboard and HELM).
If you are looking for the tinyBenchmarks, you can find them here.
Hope this helps!
Hey, thanks for the reply!
I was referring to the benchmarks in Appendix C. While you describe the preprocessing steps you used, having your version of the data would be useful for reproducibility. For example, HELM Lite currently lists only 30 models vs. the 37 mentioned in the appendix.
Hi @petroskarypis,
Now we understand what you mean. We will release the datasets as soon as possible in our GitHub repo. If you want, I can send them to you via email in the meantime. Just email felipemaiapolo@gmail.com and I will reply with the datasets.
Thanks! That would be great.
Hi, thanks for making this interesting work open source! Are you guys planning to release the collection of model benchmarks described in Appendix C?