lm-sys / arena-hard-auto

Arena-Hard-Auto: An automatic LLM benchmark.
Apache License 2.0

[Q] About hosting `arena-hard-v0.1/question.jsonl` in the Hugging Face Hub #19

Closed alvarobartt closed 1 month ago

alvarobartt commented 1 month ago

Description

Hi here! Awesome job with Arena Hard 👏🏻 I opened this issue because we, at @argilla-io, are currently exploring using distilabel to run benchmarks such as Arena Hard, and we were wondering whether you have any objection to https://huggingface.co/datasets/alvarobartt/lmsys-arena-hard-v0.1 being hosted on the Hugging Face Hub on our end.

We uploaded it there because we couldn't find the dataset itself on the Hub; we could only find https://huggingface.co/spaces/lmsys/arena-hard-browser, which has the question.jsonl file.

If there's already a dataset, or you'd like us to transfer ours to your org, we'll happily do so. Hope this is not an issue, but just let us know!

Congrats again on the awesome job evaluating LLMs!

CodingWithTim commented 1 month ago

Hi friend! We love to hear that people find this benchmark useful. We would like to host the Arena-Hard-Auto questions on our own Hugging Face account. We are releasing a new set of prompts soon and will keep releasing new versions for the community, so it would be easier for us to maintain it there. Thanks!

alvarobartt commented 1 month ago

Hey here! That's great to hear. I'll keep our copy private for my own testing and wait until you release yours so we can point our examples there. Thanks for the clarification!