Closed (alvarobartt closed 1 month ago)
Hi friend! We love to hear that people find this benchmark useful. We would like to host the Arena Hard Auto questions on our own Hugging Face account. We are releasing a new set of prompts soon and will keep releasing new versions for the community, so it would be easier for us to maintain it on our Hugging Face. Thanks!
Hey here! That's great to hear. Then I'll keep mine private for testing and wait until you release yours to point our examples there. Thanks for the clarification!
Description
Hi here! Awesome job with Arena Hard 👏🏻 I just opened this issue since we, at @argilla-io, are currently exploring the use of `distilabel` for running benchmarks such as Arena Hard, and we were wondering whether you have any issue with https://huggingface.co/datasets/alvarobartt/lmsys-arena-hard-v0.1 being hosted on the Hugging Face Hub on our end.
We uploaded it there because we couldn't find the dataset itself on the Hub, only https://huggingface.co/spaces/lmsys/arena-hard-browser with the `question.jsonl` file in it.
If there's already a dataset, or you'd like us to transfer ours to your org, we'll happily do so. We hope this is not an issue, but just let us know!
Congrats on the awesome job evaluating LLMs again!