Closed alvarobartt closed 1 month ago
Hi friend! We love to hear that people find this benchmark useful. We would like to host the Arena Hard Auto questions on our own Hugging Face organization. We are releasing a new set of prompts soon and will continue releasing new versions for the community, so it would be easier for us to maintain the dataset on our own Hugging Face. Thanks!
Hey there! That's great to hear. I'll keep the dataset private on my end for testing and wait until you release yours to point our examples there. Thanks for the clarification!
Hi there @CodingWithTim! I've just re-opened this issue to ask about the progress of hosting the dataset within the https://huggingface.co/lmsys organization on the Hugging Face Hub.
Is there any short- or mid-term plan to do so? If so, we would be happy to help if needed!
@alvarobartt Thanks for letting us know! I just pushed the dataset to the LMSYS Hugging Face organization! Glad to see this being helpful to folks! Feel free to give us feedback anytime!
Description
Hi there! Awesome job with Arena Hard 👏🏻 I just opened this issue since we, at @argilla-io, are currently exploring the usage of `distilabel` for running benchmarks such as Arena Hard, and we were wondering whether it's a problem, or whether you have any issue with https://huggingface.co/datasets/alvarobartt/lmsys-arena-hard-v0.1 being hosted on the Hugging Face Hub on our end. We uploaded it there because we couldn't find the dataset per se on the Hub; we could only find https://huggingface.co/spaces/lmsys/arena-hard-browser with the `question.jsonl` file there. If there's already a dataset, or you'd like us to transfer ours to your org, we'll happily do so. We hope this is not an issue, but just let us know!
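For reference, a minimal sketch of reading a `question.jsonl`-style file with Python's standard library, assuming each line is a standalone JSON object (the field names in the sample record below are illustrative, not the actual schema):

```python
import json

def read_jsonl(path):
    """Parse a JSONL file: one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# Illustrative record; the real question.jsonl fields may differ.
sample = '{"question_id": "abc123", "turns": [{"content": "What is 2+2?"}]}'
record = json.loads(sample)
print(record["turns"][0]["content"])  # → What is 2+2?
```

Once the dataset is on the Hub, the same content can also be loaded directly with the `datasets` library via `load_dataset`, which handles JSONL parsing for you.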
Congrats again on the awesome job evaluating LLMs!