stanford-crfm / helm

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).
https://crfm.stanford.edu/helm
Apache License 2.0
1.8k stars 239 forks

Unable to download HELM leaderboard results (v1.3.0) #2680

Open jasonwright38 opened 2 months ago

jasonwright38 commented 2 months ago

Bug

Hi, I am unable to follow the instructions here: https://crfm-helm.readthedocs.io/en/v0.5.1/get_helm_rank/#download-helm-leaderboard-results to download HELM leaderboard results for the latest version, v1.3.0.

This works without issue for the version described in the documentation (v0.3.0), but does not work for any of the later versions.

Error

(venv) ➜ ~ export LEADERBOARD_VERSION=v1.3.0
(venv) ➜ ~ curl -O https://storage.googleapis.com/crfm-helm-public/benchmark_output/archives/$LEADERBOARD_VERSION/run_stats.zip
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   225  100   225    0     0    299      0 --:--:-- --:--:-- --:--:--   300
(venv) ➜ ~ mkdir -p benchmark_output/runs/$LEADERBOARD_VERSION && unzip run_stats.zip -d benchmark_output/runs/$LEADERBOARD_VERSION
Archive:  run_stats.zip
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of run_stats.zip or
        run_stats.zip.zip, and cannot find run_stats.zip.ZIP, period.
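
The transfer above is only 225 bytes, which is far too small for a results archive, so the file saved as run_stats.zip is most likely an error response rather than a zip. A minimal sketch for checking this before calling unzip, assuming a bash shell; `is_zip_file` is a hypothetical helper name and not part of HELM:

```shell
#!/usr/bin/env bash
# Sketch only: is_zip_file is a hypothetical helper, not a HELM command.
# It succeeds when the file starts with the ZIP local-file-header
# signature "PK\x03\x04", which every zip with at least one entry has.
is_zip_file() {
  local magic
  # Read the first 4 bytes and render them as a lowercase hex string.
  magic="$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')"
  [ "$magic" = "504b0304" ]
}

# Example use (assumes run_stats.zip is in the current directory):
#   is_zip_file run_stats.zip || head -c 300 run_stats.zip
```

If the check fails, the commented command prints the start of the file so the actual response can be read. Passing `-f` to curl makes it exit with an error on an HTTP failure instead of saving the error body; that helps only if the server returned an HTTP error status, which the output above does not show.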

Any support or advice with this issue is much appreciated.

Thanks.

yifanmai commented 2 months ago

Hi, we no longer support zip files for recent versions of HELM.

jasonwright38 commented 2 months ago

Hi @yifanmai, thanks for your reply. Is there a supported file format for recent HELM versions? If so, is there a command I can use to download recent sets of results (v1.3.0 preferably)? Thanks in advance.

mayankjobanputra commented 2 months ago

Hey, we are also interested in the results for v1.3 and are unable to download them.

yifanmai commented 2 months ago

Hi @jasonwright38 and @mayankjobanputra, you can download the raw results by following these instructions in the documentation. Let me know if you have any more questions.

jasonwright38 commented 2 months ago

Thanks for getting back to me, @yifanmai. I was able to follow those instructions and import the v1.3.0 results. Much appreciated!

mayankjobanputra commented 2 months ago

Me too! Thanks @yifanmai

mayankjobanputra commented 1 month ago

@yifanmai, when I downloaded the lite version, a lot of models that are still present on the website were missing from my download.

Am I doing something wrong? I followed the steps below:

export GCS_BENCHMARK_OUTPUT_PATH=gs://crfm-helm-public/lite/benchmark_output

gsutil -m rsync -r $GCS_BENCHMARK_OUTPUT_PATH/ $LOCAL_BENCHMARK_OUTPUT_PATH/

It downloaded about 14GB of data.

<>:/fast_ssd/HELM$ ls
releases  runs

<>:/fast_ssd/HELM$ cd runs/

<>:/fast_ssd/HELM/runs$ ls
v1.0.0  v1.1.0  v1.2.0  v1.3.0  v1.3.0-canary  v1.4.0  v1.4.0-canary

yifanmai commented 1 week ago

Hi @mayankjobanputra, which models are missing from your download?
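
One way to answer this is to list the models present in a local suite directory and compare that list with the website. The sketch below assumes a bash shell and assumes each run directory name contains a comma-separated `model=<name>` field; `list_models` is a hypothetical helper and not part of HELM:

```shell
#!/usr/bin/env bash
# Sketch only: list_models is a hypothetical helper, not a HELM command.
# Assumption: run directories are named with comma-separated fields,
# one of which is "model=<name>".
list_models() {
  local suite_dir="$1"
  # Take the names of the immediate subdirectories, extract the value of
  # the model= field from each, and print every distinct value once.
  find "$suite_dir" -mindepth 1 -maxdepth 1 -type d -exec basename {} \; \
    | sed -n 's/.*model=\([^,]*\).*/\1/p' \
    | sort -u
}
```

For the download described above, this could be called as `list_models "$LOCAL_BENCHMARK_OUTPUT_PATH/runs/v1.3.0"`. Directories without a `model=` field are skipped.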