EvolvingLMMs-Lab / lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval
https://lmms-lab.github.io/

token: False does not work in gqa.yaml #62

Closed: baiyuting closed this issue 2 months ago

baiyuting commented 2 months ago

I set token: False in gqa.yaml.

When running CUDA_VISIBLE_DEVICES=3 accelerate launch --num_processes=1 -m lmms_eval --model llava --model_args pretrained="/home/xxx/huggingface/liuhaotian/llava-v1.5-7b,device_map=auto,use_flash_attention_2=False" --tasks gqa --batch_size 1 --log_samples --log_samples_suffix reproduce --output_path ./logs/

I get an error:

huggingface_hub.utils._headers.LocalTokenNotFoundError: Token is required (`token=True`), but no token found. You need to provide a token or be logged in to Hugging Face with `huggingface-cli login` or `huggingface_hub.login`. See https://huggingface.co/settings/tokens.
04-24 20:50:35 [lmms-eval/lmms_eval/__main__.py:213] ERROR Error during evaluation: Token is required (`token=True`), but no token found. You need to provide a token or be logged in to Hugging Face with `huggingface-cli login` or `huggingface_hub.login`. See https://huggingface.co/settings/tokens.
Luodian commented 2 months ago

Hi, you need to set it to `True`.

You may also need to run `huggingface-cli login` to connect your local environment to Hugging Face, since the GQA dataset needs to be downloaded.
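For reference, the same login can also be done from Python; a minimal sketch, where the token string is a hypothetical placeholder (create a real one at https://huggingface.co/settings/tokens):

```python
from huggingface_hub import login

# Authenticate this environment so dataset downloads can use your token.
login(token="hf_xxx")  # hypothetical placeholder token
```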

Alternatively, you can manually download the GQA dataset (in repo form, into a folder) and change `lmms-lab/GQA` to your local folder path.
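A minimal sketch of that second option, assuming the dataset repo has been cloned to a local folder (the path below is hypothetical):

```python
from datasets import load_dataset

# Point load_dataset at a local clone of lmms-lab/GQA instead of the hub id,
# so no Hugging Face token is needed.
dataset = load_dataset(
    "/home/me/datasets/GQA",       # hypothetical local folder
    "testdev_balanced_images",
    split="testdev",
)
```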

baiyuting commented 2 months ago

OK, I chose to manually download the GQA dataset and load it from my Hugging Face cache dir.

I also set token=False to avoid connecting to Hugging Face, since there is no need to download the GQA dataset again. I found that it still threw the error huggingface_hub.utils._headers.LocalTokenNotFoundError: Token is required (token=True), pointing at:

File "/home/yutingbai/test/lmms-eval/lmms_eval/tasks/gqa/utils.py", line 11, in gqa_doc_to_visual
    GQA_RAW_IMAGE_DATASET = load_dataset("lmms-lab/GQA", "testdev_balanced_images", split="testdev", token=True)

The token=True there is hard-coded, so I set it to token=False in utils.py as well, and then it worked.

Besides, I tested a model on the GQA dataset and got the following result:

[screenshot: evaluation results table]

So, does it mean that the accuracy is 1.7729%?

For llava-1.5-7B (lmms-eval), the Value and Stderr in the table are 0.6197328669 and 0.0043, respectively. So should the log look like the following?

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
| --- | --- | --- | --- | --- | --- | --- |
| gqa | Yaml | none | 0 | exact_match | 61.97328669 | ± 0.0043 |

I want to figure it out.

kcz358 commented 2 months ago

It means 1.7729% accuracy. You can check the log to see how your model responds. It is worth noting that exact match really means an exact string match: any extra characters, such as extra text or a newline, will cause the score to be 0. We will add some filter classes in the next release to handle this.
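As a toy illustration (not lmms-eval's actual code), this is how strict string comparison zeroes out near-miss answers, and how a simple normalization step of the kind mentioned could recover them; the normalize helper below is an assumption:

```python
def exact_match(prediction: str, reference: str) -> float:
    # Strict comparison: the full strings must be identical.
    return 1.0 if prediction == reference else 0.0

print(exact_match("yes", "yes"))     # 1.0
print(exact_match("yes\n", "yes"))   # 0.0 -- trailing newline
print(exact_match("Yes.", "yes"))    # 0.0 -- casing and punctuation

def normalize(text: str) -> str:
    # Hypothetical filter: strip whitespace/punctuation and lowercase.
    return text.strip().strip(".").lower()

print(exact_match(normalize("Yes.\n"), normalize("yes")))  # 1.0
```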

baiyuting commented 2 months ago

Ok