EleutherAI / lm-evaluation-harness

A framework for few-shot evaluation of language models.
https://www.eleuther.ai
MIT License
7.12k stars, 1.91k forks

ValueError occurs when trying to evaluate task "bigbench_multiple_choice" #1742

Closed. abzb1 closed this 6 months ago

abzb1 commented 7 months ago

Hello,

I'm trying to evaluate some Hugging Face 🤗 models with lm-eval. When I use the "bigbench_multiple_choice" task, I encounter a ValueError in certain subtasks. I'd appreciate help resolving this.

Below is my script:

lm_eval --model hf \
    --model_args pretrained=allenai/OLMo-7B,trust_remote_code=true \
    --tasks bigbench_multiple_choice \
    --device cuda:0 \
    --batch_size auto \
    --log_samples \
    --output_path logit_result
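Since the error only shows up in certain subtasks, one way to narrow it down is to run a single subtask on a handful of examples. The following is a minimal sketch, not a confirmed fix: the subtask name is the one mentioned in the warning quoted in this report, `--limit` is the harness option for evaluating only the first N examples (meant for testing only), and the output directory name `logit_result_single` and the `RUN` toggle are made up for illustration.

```shell
# Assumption: this subtask is one of the failing ones; swap in any other name.
SUBTASK="bigbench_conlang_translation_multiple_choice"
MODEL_ARGS="pretrained=allenai/OLMo-7B,trust_remote_code=true"

# Build the command as positional parameters so the quoting stays intact.
set -- lm_eval --model hf \
    --model_args "$MODEL_ARGS" \
    --tasks "$SUBTASK" \
    --device cuda:0 \
    --batch_size 1 \
    --limit 10 \
    --log_samples \
    --output_path logit_result_single

# Print the command first; execute it only when RUN=1 is set (hypothetical toggle).
echo "$*"
if [ "${RUN:-0}" = "1" ]; then
    "$@"
fi
```

Saved to a file with an arbitrary name, say `single_subtask.sh`, running `RUN=1 sh single_subtask.sh` executes the command, while running it without `RUN=1` only prints it. A fixed `--batch_size 1` and a small `--limit` keep the run short so the traceback appears quickly.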

I also tried some other models (Llama 3, Mistral), but it still fails with an error like "[Task: bigbench_conlang_translation_multiple_choice] has_training_docs and has_validation_docs are False, using test_docs as fewshot_docs but this is not recommended. .... File "