EvolvingLMMs-Lab / lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval
https://lmms-lab.github.io/

Tasks were not found: activitynetqa #239

Open jby20180901 opened 2 months ago

jby20180901 commented 2 months ago

2024-09-10 16:26:20.400 | WARNING | __main__:cli_evaluate_single:234 - --limit SHOULD ONLY BE USED FOR TESTING. REAL METRICS SHOULD NOT BE COMPUTED USING LIMIT.
2024-09-10 16:26:20.401 | INFO | __main__:cli_evaluate_single:269 - Evaluating on 1 tasks.
2024-09-10 16:26:20.401 | ERROR | __main__:cli_evaluate_single:275 - Tasks were not found: activitynetqa. Try lmms-eval --tasks list for list of available tasks
2024-09-10 16:26:20.401 | INFO | __main__:cli_evaluate_single:280 - Selected Tasks: []
Traceback (most recent call last):
  File "/home/jiangbaoyang/lmms-eval-main/lmms_eval/__main__.py", line 202, in cli_evaluate
    results, samples = cli_evaluate_single(args)
  File "/home/jiangbaoyang/lmms-eval-main/lmms_eval/__main__.py", line 298, in cli_evaluate_single
    results = evaluator.simple_evaluate(
  File "/home/jiangbaoyang/lmms-eval-main/lmms_eval/utils.py", line 434, in _wrapper
    return fn(*args, **kwargs)
  File "/home/jiangbaoyang/lmms-eval-main/lmms_eval/evaluator.py", line 86, in simple_evaluate
    assert tasks != [], "No tasks specified, or no tasks found. Please verify the task names."
AssertionError: No tasks specified, or no tasks found. Please verify the task names.
2024-09-10 16:26:20.568 | ERROR | __main__:cli_evaluate:216 - Error during evaluation: No tasks specified, or no tasks found. Please verify the task names.
Traceback (most recent call last):
  File "/home/jiangbaoyang/lmms-eval-main/lmms_eval/__main__.py", line 202, in cli_evaluate
    results, samples = cli_evaluate_single(args)
  File "/home/jiangbaoyang/lmms-eval-main/lmms_eval/__main__.py", line 298, in cli_evaluate_single
    results = evaluator.simple_evaluate(
  File "/home/jiangbaoyang/lmms-eval-main/lmms_eval/utils.py", line 434, in _wrapper
    return fn(*args, **kwargs)
  File "/home/jiangbaoyang/lmms-eval-main/lmms_eval/evaluator.py", line 86, in simple_evaluate
    assert tasks != [], "No tasks specified, or no tasks found. Please verify the task names."
AssertionError: No tasks specified, or no tasks found. Please verify the task names.

why?

YangYangGirl commented 2 months ago

I found that this dataset only supports single-GPU testing...

kcz358 commented 2 months ago

Hi @jby20180901 , when a task is not found in TASK_REGISTRY, there are usually two causes. First, the task config may be written incorrectly. Second, some dependencies may be missing, or the task's init lines may have errors that prevent the task from being imported. In your case, the error is most likely the second one. You might want to check whether HF_HOME is set in your environment, because we have to pick a folder to cache the unzipped videos, and setting this env var explicitly is what we recommend.

To check the error messages, you can add --verbosity DEBUG to see the exact error; otherwise we simply skip registering the task and continue with the evaluation.
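For example, a minimal sketch of these checks (adjust the invocation to however you normally launch lmms-eval):

# List the tasks that actually registered; activitynetqa should appear here once it imports cleanly.
python -m lmms_eval --tasks list

# Re-run with DEBUG verbosity to see why the task failed to register.
accelerate launch --num_processes=8 -m lmms_eval \
    --tasks activitynetqa \
    --verbosity DEBUG \
    ...   # plus your usual --model/--model_args/--output_path flags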

kcz358 commented 2 months ago

Hi @YangYangGirl , I don't see any issue when evaluating on multi-GPU. May I ask about your settings?

image
YangYangGirl commented 2 months ago

accelerate launch --num_processes=8 --main_process_port 22345 \
    -m lmms_eval \
    --model llava_onevision \
    --model_args pretrained=lmms-lab/llava-onevision-qwen2-0.5b-ov,conv_template=qwen_1_5,video_decode_backend=decord,model_name=llava_qwen,max_frames_num=32 \
    --tasks activitynetqa \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix llava_onevision \
    --output_path ./logs/

image

YangYangGirl commented 2 months ago

When I try to evaluate the model on the videomme dataset using the command below, it works:

accelerate launch --num_processes=8 --main_process_port 22345 \
    -m lmms_eval \
    --model llava_onevision \
    --model_args pretrained=lmms-lab/llava-onevision-qwen2-0.5b-ov,conv_template=qwen_1_5,video_decode_backend=decord,model_name=llava_qwen,max_frames_num=32 \
    --tasks videomme \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix llava_onevision \
    --output_path ./logs/

image

YangYangGirl commented 2 months ago

Thanks for the kind reply, but there seem to be some problems with the integration of the activitynetqa dataset :)

jby20180901 commented 2 months ago

I found the problem: the hf_home folder used in activitynetqa's util.py does not exist.
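For anyone hitting the same thing, a minimal fix sketch, assuming any writable directory is acceptable as the cache location (the path below is only an example):

export HF_HOME=/path/to/hf_cache   # example path; use any existing, writable directory
mkdir -p "$HF_HOME"                # make sure the folder actually exists before launching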

kcz358 commented 2 months ago

@YangYangGirl Hi, my commands are similar to yours, but I can't reproduce the issue on my side

TASK=$1
CKPT_PATH=$2

echo $TASK
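# Replace commas with underscores so a multi-task name is safe to use as the log suffix.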
TASK_SUFFIX="${TASK//,/_}"
echo $TASK_SUFFIX

accelerate launch --num_processes 8 --main_process_port 12345 -m lmms_eval \
    --model llava_onevision \
    --model_args pretrained=$CKPT_PATH,conv_template=qwen_1_5,model_name=llava_qwen,mm_spatial_pool_mode=bilinear \
    --tasks $TASK \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix ${TASK_SUFFIX} \
    --output_path ./logs/

I checked again on the current main branch that activitynetqa works with llava_onevision

image

You might want to check your environment with accelerate, or there might be other errors occurring while loading the files
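If it helps, a couple of generic checks (standard accelerate / pip commands, nothing specific to lmms-eval):

accelerate env        # prints the current accelerate configuration and environment
pip show accelerate   # confirms which accelerate version is installed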

YangYangGirl commented 2 months ago

I deleted the activitynetqa dataset and re-downloaded it, and now it works! Thanks for your patience.

KawaiiNotHawaii commented 1 week ago

Does the evaluation of activitynetqa require GPT calls? The paper uses traditional calculation-based metrics, but when running it in lmms-eval it keeps throwing GPT-call errors (yeah, I didn't set the OpenAI API key) and ends with no results.

Screenshot 2024-11-06 at 10 40 46
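For reference, what I'd expect to need here, assuming the task's scoring step reads the standard OPENAI_API_KEY environment variable, is something like:

export OPENAI_API_KEY=sk-...   # placeholder; a real key would be needed if the task scores answers with a GPT judge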