mbzuai-oryx / Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
https://mbzuai-oryx.github.io/Video-ChatGPT
Creative Commons Attribution 4.0 International
1.05k stars 92 forks

[benchmark] Some questions about the details to generate files in step 1 during `Video-based Generative Performance Benchmarking`. #47

Closed: aopolin-lv closed this 5 months ago

aopolin-lv commented 10 months ago

Hello, following the instructions for step 1 in quantitative_evaluation, I obtained three files:

  1. one file generated with generic_qa.json and run_inference_benchmark_general.py
  2. one file generated with consistency_qa.json and run_inference_benchmark_consistency.py
  3. one file generated with temporal_qa.json and run_inference_benchmark_general.py

Do I need to generate any other files? And how are these files used in step 2? More specifically, if I want to evaluate correctness and detailed orientation, which file generated in step 1 should I pass as pred_path in the step 2 command?

hanoonaR commented 8 months ago

Hi @Aopolin-Lv,

Apologies for the late reply. Note that the question-answer pairs (gt_file) are the same for correctness, detailed orientation, and contextual understanding.

So, in your case, to evaluate the correctness and detailed orientation criteria, run the first step using the command below. The predictions are stored in the directory given by --output_dir <output-dir-path>.

python video_chatgpt/eval/run_inference_benchmark_general.py \
    --video_dir <path-to-directory-containing-videos> \
    --gt_file <ground-truth-file-containing-question-answer-pairs> \
    --output_dir <output-dir-path> \
    --output_name <output-file-name> \
    --model-name <path-to-LLaVA-Lightening-7B-v1-1> \
    --projection_path <path-to-Video-ChatGPT-weights>

To evaluate these criteria, pass the same path you used for --output_dir in step 1 as the pred_path in the step 2 command.
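As a rough illustration of how the two steps connect, here is a minimal Python sketch. It only assembles the step 1 command from the flags shown above and records which pred_path each criterion would receive in step 2. The helper names (`build_step1_command`, `step2_pred_paths`), the criterion labels, and all example paths are hypothetical and not part of the repository.

```python
import shlex

# Criteria that, per the reply above, share one ground-truth QA file.
# The label strings themselves are hypothetical, chosen for this sketch.
SHARED_CRITERIA = ("correctness", "detailed_orientation", "contextual_understanding")


def build_step1_command(video_dir, gt_file, output_dir, output_name,
                        model_name, projection_path):
    """Return the step 1 inference command as an argument list.

    Only the flags shown in the command above are used.
    """
    return [
        "python", "video_chatgpt/eval/run_inference_benchmark_general.py",
        "--video_dir", video_dir,
        "--gt_file", gt_file,
        "--output_dir", output_dir,
        "--output_name", output_name,
        "--model-name", model_name,
        "--projection_path", projection_path,
    ]


def step2_pred_paths(output_dir, criteria=SHARED_CRITERIA):
    """Map each criterion to the pred_path it should receive in step 2.

    Following the reply, every shared criterion reuses the step 1 --output_dir.
    """
    return {criterion: output_dir for criterion in criteria}


if __name__ == "__main__":
    # All paths below are placeholders, not real locations.
    cmd = build_step1_command(
        video_dir="videos",
        gt_file="generic_qa.json",
        output_dir="outputs/generic",
        output_name="generic_preds",
        model_name="path/to/LLaVA-Lightening-7B-v1-1",
        projection_path="path/to/Video-ChatGPT-weights",
    )
    print(shlex.join(cmd))
    for criterion, pred_path in step2_pred_paths("outputs/generic").items():
        print(f"{criterion}: pred_path = {pred_path}")
```

The point of the sketch is that a single inference run on the shared gt_file is enough for all three criteria, since each evaluation reads the same predictions.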

Hope it's clear now.