JUNJIE99 / MLVU

🔥🔥MLVU: Multi-task Long Video Understanding Benchmark

Issues for MLVU dataset #4

Closed jchsun1 closed 3 hours ago

jchsun1 commented 4 hours ago

Hello, thanks for your excellent work. I have a few questions:

  1. What is the difference between the MLVU dataset and the MLVU_Test dataset?
  2. I cloned the project from GitHub to my local machine. Does the data referenced in the project source code correspond to MLVU or MLVU_Test?
  3. The MLVU_Test annotations contain questions and candidate options but no answers. How can I evaluate my model on this dataset?

I'm looking forward to your answers. Thank you!

JUNJIE99 commented 4 hours ago

Hi,

Thank you for your interest.

  1. The MLVU dataset consists of two sets: the dev set and the test set. The multiple-choice questions in MLVU_Test have six options each to increase the difficulty. Additionally, to ensure a fairer evaluation, the test set does not provide ground truth. You can refer to the example at https://github.com/JUNJIE99/MLVU/blob/main/evaluation_test/test_res.json to organize your prediction results and submit the result file to us for evaluation.

  2. The repository contains evaluation code for both MLVU (dev) and MLVU_Test.

  3. Since ground truth is not provided for the MLVU_Test set, please organize your prediction results according to the example at https://github.com/JUNJIE99/MLVU/blob/main/evaluation_test/test_res.json and submit them to us for evaluation.
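
As a rough illustration of the "organize your prediction results" step, here is a minimal Python sketch that turns model predictions into a JSON result file. The field names `question_id` and `option`, the helper names, and the output file name are assumptions made for illustration only; the schema that actually counts is the one in the linked `test_res.json`, so please adjust the keys to match it before submitting.

```python
import json
from string import ascii_uppercase

# Assumption: MLVU_Test multiple-choice questions have six options (A-F),
# as stated in the reply above.
NUM_OPTIONS = 6


def index_to_letter(index, num_options=NUM_OPTIONS):
    """Map a zero-based option index to its letter (0 -> "A", 5 -> "F")."""
    if not 0 <= index < num_options:
        raise ValueError(f"option index {index} is outside 0..{num_options - 1}")
    return ascii_uppercase[index]


def build_results(predictions):
    """Convert (question_id, option_index) pairs into a list of records.

    The keys "question_id" and "option" are hypothetical; rename them to
    whatever the example test_res.json in the repository uses.
    """
    return [
        {"question_id": question_id, "option": index_to_letter(option_index)}
        for question_id, option_index in predictions
    ]


def save_results(results, path):
    """Write the records to a UTF-8 JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(results, f, ensure_ascii=False, indent=2)
```

For example, `save_results(build_results([("q1", 0), ("q2", 5)]), "my_test_res.json")` would write two records with options "A" and "F". The question IDs here are placeholders, not real MLVU identifiers.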

Thank you.

jchsun1 commented 3 hours ago

Thanks for your quick reply and guidance. I can solve my problems now.