dvlab-research / LLaMA-VID

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)
Apache License 2.0

About evaluation on vqav2 dataset #63

Open liziming5353 opened 7 months ago

liziming5353 commented 7 months ago

May I ask how you evaluated on the VQAv2 dataset? I couldn't find the annotation file for the test set on the official website.

yanwei-li commented 7 months ago

Hi, we report results on the VQAv2 test-dev split. The test-set annotations are not publicly released; you instead submit predictions to the evaluation server here
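For context, the VQAv2 test server (hosted on EvalAI) expects a JSON file containing a list of `{"question_id": ..., "answer": ...}` entries for the test split. A minimal sketch of packaging model outputs into that format, assuming a hypothetical `predictions` list of `(question_id, answer)` pairs produced by your own inference loop:

```python
import json

# Hypothetical model outputs: (question_id, predicted answer) pairs.
# In practice these come from running inference over the test-dev questions.
predictions = [
    (262148000, "yes"),
    (262148001, "2"),
]

# The server expects a JSON list of {"question_id": int, "answer": str}.
submission = [{"question_id": qid, "answer": ans} for qid, ans in predictions]

with open("vqav2_testdev_results.json", "w") as f:
    json.dump(submission, f)
```

The resulting `vqav2_testdev_results.json` is what you upload to the test-dev phase of the challenge; scores are returned by the server since ground-truth test annotations are withheld.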