rese1f / MovieChat

[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
https://rese1f.github.io/MovieChat/
BSD 3-Clause "New" or "Revised" License
531 stars 41 forks

Hi, are there no "dense caption" and "answers" in the test annotation? #47

Open sunwhw opened 8 months ago

sunwhw commented 8 months ago

Thanks for your great work! Are there no "dense caption" and "answers" in the test annotation? And for the val set, do you plan to provide videos in mp4 format?

Espere-1119-Song commented 8 months ago

Yes. To keep the comparison fair for the CVPR 2024 workshop, we are withholding the dense captions and answers for now. We will release them after the workshop.

sunwhw commented 8 months ago

Such a quick response! Thank you sincerely!

sunwhw commented 8 months ago

And for the val set, do you plan to provide videos in mp4 format?

Espere-1119-Song commented 8 months ago

The val set videos on Hugging Face are already in mp4 format.

sunwhw commented 8 months ago

Oh, so they are hosted at https://huggingface.co/datasets/Enxin/MovieChat-1K-test. Does that set also come without answers and dense captions?

Espere-1119-Song commented 8 months ago

Yes, we will release the ground truth after the CVPR 2024 workshop.

sunwhw commented 8 months ago

OK, thanks!