RenShuhuai-Andy / TimeChat

[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
https://arxiv.org/abs/2312.02051
BSD 3-Clause "New" or "Revised" License

Could you test TimeChat on the EgoSchema dataset? #34

Closed · EricLina closed this 3 months ago

EricLina commented 3 months ago

Hello! I have been following the development of the TimeChat model and its application to long video processing. I understand that EgoSchema serves as a crucial benchmark for evaluating long video question-answering (QA) systems. Given the significance of EgoSchema in the field, I would like to ask whether there are plans to test the TimeChat model on the EgoSchema dataset.

Testing the TimeChat model on EgoSchema would not only provide valuable insight into its performance but also enable a comparative analysis with other long video QA models. I believe such an evaluation would greatly benefit the research community and further clarify the TimeChat model's capabilities.

RenShuhuai-Andy commented 3 months ago

Hi, thanks for your interest.

We have evaluated TimeChat on several benchmarks (MVBench, TempCompass; see https://github.com/RenShuhuai-Andy/TimeChat/blob/master/docs/EVAL.md). Results on EgoSchema will be added soon.

RenShuhuai-Andy commented 3 months ago

The accuracy of TimeChat-7b on EgoSchema is 33%, see https://github.com/RenShuhuai-Andy/TimeChat/blob/master/docs/EVAL.md#egoschema
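For context on how a number like this is typically computed: EgoSchema is a multiple-choice QA benchmark, so reported accuracy is simply the fraction of questions where the model's selected option matches the ground truth. A minimal sketch (the function name and the id-to-option dict format are illustrative, not from the TimeChat codebase):

```python
# Hypothetical sketch of multiple-choice accuracy scoring.
# predictions / answers: dicts mapping question id -> chosen option index.

def multiple_choice_accuracy(predictions: dict, answers: dict) -> float:
    """Return the fraction of predictions that match the gold answers."""
    if not predictions:
        return 0.0
    correct = sum(
        1 for qid, pred in predictions.items() if answers.get(qid) == pred
    )
    return correct / len(predictions)


if __name__ == "__main__":
    preds = {"q1": 0, "q2": 3, "q3": 1}
    gold = {"q1": 0, "q2": 2, "q3": 1}
    # q1 and q3 match, q2 does not -> 2/3 accuracy
    print(multiple_choice_accuracy(preds, gold))
```

The official EVAL.md linked above is the authoritative source for the exact evaluation protocol.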