RenShuhuai-Andy / TimeChat

[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
https://arxiv.org/abs/2312.02051
BSD 3-Clause "New" or "Revised" License

Could you test TimeChat on the EgoSchema dataset? #34

Closed · EricLina closed this 3 months ago

EricLina commented 3 months ago

Hello! I have been following the development of the TimeChat model and its application to long video processing. I understand that EgoSchema serves as a crucial benchmark for evaluating long video question-answering (QA) systems. Given the significance of EgoSchema in the field, I would like to ask whether there are plans to test the TimeChat model on the EgoSchema dataset.

Testing the TimeChat model on EgoSchema would not only provide valuable insight into its performance but also enable a comparative analysis with other long video QA models. I believe such an evaluation would greatly benefit the research community and further clarify the TimeChat model's capabilities.

RenShuhuai-Andy commented 3 months ago

Hi, thanks for your interest.

We have evaluated TimeChat on several benchmarks (MVBench, TempCompass; see https://github.com/RenShuhuai-Andy/TimeChat/blob/master/docs/EVAL.md). Results on EgoSchema will be added soon.

RenShuhuai-Andy commented 3 months ago

The accuracy of TimeChat-7b on EgoSchema is 33%, see https://github.com/RenShuhuai-Andy/TimeChat/blob/master/docs/EVAL.md#egoschema
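For context on how a number like this is typically computed: EgoSchema is a multiple-choice QA benchmark, so reported accuracy is simply the fraction of questions where the model's selected option matches the ground truth. A minimal sketch (the function name and the id-to-option dict format are illustrative, not from the TimeChat codebase):

```python
# Hypothetical sketch of multiple-choice accuracy scoring.
# predictions / answers: dicts mapping question id -> chosen option index.

def multiple_choice_accuracy(predictions: dict, answers: dict) -> float:
    """Return the fraction of predictions that match the gold answers."""
    if not predictions:
        return 0.0
    correct = sum(
        1 for qid, pred in predictions.items() if answers.get(qid) == pred
    )
    return correct / len(predictions)


if __name__ == "__main__":
    preds = {"q1": 0, "q2": 3, "q3": 1}
    gold = {"q1": 0, "q2": 2, "q3": 1}
    # q1 and q3 match, q2 does not -> 2/3 accuracy
    print(multiple_choice_accuracy(preds, gold))
```

The official EVAL.md linked above is the authoritative source for the exact evaluation protocol.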