OpenGVLab / InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Apache License 2.0

Method of running evaluation on MSR-VTT dataset #153

Open sartaki opened 1 month ago

sartaki commented 1 month ago

Thanks for the paper and for open-sourcing the code base.

I would like to know how evaluation is performed on the MSR-VTT dataset for zero-shot text-to-video retrieval.

Looking forward to your clarification. Thanks!

Andy1621 commented 1 month ago

Hi! In the latest version, we follow Unmasked Teacher to conduct the evaluation. Please check the code and metadata~

For Q1, we use the 1k subset for testing. For Q2, we use only one caption for each video.
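
For readers unfamiliar with this setup, here is a minimal sketch of how Recall@K is commonly computed for text-to-video retrieval when each test video has exactly one caption, as described above. It is not taken from the InternVideo or Unmasked Teacher code: the function name `recall_at_k` and the toy similarity matrix are hypothetical, and the sketch assumes that caption `i` is paired with video `i`, so the ground truth lies on the diagonal.

```python
import numpy as np


def recall_at_k(sim, ks=(1, 5, 10)):
    """Text-to-video Recall@K in percent.

    Assumption (not from the repository): sim[i, j] is the similarity
    between caption i and video j, and caption i belongs to video i.
    """
    sim = np.asarray(sim, dtype=float)
    # Score of the correct video for each caption (the diagonal).
    gt = np.diag(sim)[:, None]
    # Rank of the correct video = number of videos scoring strictly higher.
    ranks = (sim > gt).sum(axis=1)
    return {f"R@{k}": float((ranks < k).mean() * 100) for k in ks}


# Toy example with 3 captions and 3 videos (hypothetical numbers).
# Caption 0 ranks a wrong video first; captions 1 and 2 retrieve correctly.
toy_sim = np.array([
    [0.1, 0.9, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
print(recall_at_k(toy_sim, ks=(1, 2)))
```

With the 1k test subset and one caption per video, `sim` would be a 1000 x 1000 matrix of text-video similarity scores produced by the model; how those scores are computed is defined by the repository's own evaluation code, not by this sketch.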

sartaki commented 1 month ago

Thanks @Andy1621. I will look at the links you pointed to and get back to you if I have any questions. Thanks.