salesforce / ALPRO

Align and Prompt: Video-and-Language Pre-training with Entity Prompts

BSD 3-Clause "New" or "Revised" License

MSR-VTT dataset split #6

Closed: chaochen99 closed this issue 2 years ago

chaochen99 commented 2 years ago

Hi, thanks for sharing the code!

I saw "use 7k videos for training and report results on the 1k test split" in your paper. However, the MSR-VTT dataset I downloaded contains only a 7K training set and a 3K test set, with no validation set. Could you share the code for splitting the dataset, so that my results are not affected by a different partition?

Looking forward to your reply.

dxli94 commented 2 years ago

Once you have downloaded and unzipped data.zip, you will find the annotations for each split in, e.g., the msrvtt_ret directory. This partition follows the common protocol used in previous works. Thanks.
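To check which splits the unzipped annotations contain, a small script along these lines may help. It is only a sketch, not part of the ALPRO code base: the helper name `count_split_records` is hypothetical, and it assumes the annotation files are stored as `.json` or `.jsonl`, which this thread does not confirm.

```python
import json
from pathlib import Path


def count_split_records(annotation_dir):
    """Map each .json/.jsonl file under annotation_dir to its record count.

    Assumption: annotations are stored as JSON or JSON Lines files.
    Files with any other extension are ignored.
    """
    root = Path(annotation_dir)
    counts = {}
    for path in sorted(root.rglob("*")):
        key = path.relative_to(root).as_posix()
        if path.suffix == ".jsonl":
            # JSON Lines: one record per non-empty line.
            with path.open(encoding="utf-8") as f:
                counts[key] = sum(1 for line in f if line.strip())
        elif path.suffix == ".json":
            # Plain JSON: count top-level entries of a list or dict.
            with path.open(encoding="utf-8") as f:
                data = json.load(f)
            counts[key] = len(data) if isinstance(data, (list, dict)) else 1
    return counts
```

Calling `count_split_records` on the unzipped msrvtt_ret directory (its exact location depends on where data.zip was extracted) returns one entry per annotation file. Note that a record count is not necessarily a video count, since a video may appear once per caption.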