Hi, Shizhe, thanks for your great work. I downloaded the MSR-VTT dataset you provided, and I have a question. I found that not every video corresponds to 20 captions. Some videos only correspond to less than 20 captions. I would like to ask if you specifically selected these captions and how to choose them?
Hi, Shizhe, thanks for your great work. I downloaded the MSR-VTT dataset you provided, and I have a question. I found that not every video corresponds to 20 captions. Some videos only correspond to less than 20 captions. I would like to ask if you specifically selected these captions and how to choose them?