YehLi / xmodaler

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).

Request for extracted features for MSVD and MSR-VTT #13

Closed: Lechatelia closed this issue 3 years ago

Lechatelia commented 3 years ago

Dear authors, could you share your extracted features for MSVD and MSR-VTT, along with the extraction settings? I would also like to extract these features from videos myself, so it would be even better if you could share the extraction scripts in this repo. Thanks!

YehLi commented 3 years ago

I have uploaded the msvd and msr-vtt features to the msvd_dataset/msrvtt_dataset folder (https://drive.google.com/drive/folders/1vx9n7tAIt8su0y_3tsPJGvMPBMm8JLCZ?usp=sharing).

Lechatelia commented 3 years ago

Thanks for your help!