X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
Request for extracted features for MSVD and MSR-VTT #13
Dear authors,
Could you share your extracted features for MSVD and MSR-VTT, along with the settings used to extract them?
By the way, I would also like to extract these features from videos myself, so it would be very helpful if you could share the extraction scripts in this repo.
Thanks!