ArrowLuo / CLIP4Clip

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
https://arxiv.org/abs/2104.08860
MIT License

Time of training in MSR-VTT dataset. #8

Closed. BlueCat7 closed this issue 3 years ago

BlueCat7 commented 3 years ago

When I train on the MSR-VTT dataset, it is very slow. My videos are 720p, which may be too large. However, even after resizing all videos so that the short side is 256, training is still not very fast. What is the resolution of your MSR-VTT videos, and how long does it take to train on MSR-VTT? Thanks!

ArrowLuo commented 3 years ago

End-to-end training is indeed slow. In our experiments, the sim_header with meanP on MSR-VTT takes 1 h 30 min per epoch. We down-sampled the videos in advance to reduce the time cost. We did not change the resolution, but that is worth testing.
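As an illustration of down-sampling videos ahead of training, here is a minimal sketch that re-encodes a file at a lower frame rate with ffmpeg. It assumes ffmpeg is installed and on the PATH. The function names, the default of 3 frames per second, and the optional `short_side` resize are assumptions made for this example, not settings confirmed in this thread.

```python
import subprocess


def build_ffmpeg_cmd(src, dst, fps=3, short_side=None):
    """Build an ffmpeg command that lowers the frame rate of `src`.

    `fps` and `short_side` are illustrative values, not the authors' settings.
    """
    cmd = ["ffmpeg", "-y", "-i", str(src)]
    if short_side is not None:
        # Resize so the shorter side equals `short_side`; -2 lets ffmpeg pick
        # the other side, keeping the aspect ratio and an even pixel count.
        scale = (
            f"scale='if(gt(iw,ih),-2,{short_side})':"
            f"'if(gt(iw,ih),{short_side},-2)'"
        )
        cmd += ["-vf", scale]
    # -r sets the output frame rate; -an drops the audio track.
    cmd += ["-r", str(fps), "-an", str(dst)]
    return cmd


def compress_video(src, dst, **kwargs):
    """Run the command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_cmd(src, dst, **kwargs), check=True)
```

Calling `compress_video("in.mp4", "out.mp4")` changes only the frame rate, which matches the "did not change the resolution" setup described above. Passing `short_side=256` would additionally try the resize mentioned in the question.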

BlueCat7 commented 3 years ago

OK, thank you. I will also try saving the videos as tensors offline; that should be faster.
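The offline-caching idea can be sketched as a decode-once wrapper: the first request for a video decodes it and writes the result to disk, and later requests read the cached copy. Everything here is hypothetical: `load_or_decode`, the `decode_fn` argument, and the `.pkl` naming are made up for the example, and `pickle` stands in for whatever tensor format the training code would actually use.

```python
import pickle
from pathlib import Path


def load_or_decode(video_path, cache_dir, decode_fn):
    """Return decoded frames for `video_path`, decoding only on a cache miss.

    `decode_fn` is a caller-supplied function (hypothetical here) that takes
    a video path and returns the frames in any picklable form.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / (Path(video_path).stem + ".pkl")

    if cache_file.exists():
        # Cache hit: skip video decoding entirely.
        with cache_file.open("rb") as f:
            return pickle.load(f)

    # Cache miss: do the expensive decode once and store the result.
    frames = decode_fn(video_path)
    with cache_file.open("wb") as f:
        pickle.dump(frames, f)
    return frames
```

The trade-off is disk space for decode time: cached frames are typically much larger than the compressed video, so this tends to pay off mainly when only a few frames per video are kept.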

jayleicn commented 2 years ago

Hi @BlueCat7, could you share a copy of the 720p MSRVTT videos? Thanks!

ArrowLuo commented 2 years ago

Hi @jayleicn, I think you can find what you want in the data shared by Frozen in Time, i.e.,

wget https://www.robots.ox.ac.uk/~maxbain/frozen-in-time/data/MSRVTT.zip

We have updated this link in the README. Thanks to them for their contribution.

jayleicn commented 2 years ago

Thanks @ArrowLuo! I have downloaded the videos from the provided link, but the videos are 240p, instead of 720p.