microsoft / VideoX

VideoX: a collection of video cross-modal models

About Frame Sampling #84

Closed yusufani closed 1 year ago

yusufani commented 1 year ago

Hi,

Thank you for the X-CLIP project. While running the code from Hugging Face to try it out, I noticed that 8 frames are sampled sequentially. Do these frames have to be sequential, or would it make sense to randomly sample 8 frames within 1 second?

[image: screenshot of the code snippet]

nbl97 commented 1 year ago

Thanks for your interest. In the original paper, the frames are sampled with a sparse strategy, i.e., they are sampled uniformly across the video to capture global information. In your code snippet, you can control the interval between sampled frames with frame_sample_rate. Hope this helps.
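To make the two strategies concrete, here is a minimal sketch in plain Python. It is not the code from the paper or from the snippet above. The function names uniform_frame_indices and strided_frame_indices are hypothetical, and taking the center frame of each segment is an assumption about how "uniform" sampling could be done.

```python
def uniform_frame_indices(total_frames, num_frames=8):
    """Sparse sampling sketch: spread num_frames indices evenly over the video."""
    if total_frames <= 0 or num_frames <= 0:
        raise ValueError("total_frames and num_frames must be positive")
    # Split the video into num_frames equal segments and take the
    # center frame of each segment (an assumed choice, for illustration).
    segment = total_frames / num_frames
    return [min(total_frames - 1, int(segment * i + segment / 2))
            for i in range(num_frames)]


def strided_frame_indices(start, num_frames, frame_sample_rate, total_frames):
    """Take every frame_sample_rate-th frame from start, clamped to the last frame."""
    if frame_sample_rate <= 0:
        raise ValueError("frame_sample_rate must be positive")
    return [min(total_frames - 1, start + i * frame_sample_rate)
            for i in range(num_frames)]


if __name__ == "__main__":
    # A 300-frame video: uniform sampling covers the whole clip,
    # while a stride of 1 only covers the first 8 consecutive frames.
    print(uniform_frame_indices(300, 8))
    print(strided_frame_indices(0, 8, 1, 300))
    print(strided_frame_indices(0, 8, 30, 300))
```

With a stride of 1 the sampled frames are consecutive, which matches the behavior described in the question; a larger frame_sample_rate widens the temporal span that the same 8 frames cover.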

yusufani commented 1 year ago

Thank you for your answer. I tried sampling one frame per second and it works quite well 🥳
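For anyone wanting to reproduce the one-frame-per-second setting, the sketch below shows one way to derive a frame_sample_rate value from the video's frame rate. The helper names rate_for_interval and frames_needed are hypothetical, and the fps value is assumed to come from your video reader.

```python
def rate_for_interval(fps, seconds_between_frames=1.0):
    """Convert a time interval into a frame stride (at least 1)."""
    if fps <= 0 or seconds_between_frames <= 0:
        raise ValueError("fps and seconds_between_frames must be positive")
    return max(1, round(fps * seconds_between_frames))


def frames_needed(clip_len, frame_sample_rate):
    """Number of video frames spanned by clip_len samples at the given stride."""
    return (clip_len - 1) * frame_sample_rate + 1


if __name__ == "__main__":
    rate = rate_for_interval(fps=30.0)  # one frame per second at 30 fps
    print(rate)
    # The video must contain at least this many frames for 8 samples.
    print(frames_needed(8, rate))
```

If the video is shorter than the span reported by frames_needed, either lower the stride or clamp the indices to the last frame.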

nbl97 commented 1 year ago

Please feel free to ping me if you have further questions.