jayleicn / ClipBERT

[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.
https://arxiv.org/abs/2102.06183
MIT License

How was the T set in the default setting? #34

Closed: JianJuly closed this issue 2 years ago

JianJuly commented 2 years ago

In Section 4.2, "Analysis of Sparse Sampling", the paper states: "If not otherwise stated, we randomly sample a single frame (Ntrain=1 and T=1) from full-length videos for training, and use the middle frame (Ntest=1) for inference, with input image size L=448." I am confused: unless otherwise stated in the following analysis, is T at training equal to T at test, or is T at test always 1? I ask because I noticed the paper does not define a separate T_train or T_test.

jayleicn commented 2 years ago

Yes, T at test time is always kept the same as in training. We use T=2 for ActivityNet and T=1 for all the other datasets.
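
For readers following along, here is a minimal sketch of the sampling scheme discussed above: N clips per video, T frames per clip, with the same T used for training and testing. It is not taken from the ClipBERT code base. The function names, the `clip_len` window size, and the exact placement of clips are assumptions made for illustration only.

```python
import random


def evenly_spaced(start, length, t_frames):
    """Pick t_frames indices spread evenly over [start, start + length)."""
    return [start + int((i + 0.5) * length / t_frames) for i in range(t_frames)]


def sample_frame_indices(num_frames, n_clips, t_frames,
                         clip_len=16, train=True, rng=None):
    """Return n_clips lists, each holding t_frames frame indices.

    Hypothetical helper: clip_len (frames per clip window) is an assumed
    parameter, not something stated in the thread.
    """
    rng = rng or random.Random()
    clip_len = min(clip_len, num_frames)
    last_start = num_frames - clip_len
    clips = []
    for i in range(n_clips):
        if train:
            # Training: each clip window starts at a random position.
            start = rng.randint(0, last_start)
        else:
            # Testing: clip windows are centered at uniform positions.
            center = (i + 0.5) * num_frames / n_clips
            start = max(0, min(int(center - clip_len / 2), last_start))
        # The same t_frames (T) is used in both modes.
        clips.append(evenly_spaced(start, clip_len, t_frames))
    return clips
```

With these assumptions, `sample_frame_indices(100, n_clips=1, t_frames=1, train=False)` returns `[[50]]`, the middle frame of a 100-frame video, and setting `t_frames=2` (the ActivityNet value mentioned above) returns `[[46, 54]]`.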

JianJuly commented 2 years ago

Got it, thank you! @jayleicn

wintersurvival commented 2 years ago

Thank you! Which parameter in the config file corresponds to T? @JianJuly