zhengrongz closed this 2 months ago
Yes, I believe it is fine to do so. In our internal evaluations, we found that our video models generalize well to longer inputs (i.e., more input frames), and they usually perform better when given the longer input.
You can specify this argument explicitly in your own script to support training with more video frames.
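For anyone landing here later, this is a rough sketch of what "specifying the number of frames explicitly" could look like in a custom script. It is not code from this repository: the function name `sample_frame_indices` and the parameter `num_frames` are hypothetical, and the repo's actual sampling logic may differ. It only illustrates uniformly picking 32 frame indices from a video.

```python
def sample_frame_indices(total_frames: int, num_frames: int = 32) -> list[int]:
    """Pick `num_frames` roughly evenly spaced frame indices.

    Hypothetical helper for illustration; not the repository's sampler.
    """
    if total_frames <= 0 or num_frames <= 0:
        raise ValueError("total_frames and num_frames must be positive")
    # Short clip: keep every frame rather than repeating any.
    if total_frames <= num_frames:
        return list(range(total_frames))
    # Split the video into num_frames equal segments and take the
    # frame at the midpoint of each segment.
    segment = total_frames / num_frames
    return [int((i + 0.5) * segment) for i in range(num_frames)]


if __name__ == "__main__":
    # Example: a 300-frame video sampled down to 32 frames.
    indices = sample_frame_indices(300, num_frames=32)
    print(len(indices), indices[:4])
```

Passing the frame count as an argument like this, instead of editing a shared constant, keeps the default untouched for other scripts.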
@lixin4ever OK thank you!
Hi! Thanks for your excellent work! I wonder whether I can use 32 frames per video to fine-tune the model on my own dataset. If so, do I just need to change the number of sampled frames in the constants? Looking forward to your reply!