PKU-YuanGroup / LanguageBind

【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
https://arxiv.org/abs/2310.01852
MIT License

Does video feature extraction support a dynamic number of frames? Would results degrade compared with using 8 frames? #27

Closed 1093842024 closed 4 months ago

LinB203 commented 4 months ago

Thank you for your interest. Inputs with a frame count other than 8 are not supported at this time, and we have not run ablation experiments on this.
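One possible workaround for clips of varying length, offered here only as a sketch and not as something confirmed in this thread or taken from the LanguageBind code, is to resample every clip to exactly 8 frames before feature extraction. The helper name `sample_frame_indices` and its parameters are hypothetical; it picks the center frame of each of 8 equal-length segments, repeating frames when the clip is shorter than 8 frames.

```python
def sample_frame_indices(total_frames, num_frames=8):
    """Return `num_frames` frame indices spread uniformly over a clip.

    Hypothetical helper for illustration only; it is not part of LanguageBind.
    The clip is split into `num_frames` equal segments and the frame at the
    center of each segment is chosen. Short clips yield repeated indices.
    """
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    segment = total_frames / num_frames
    # Clamp to the last valid index to guard against rounding at the end.
    return [min(total_frames - 1, int((i + 0.5) * segment))
            for i in range(num_frames)]


if __name__ == "__main__":
    # A 16-frame clip reduced to 8 indices.
    print(sample_frame_indices(16))
    # A 4-frame clip padded to 8 indices by repeating frames.
    print(sample_frame_indices(4))
```

The selected indices would then be used to pick frames from the decoded video before passing them to the model's usual 8-frame preprocessing. Whether this resampling affects feature quality is untested here.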