zeroshot video-retrieval

OpenGVLab / InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Apache License 2.0

1.31k stars, 85 forks

zeroshot video-retrieval #88

1240446371 closed 5 days ago

1240446371 commented 6 months ago

Thank you for your work! I have a question about the zero-shot video-retrieval task on the ActivityNet dataset: which pretrained model should I use to reproduce the reported performance? Is it CLIP ViT-L-14.pt? Thank you for your response!

shepnerd commented 5 days ago

Apologies for the delayed response. In InternVideo1, we use CLIP-ViT for pretraining, whereas in InternVideo2, we train the vision model from scratch.