What are the differences between ViCLIP, InternVideo, and InternVideo2?

OpenGVLab / InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Apache License 2.0

1.44k stars, 88 forks

Hi, thank you for your interest in our InternVideo series. InternVideo was first open-sourced in late 2022 as a video foundation model that explores generative and discriminative learning for video representation. In 2023 we collected a large-scale video-text dataset, InternVid, and trained ViCLIP, a video-text contrastive learning model, on it. In early 2024 we released InternVideo2, which focuses more on multimodal video representation learning; by using more data and more modalities, it further improves performance.

Issue #170