OpenGVLab / Ask-Anything

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
MIT License

🦜 Ask-Anything [Paper]

[VideoChat-7B-8Bit] End2End ChatBOT for video and image.

[VideoChat2-7B] End2End ChatBOT for video and image.

Chinese README and Chinese discussion group | Paper

🚀: We have updated video_chat with instruction tuning for video and image chatting. Find its details here. We release the instruction data at InternVideo. The old version of video_chat has moved to video_chat_with_chatGPT.

⭐️: We are also working on an updated version, stay tuned!

:clapper: [End2End ChatBot]

:movie_camera: [Communication with ChatGPT]

:fire: Updates

🔨 Getting Started

Build video chat with:

:page_facing_up: Citation

If you find this project useful in your research, please consider citing:

  title={VideoChat: Chat-Centric Video Understanding},
  author={Li, Kunchang and He, Yinan and Wang, Yi and Li, Yizhuo and Wang, Wenhai and Luo, Ping and Wang, Yali and Wang, Limin and Qiao, Yu},
  journal={arXiv preprint arXiv:2305.06355},

:hourglass_flowing_sand: Ongoing

Our team constantly studies general video understanding and long-term video reasoning:

🌀️ Discussion Group

If you have any questions about trying, running, or deploying the project, or any ideas or suggestions for it, feel free to join our WeChat discussion group!


We are hiring researchers, engineers, and interns in the General Vision Group, Shanghai AI Lab. If you are interested in working with us, please contact Yi Wang (