Open PromptExpert opened 4 weeks ago
Thanks for your attention.
The title you mentioned above refers to the first version of VideoLLaMA while this repo is for the second version (i.e., VideoLLaMA 2). We are still drafting the technical report for VideoLLaMA 2 and it should be available by the end of this week.
Looking forward to seeing the VideoLLaMA 2 report/paper! I'm wondering how it differs from VideoLLaMA.
@MoonBlvd Glad to hear this! The main differences from VideoLLaMA 1 are improved architectural designs, stronger models (which will definitely be open-sourced), and a much more user-friendly codebase for training and evaluating VideoLLMs, which I think is the most beneficial part for the community. Please stay tuned :grin:
Link text: VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
Actual title of the paper: Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding