rese1f / MovieChat

[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
https://rese1f.github.io/MovieChat/
BSD 3-Clause "New" or "Revised" License

Support Video Input with Different Resolution #83

Open oximi123 opened 1 month ago

oximi123 commented 1 month ago

Hi, does MovieChat only support 224x224 video input? Is there a way to input videos at other resolutions, such as 1080p or 720p, without resizing the video frames?

Espere-1119-Song commented 1 month ago

Hi, thank you for your question!

Currently, the base model of MovieChat (which utilizes VideoLLaMA) supports only 224×224 video input. However, we are actively working on an adaptation of MovieChat using the LLaVA-OneVision framework, which will support higher video resolutions such as 720p and 1080p.

The updated code will be released in the coming weeks.
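
In the meantime, higher-resolution frames have to be brought down to 224×224 before they reach the model. As a rough illustration of what that involves, here is a minimal sketch of a common approach: resize so the shorter side is 224, then center-crop. This is an assumption for illustration only and is not taken from the MovieChat code; the function name `resize_and_crop_plan` is hypothetical, and the actual preprocessing in the repository may differ.

```python
def resize_and_crop_plan(width, height, target=224):
    """Plan a shorter-side resize followed by a center crop to target x target.

    Hypothetical helper for illustration; not part of MovieChat.
    Returns (scaled_w, scaled_h, left, top), where (left, top) is the
    top-left corner of the crop window inside the resized frame.
    """
    if width <= 0 or height <= 0 or target <= 0:
        raise ValueError("width, height, and target must be positive")

    # Scale so that the shorter side becomes exactly `target` pixels.
    shorter = min(width, height)
    scaled_w = max(target, round(width * target / shorter))
    scaled_h = max(target, round(height * target / shorter))

    # Center the crop window in the resized frame.
    left = (scaled_w - target) // 2
    top = (scaled_h - target) // 2
    return scaled_w, scaled_h, left, top


if __name__ == "__main__":
    for name, (w, h) in {"1080p": (1920, 1080), "720p": (1280, 720)}.items():
        print(name, resize_and_crop_plan(w, h))
```

Under this scheme both 1080p and 720p frames end up at the same 224×224 size and lose the left and right edges of the picture, which is why native high-resolution support needs a different backbone rather than a preprocessing tweak.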