mbzuai-oryx / Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
https://mbzuai-oryx.github.io/Video-ChatGPT
Creative Commons Attribution 4.0 International

select frames #125

Open Arbor334 opened 3 months ago

Arbor334 commented 3 months ago

Great project. I looked at the frame selection code you wrote, which samples frames at even intervals across the video. However, some of the questions and answers in the JSON file concern particular frames, and those do not necessarily correspond to the 100 sampled frames. What problems could this cause, or is it reasonable to do it this way? 😄
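To make the concern concrete, here is a minimal sketch of evenly spaced frame selection. It is an illustration only, not the repository's actual code: the function names `uniform_frame_indices` and `nearest_sampled_gap` are hypothetical, and picking the midpoint of each segment is an assumption about how the "average selection" works.

```python
from typing import List


def uniform_frame_indices(total_frames: int, num_samples: int = 100) -> List[int]:
    """Return evenly spaced frame indices (hypothetical helper).

    The video is split into `num_samples` equal segments and the
    midpoint of each segment is taken as the sampled frame.
    """
    if total_frames <= 0 or num_samples <= 0:
        raise ValueError("total_frames and num_samples must be positive")
    # A short video cannot provide more distinct frames than it has.
    num_samples = min(num_samples, total_frames)
    segment = total_frames / num_samples
    return [int(segment * i + segment / 2) for i in range(num_samples)]


def nearest_sampled_gap(frame_idx: int, sampled: List[int]) -> int:
    """Distance (in frames) from `frame_idx` to the closest sampled frame."""
    return min(abs(frame_idx - s) for s in sampled)


if __name__ == "__main__":
    sampled = uniform_frame_indices(total_frames=1000, num_samples=100)
    # A question about frame 500 is answered from the nearest sampled frame.
    print(len(sampled), nearest_sampled_gap(500, sampled))
```

With sampling like this, a question about a frame that was not selected can only be answered from its nearest sampled neighbours, so the gap reported by `nearest_sampled_gap` gives a rough sense of how much temporal detail may be lost.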