mbzuai-oryx / Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
https://mbzuai-oryx.github.io/Video-ChatGPT
Creative Commons Attribution 4.0 International

What is the size of the model compared in the experiment? #52

jpthu17 closed this issue 1 year ago

jpthu17 commented 1 year ago

Thank you for sharing the good work!

I'm very interested in your work, but I have some questions about the experimental details. In the following table, is the Video Chat model the 13B or the 7B version? Likewise, is Video LLaMA the 13B or the 7B version? I look forward to your reply.

image

mmaaz60 commented 1 year ago

Hi @jpthu17,

Thank you for your interest in our work. To ensure a fair comparison, we use the 7B version of every method in our benchmark.

jpthu17 commented 1 year ago

Thank you for your prompt response. I appreciate your help.