dvlab-research / LLaMA-VID

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)
Apache License 2.0

Cannot reproduce VideoChatGPT generative performance results #101

Open xsgldhy opened 2 months ago

xsgldhy commented 2 months ago

Thank you for your contribution!

Hello, I'm trying to reproduce the generative-performance scores in the VideoChatGPT evaluation for the EVA-G & LLaVA1.5-VideoChatGPT-Instruct 7B model. I downloaded your codebase; aside from adjusting the video path, I also renamed your LlavaConfig's model type from "llava" to "llama_vid", because the former conflicts with my transformers package (version 4.41.2). Everything else remains unchanged. [screenshots of the modified code attached]

The reproduced results are shown below: [screenshot of scores attached]
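As a side note on the rename above: transformers already registers its own LLaVA config under the model type "llava", so a second config class using the same key collides at registration time, and re-keying it as "llama_vid" avoids the clash. A minimal sketch of that mechanism, using a hypothetical toy registry standing in for transformers' AutoConfig mapping (the class names here are illustrative, not the actual library internals):

```python
class ConfigRegistry:
    """Toy stand-in for transformers' AutoConfig model_type mapping."""
    def __init__(self):
        self._mapping = {}

    def register(self, model_type, config_cls):
        # transformers raises a similar error when a model_type is already taken
        if model_type in self._mapping:
            raise ValueError(f"'{model_type}' is already used by a registered config")
        self._mapping[model_type] = config_cls


class BuiltinLlavaConfig:
    """Stand-in for the LLaVA config that ships with transformers 4.41.2."""
    model_type = "llava"


class LlamaVidConfig:
    """Stand-in for LLaMA-VID's LlavaConfig, re-keyed to avoid the clash."""
    model_type = "llama_vid"  # renamed from "llava"


registry = ConfigRegistry()
registry.register(BuiltinLlavaConfig.model_type, BuiltinLlavaConfig)

# Registering LLaMA-VID's config under the original "llava" key collides:
try:
    registry.register("llava", LlamaVidConfig)
except ValueError as e:
    print(e)

# The renamed key registers cleanly alongside the built-in one:
registry.register(LlamaVidConfig.model_type, LlamaVidConfig)
```

In the real codebase the analogous step is registering the custom config under the new key with `AutoConfig.register("llama_vid", ...)` before loading the checkpoint.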

EchoDreamer commented 1 month ago


Hi, I'm currently trying to reproduce the results from the LLaMA-VID paper, but I'm having some difficulty because I don't have access to the WebVid dataset. Could you guide me on how to download or access the WebVid dataset? I'd really appreciate any help you could offer. Thank you so much in advance!