mbzuai-oryx / Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
https://mbzuai-oryx.github.io/Video-ChatGPT
Creative Commons Attribution 4.0 International
1.05k stars, 92 forks

Can you provide the original video captions used to generate the instruction QA? #75

Open valencebond opened 6 months ago

valencebond commented 6 months ago

Thanks for your great work. I want to train a dense captioning model with an LLM. Is there a way I can get the original dense video captions?

ninatu commented 4 months ago

I would also be interested in the original dense video captions.

mmaaz60 commented 2 weeks ago

Hi @valencebond @ninatu

I appreciate your interest in our work. We recently released VideoGPT+ along with an improved semi-automatic video annotation pipeline. All the scripts needed to run the pipeline have also been released.

Please check it out on GitHub and HuggingFace.

Further, the VCG+-112K dataset has also been released, with separate JSON files for the semi-automatically annotated and human-annotated videos. Check it out on HuggingFace.

ninatu commented 1 week ago

Hi @mmaaz60, super! Thanks so much for the update!