mbzuai-oryx / Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
https://mbzuai-oryx.github.io/Video-ChatGPT
Creative Commons Attribution 4.0 International

Question about VideoInstruct_Dataset.json #69

DarkLighter97 closed this issue 5 months ago

DarkLighter97 commented 7 months ago

I found many strings like "\nQA_GT_caption_based_noisy" in the JSON file. What are they? Are they just noisy text that hasn't been cleaned out?

"q": "Can you briefly describe the content of the video for me?", "a": "Of course! The video features various shots of a city followed by a girl playing the violin. Another girl steps in and starts singing, which leads to several others joining in as well. A mime walks in with fireworks while the group still sings, and many around are seen laughing and smiling. Overall, the video features individuals playing the violin in different scenes and locations, making it quite enjoyable to watch.\nQA_GT_caption_based_noisy", "video_id": "v_GqWH2IYPw6U"

mmaaz60 commented 5 months ago

Hi @DarkLighter97,

This is unfiltered noise from GPT-3.5, and it can be cleaned out. Thanks.
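
For anyone who wants to do that cleanup locally, here is a minimal sketch rather than an official script from the repository. It assumes the file is a JSON list of objects with the "q", "a" and "video_id" keys shown in the excerpt above, and that the marker only ever appears as a trailing suffix of "a". The names `NOISE_MARKERS`, `strip_noise`, `clean_records` and `clean_file` are hypothetical, and "QA_GT_caption_based_noisy" is the only marker confirmed in this thread, so the tuple may need extending if other tags turn up.

```python
import json

# Only this marker is confirmed by the thread; add others here if you find them.
NOISE_MARKERS = ("QA_GT_caption_based_noisy",)


def strip_noise(answer, markers=NOISE_MARKERS):
    """Drop a trailing noise marker and the whitespace before it.

    Answers that do not end with a known marker are returned unchanged.
    """
    stripped = answer.rstrip()
    for marker in markers:
        if stripped.endswith(marker):
            return stripped[: -len(marker)].rstrip()
    return answer


def clean_records(records, markers=NOISE_MARKERS):
    """Return copies of the records with the "a" field cleaned."""
    cleaned = []
    for record in records:
        new_record = dict(record)  # leave the input list untouched
        if isinstance(new_record.get("a"), str):
            new_record["a"] = strip_noise(new_record["a"], markers)
        cleaned.append(new_record)
    return cleaned


def clean_file(src_path, dst_path):
    """Read the dataset, clean every answer, and write a new file.

    Assumes the top-level JSON value is a list of QA objects.
    """
    with open(src_path, encoding="utf-8") as f:
        records = json.load(f)
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(clean_records(records), f, ensure_ascii=False, indent=2)
```

Calling `clean_file("VideoInstruct_Dataset.json", "VideoInstruct_Dataset_clean.json")` would write a cleaned copy next to the original; the output filename is only an example. Writing to a new file instead of overwriting makes it easy to diff the two and check that nothing other than the marker was removed.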