mbzuai-oryx / Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
https://mbzuai-oryx.github.io/Video-ChatGPT
Creative Commons Attribution 4.0 International
1.23k stars 108 forks

Question about the semi-automatic dataset creation process #105

Closed: ooza closed this issue 3 months ago

ooza commented 6 months ago

Hello, thanks a lot for making this amazing work available! I'm interested in the semi-automatic dataset creation process. Any useful details about this framework would be much appreciated. The script generate_instruction_qa_semi_automatic.py requires

Could you provide a running example of this script, e.g. python generate_instruction_qa_semi_automatic.py --gt_caption_file ... --pred_dir ... ?

mmaaz60 commented 5 months ago

Hi @ooza,

I appreciate your interest in our work. We recently released VideoGPT+ together with an improved semi-automatic video annotation pipeline. All the scripts needed to run the pipeline have also been released.

Please check it out on GitHub and HuggingFace.

Please let me know if you have any questions. Good luck!