mbzuai-oryx / Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
https://mbzuai-oryx.github.io/Video-ChatGPT
Creative Commons Attribution 4.0 International

How to load the tuned backbone? #108

Open tqace opened 1 month ago

tqace commented 1 month ago

I trained the model following the Training instructions, and the backbone was fine-tuned as well. How do I save and load the fine-tuned backbone for further evaluation?

mmaaz60 commented 2 weeks ago

Hi @tqace,

I appreciate your interest in our work. If you are fine-tuning the backbone or the LLM as well, you may have to uncomment the line at https://github.com/mbzuai-oryx/Video-ChatGPT/blob/11f10e2e3cae488803f153e4f5a93c6f4a33666f/video_chatgpt/train/llava_trainer.py#L48 so that the complete checkpoints are saved.

Please let me know if it works for you. Thank you.
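
Once a complete checkpoint has been saved, one possible way to isolate the backbone weights for evaluation is to filter the checkpoint's state dict by key prefix. The sketch below is only an illustration, not the repository's own method: the helper `extract_submodule_state`, the prefix `vision_tower.`, and the key names in the demo are hypothetical, and plain lists stand in for tensors so the snippet runs without any dependencies. The real key names should be checked by printing the keys of the saved checkpoint.

```python
from typing import Any, Dict, Mapping


def extract_submodule_state(full_state: Mapping[str, Any], prefix: str) -> Dict[str, Any]:
    """Return the entries whose key starts with `prefix`, with the prefix stripped.

    `full_state` can be any mapping from parameter names to values,
    for example a state dict loaded from a full checkpoint.
    """
    return {
        key[len(prefix):]: value
        for key, value in full_state.items()
        if key.startswith(prefix)
    }


# Hypothetical key names; plain lists stand in for weight tensors.
full_state = {
    "vision_tower.layer0.weight": [0.1, 0.2],
    "vision_tower.layer0.bias": [0.0],
    "mm_projector.weight": [1.0],
}

# "vision_tower." is an assumed prefix, not confirmed by the repository.
backbone_state = extract_submodule_state(full_state, "vision_tower.")
print(sorted(backbone_state))
```

With PyTorch, the input mapping would typically come from `torch.load(path, map_location="cpu")` on the saved checkpoint file, and the filtered result could then be passed to the backbone module's `load_state_dict`. If the checkpoint is sharded or stored in another format, the loading step would differ.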