DAMO-NLP-SG / VideoLLaMA2

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
Apache License 2.0

⭐ [Feat] Support finetuning from previously saved checkpoint #13

Closed: JingzheShi closed this 3 months ago

JingzheShi commented 3 months ago

Dear authors:

Thank you for sharing this amazing work! I believe it will be an important base model for a lot of future work involving Video Language Models.

I'm currently trying to finetune on my own very small dataset, so I would like to continue finetuning from the chat model you provided. As far as I can tell, however, the current code only supports finetuning from the base model.

I made a small adjustment to the code to support this case. If you don't think this is good practice, or you have a better way to achieve it, please treat this pull request as a small issue instead.

clownrat6 commented 3 months ago

Thanks for your code contribution. 🙏🙏