DAMO-NLP-SG / Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
BSD 3-Clause "New" or "Revised" License

Is the LLM kept frozen in both stages? #160

Open Nastu-Ho opened 4 months ago

lixin4ever commented 3 months ago

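Yes. To be precise, only the Q-Former is learnable; all other VideoLLaMA components are frozen in both the pretraining and SFT stages. In VideoLLaMA 2, however, we changed the training strategy, and the LLM is updated during the SFT stage.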
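
As a rough illustration of the freezing scheme described above, here is a minimal sketch, not the repository's actual training code. The component names (`visual_encoder`, `qformer`, `llm`), the `set_trainable` helper, and the `VIDEO_LLAMA_TRAINABLE` table are hypothetical and chosen only for this example. The helper relies solely on a `parameters()` method whose items expose a `requires_grad` attribute, which is the interface `torch.nn.Module` provides, so it should work on PyTorch modules as well.

```python
from typing import Dict, Set


def set_trainable(components: Dict[str, object], trainable: Set[str]) -> Dict[str, bool]:
    """Freeze every component except those named in `trainable`.

    Each component only needs a `parameters()` method whose items expose a
    `requires_grad` attribute (the same interface as torch.nn.Module).
    Returns a mapping from component name to whether it is trainable.
    """
    unknown = set(trainable).difference(components)
    if unknown:
        raise KeyError(f"unknown component(s): {sorted(unknown)}")

    status = {}
    for name, module in components.items():
        flag = name in trainable
        for param in module.parameters():
            param.requires_grad = flag  # False means no gradient updates
        status[name] = flag
    return status


# Hypothetical table for the original Video-LLaMA, following the answer above:
# only the Q-Former is learnable, in both pretraining and SFT.
VIDEO_LLAMA_TRAINABLE = {
    "pretrain": {"qformer"},
    "sft": {"qformer"},
}
```

With real modules, one would call something like `set_trainable({"visual_encoder": vit, "qformer": qformer, "llm": llm}, VIDEO_LLAMA_TRAINABLE["sft"])` before building the optimizer. For VideoLLaMA 2, the answer above only states that the LLM is updated during SFT, so its table would at least add `"llm"` to the `"sft"` entry; which other components are trainable there is not stated in this thread.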