Sorry for the slightly late reply. The customized LoRA parameters for pre-training and fine-tuning are in finetune.py, line 317.
Figure 1(a) is just a schematic. We do not actually introduce a new architecture; all modules on the right side together represent the LVLM (i.e., Qwen-VL).
finetune.py applies LoRA fine-tuning to the parameters whose names contain the patterns listed in line 317. You may print target_modules at line 327 to check which modules are updated in the ViT, the adapter, and the LLM.
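For reference, here is a minimal sketch (not the repo's exact finetune.py) of the same idea using HuggingFace PEFT: collect the module names that contain some configured key patterns, pass them as `target_modules`, and print the trainable parameters to see what LoRA actually updates across the ViT, adapter, and LLM. The key patterns and LoRA hyper-parameters below are illustrative assumptions, not necessarily the values used in the paper.

```python
# Minimal sketch, assuming a HuggingFace PEFT LoraConfig over Qwen-VL.
# The key patterns and LoRA hyper-parameters here are illustrative assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-VL-Chat", trust_remote_code=True
)

# Collect modules whose names contain the configured patterns
# (analogous to the name matching around finetune.py line 317).
keys = ("attn.c_attn", "attn.c_proj", "mlp.w1", "mlp.w2")  # hypothetical patterns
target_modules = [
    name for name, _ in model.named_modules()
    if any(k in name for k in keys)
]
print(target_modules)  # analogous to printing target_modules at line 327

lora_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.05, target_modules=target_modules
)
model = get_peft_model(model, lora_config)

# List exactly which parameters receive gradients after wrapping with LoRA.
for name, param in model.named_parameters():
    if param.requires_grad:
        print(name)
```

Any printed parameter name belonging to the visual tower or the adapter would confirm that LoRA is applied outside the LLM as well.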
Thank you for the answer!
Just to double-check, are you saying that the target weights updated in pre-training and fine-tuning are the same?
Yes. Due to the difference between the downstream-task screens and the pre-training screens, we find that applying LoRA to the visual encoder as well gives slightly better performance when fine-tuning.
Thanks for your prompt reply:) It helped me a lot.
Thanks for sharing this good work :)
When dividing Qwen-VL into ViT, adapter, and LM, can you clarify which weights are updated during pre-training and fine-tuning?
Also, a question for confirmation: in Figure 1(a) of the paper, the ViT and VL Adapter are not included in the LVLM (yellow box). I think the yellow box is the LM, but is it the LVLM?