xiaoachen98 / Open-LLaVA-NeXT

An open-source implementation for training LLaVA-NeXT.

[Question] About finetuning projector #2

Closed · JY-CCK closed this issue 3 months ago

JY-CCK commented 3 months ago

Hello. First of all, thanks for providing the LLaVA-NeXT training code.

I have a question. In the README, you recommend finetuning the entire model, and judging from the print log, train.py also trains the entire model:

if training_args.unfreeze_mm_vision_tower:
    lr_of_vit = training_args.mm_vision_tower_lr if training_args.mm_vision_tower_lr is not None else training_args.learning_rate
    lr_of_mlp = training_args.mm_projector_lr if training_args.mm_projector_lr is not None else training_args.learning_rate
    training_args.mm_projector_lr = lr_of_mlp
    unfreeze_vit(vision_tower)
    rank0_print(
        f'Tune the entire model! The LR of ViT is {lr_of_vit}. The LR of MLP is {lr_of_mlp}. The LR of LLM is {training_args.learning_rate}')

But in your scripts, specifically finetune.sh, there is no 'tune_mm_mlp_adapter True'.

What is the right way to finetune LLaVA models?

Thanks!

xiaoachen98 commented 3 months ago

You don't need to set the tune_mm_mlp_adapter argument at the fine-tune stage; the default behavior already makes both the projector and the LLM trainable. You can step through the code in a debugger to verify this.
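
If you prefer to check this without a debugger, here is a minimal sketch. It is not code from the repository: summarize_trainable is a hypothetical helper, and the substring matching assumes LLaVA-style parameter names such as mm_projector and vision_tower. It simply counts how many parameters in each coarse block have requires_grad set on an already-constructed model.

def summarize_trainable(model):
    """Report trainable vs. total parameter counts per coarse block."""
    from collections import defaultdict

    stats = defaultdict(lambda: [0, 0])  # block name -> [trainable, total]
    for name, param in model.named_parameters():
        if "mm_projector" in name:
            block = "mm_projector"
        elif "vision_tower" in name:
            block = "vision_tower"
        else:
            block = "llm"
        stats[block][1] += param.numel()
        if param.requires_grad:
            stats[block][0] += param.numel()

    for block, (trainable, total) in stats.items():
        print(f"{block}: {trainable:,} / {total:,} parameters trainable")

# Hypothetical usage: call this right after the model is built in train.py.
# summarize_trainable(model)

At the fine-tune stage you would expect the mm_projector and llm counts to be fully trainable, and the vision_tower count to be trainable only when unfreeze_mm_vision_tower is enabled.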