Closed JY-CCK closed 6 months ago
Hello. First of all, thanks for providing the LLaVA-NeXT training code.
I have a question. In the README, you recommend fine-tuning the entire model, and in train.py the print log indicates that the entire model is trained:
```python
if training_args.unfreeze_mm_vision_tower:
    lr_of_vit = training_args.mm_vision_tower_lr if training_args.mm_vision_tower_lr is not None else training_args.learning_rate
    lr_of_mlp = training_args.mm_projector_lr if training_args.mm_projector_lr is not None else training_args.learning_rate
    training_args.mm_projector_lr = lr_of_mlp
    unfreeze_vit(vision_tower)
    rank0_print(
        f'Tune the entire model! The LR of ViT is {lr_of_vit}. The LR of MLP is {lr_of_mlp}. The LR of LLM is {training_args.learning_rate}')
```
But in your scripts, specifically finetune.sh, there is no `--tune_mm_mlp_adapter True`.
What is the right way to fine-tune LLaVA models?
Thanks!
You don't need to set the `tune_mm_mlp_adapter` argument at the fine-tune stage; the default behavior already makes both the projector and the LLM trainable. You can verify this by debugging.
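One quick way to verify which parts of the model are trainable is to inspect `requires_grad` on the parameters after the model is set up. The sketch below uses a toy module whose submodule names (`vision_tower`, `mm_projector`, `llm`) mirror LLaVA's layout but are assumptions for illustration only:

```python
import torch.nn as nn

# Toy stand-in for the real model; the submodule names mirror LLaVA's
# conventions but are hypothetical here.
class ToyLlava(nn.Module):
    def __init__(self):
        super().__init__()
        self.vision_tower = nn.Linear(8, 8)
        self.mm_projector = nn.Linear(8, 8)
        self.llm = nn.Linear(8, 8)

def trainable_modules(model):
    """Return the top-level submodules that have any trainable parameters."""
    return {name.split('.')[0]
            for name, p in model.named_parameters() if p.requires_grad}

model = ToyLlava()
# Default fine-tune behavior: freeze the ViT, keep projector and LLM trainable.
for p in model.vision_tower.parameters():
    p.requires_grad = False

print(sorted(trainable_modules(model)))  # ['llm', 'mm_projector']
```

You can drop a similar check (or a debugger breakpoint) into train.py right after the freezing logic runs to confirm what will actually receive gradients.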