dvlab-research / MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Apache License 2.0
3.18k stars 279 forks

how to use stage2 ckpt fine-tuning stage3? #102

Open linqinguang opened 4 months ago

linqinguang commented 4 months ago

First, I modified scripts/llama/train/stage_1_2_full_v7b_336_hr_768.sh, changing the parameter `--model_name_or_path` to the stage2 checkpoint "MGM-7B", and then obtained a LoRA model_path. Afterwards, I used scripts/merge_lora_weights.py to merge the base model and the LoRA weights, but found that it does not work. Compared to LLaVA, the mgm.model.builder.load_pretrained_model method seems to be missing several components: it does not load a PeftModel.
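For context, merging a LoRA adapter into a base model (what merge_lora_weights.py is expected to do after the PeftModel is loaded) boils down to adding the low-rank update to each targeted weight matrix. Here is a minimal numpy sketch of that math; the variable names (`W`, `A`, `B`, `alpha`, `r`) are illustrative and not taken from the MGM codebase:

```python
import numpy as np

# Minimal sketch of the LoRA merge: W_merged = W + (alpha / r) * B @ A.
# All names here are illustrative assumptions, not MGM's actual code.
rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 16
W = rng.normal(size=(d, d))        # frozen base weight
A = rng.normal(size=(r, d))        # LoRA down-projection
B = np.zeros((d, r))               # LoRA up-projection (zero-initialized, as in LoRA)
scaling = alpha / r
W_merged = W + scaling * (B @ A)   # fold the adapter into the base weight
print(np.allclose(W_merged, W))    # True here, since B starts at zero
```

In PEFT this is what `PeftModel.merge_and_unload()` performs for every targeted module, which is why the builder must first wrap the base model in a PeftModel before merging can work.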

yanwei-li commented 4 months ago

Hi, please refer to issue #49 for continual fine-tuning. All our models are fully fine-tuned; I have not actually tried LoRA. This could be checked and supported soon.