Closed: YFCYFC closed this issue 6 months ago
Hi: I use LoRA to finetune my own models, but the results drop dramatically compared with the full-finetuning counterparts. Here is the result table:
| finetune-type | MME | GQA | MMBench | MM-Vet | POPE | SQA-image | TextVQA |
| --- | --- | --- | --- | --- | --- | --- | --- |
| full | 1285.89 | 59.55 | 53.0 | 22.6 | 86.83 | 58.85 | 48.24 |
| lora | 1070.58 | 51.47 | 28.7 | – | 82.73 | 50.02 | 39.19 |

and here is my LoRA finetune script:
```bash
LLM_VERSION="/mnt/pfs-guan-ssai/cv/ssai_mm_pt/models/TinyLlama-1.1B-Chat-v1.0"
VT_VERSION="/mnt/pfs-guan-ssai/cv/ssai_mm_pt/models/TinyLLaVA-1.5B-SigLIP"
DATA_PATH=/mnt/pfs-guan-ssai/cv/yangfucai/code/internlm/data/llava_data/LLaVA-Instruct-150K/llava_v1_5_mix665k.json
IMAGE_PATH=/mnt/pfs-guan-ssai/cv/yangfucai/code/internlm/data/llava_data/llava_images
VT_VARIANT="${VT_VERSION#*/}"
LLM_VARIANT="${LLM_VERSION#*/}"

deepspeed --include localhost:7 \
    tinyllava/train/train.py \
    --deepspeed ./scripts/tiny_llava/zero3.json \
    --model_name_or_path $LLM_VERSION \
    --lora_enable True --lora_r 32 --lora_alpha 64 \
    --version v1 \
    --data_path $DATA_PATH \
    --image_folder $IMAGE_PATH \
    --vision_tower $VT_VERSION \
    --tune_entire_model False \
    --tune_vit_from_layer 12 \
    --mm_projector_type LDPv2 \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --image_aspect_ratio pad \
    --group_by_modality_length True \
    --fp16 True \
    --pretrain_mm_mlp_adapter ./checkpoints/tiny-llava-base-${LLM_VARIANT}-${VT_VARIANT}-ldpv2-llm-proj-pretrain/mm_projector.bin \
    --output_dir ./checkpoints/tiny-llava-base-${LLM_VARIANT}-${VT_VARIANT}-finetune-llm-ldpv2-lora \
    --num_train_epochs 2 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 50000 \
    --save_total_limit 1 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 False \
    --model_max_length 3072 \
    --gradient_checkpointing True \
    --dataloader_num_workers 32 \
    --lazy_preprocess True \
    --report_to none \
    --run_name tiny-llava-base-finetune-${LLM_VARIANT}-${VT_VARIANT}
```
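For context, here is a minimal sketch of how `--lora_enable`, `--lora_r`, and `--lora_alpha` flags like these typically map onto a PEFT `LoraConfig` in LLaVA-style training code. This is not the exact logic of `tinyllava/train/train.py`; the target modules and dropout below are assumptions for illustration (the repo usually discovers the target linear layers at runtime):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the base LLM (public TinyLlama checkpoint used here for illustration).
model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# Assumed translation of the CLI flags above into a PEFT config.
lora_config = LoraConfig(
    r=32,                     # --lora_r
    lora_alpha=64,            # --lora_alpha
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # placeholder list
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # sanity-check how few weights are trainable
```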
I want to know where the errors occur.
I also noticed that `--model_name_or_path` on line 11 of scripts/tiny_llava/finetune/finetune_lora.sh is set to an already finetuned model, am I right? But why would we do this if we already have a finetuned model?
Hi, for those who encounter this problem: I have to clarify that the LoRA settings (lora_r and lora_alpha) are vital. I tried lora_r values of 128, 256, and 512 (with lora_alpha set to 2 × lora_r, i.e. 256, 512, and 1024, respectively), and the results are much more convincing. But 128/256/512 is obviously quite large for TinyLlama, so I am confused about the "low-rank" property of the linear layers in LLMs. Feel free to reopen this issue if you meet any relevant problems or have new insights.
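To make the rank discussion concrete, here is a back-of-the-envelope sketch under my own assumptions (hidden size 2048, 22 decoder layers, LoRA on the four attention projections only, every projection treated as 2048 × 2048, which ignores TinyLlama's grouped-query KV heads). Since lora_alpha = 2 × lora_r in every setting tried above, the scaling factor alpha/r stays constant at 2.0; what actually grows with r is the adapter capacity:

```python
# Rough trainable-parameter count for LoRA at different ranks.
# Assumptions (not taken from the repo): hidden = 2048, 22 layers,
# 4 attention projections per layer, each treated as 2048 x 2048.
hidden, layers, projs = 2048, 22, 4

for r, alpha in [(32, 64), (128, 256), (256, 512), (512, 1024)]:
    per_matrix = r * (hidden + hidden)   # A is r x d, B is d x r
    total = layers * projs * per_matrix
    print(f"r={r:4d}  alpha={alpha:4d}  scaling=alpha/r={alpha / r:.1f}  "
          f"trainable≈{total / 1e6:5.1f}M")
```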