Okay, thank you for your response, @baichuanzhou.
I tried fine-tuning (without LoRA) TinyLLaVA-1.5B on my custom dataset following the instructions in this script. However, while evaluating the fine-tuned model, I see that the saved model directory does not contain the `mm_projector.bin` weights, so I am unable to evaluate the fine-tuned model. Could you help me here, please?
@baichuanzhou @huangleiBuaa @eltociear @jiajunlong
Thanks!
Can I see your training script?
Sure,
```bash
### Extracted from https://github.com/DLCV-BUAA/TinyLLaVABench/blob/dev/scripts/tiny_llava/finetune/finetune_lora.sh

DATA_PATH="data/tvqa-instruct-cleaned-12k-sentence-answer.json"
IMAGE_PATH="data/images/"
OUTPUT_DIR="TinyLLaVA_logs/TinyLLaVA-1.5B-full_ft_TextCaps/"

deepspeed tinyllava/train/train.py \
    --deepspeed ./scripts/tiny_llava/zero3.json \
    --model_name_or_path bczhou/TinyLLaVA-1.5B \
    --version v1 \
    --data_path $DATA_PATH \
    --image_folder $IMAGE_PATH \
    --vision_tower bczhou/TinyLLaVA-1.5B-SigLIP \
    --mm_projector_type mlp2x_gelu \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --image_aspect_ratio pad \
    --group_by_modality_length False \
    --fp16 True \
    --output_dir $OUTPUT_DIR \
    --num_train_epochs 5 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 2 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 5000 \
    --save_total_limit 1 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 False \
    --model_max_length 3072 \
    --gradient_checkpointing True \
    --dataloader_num_workers 15 \
    --lazy_preprocess True \
    --report_to wandb
```
Let me know if there's anything I missed while fine-tuning. Thanks!
By setting `tune_mm_mlp_adapter` to `True`, you will tune only the MLP adapter, which will allow you to save `mm_projector.bin`. Otherwise, you will tune both the adapter and the LLM.
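For illustration, a minimal sketch of how that flag could be added to the fine-tuning command posted above (only `--tune_mm_mlp_adapter` is new; all other arguments are taken from that command, and the elided ones stay as they were):

```bash
# Sketch only: same command as the full fine-tune above, with the adapter flag added.
# With --tune_mm_mlp_adapter True the LLM weights stay frozen and only the MLP
# projector is trained, so mm_projector.bin gets written to the output directory.
deepspeed tinyllava/train/train.py \
    --deepspeed ./scripts/tiny_llava/zero3.json \
    --model_name_or_path bczhou/TinyLLaVA-1.5B \
    --version v1 \
    --data_path $DATA_PATH \
    --image_folder $IMAGE_PATH \
    --vision_tower bczhou/TinyLLaVA-1.5B-SigLIP \
    --tune_mm_mlp_adapter True \
    --mm_projector_type mlp2x_gelu \
    --fp16 True \
    --output_dir $OUTPUT_DIR
    # (remaining training arguments as in the full command above)
```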
Okay, understood @baichuanzhou .
For evaluation, I am using the following script, which expects the `mm_projector.bin` file. That is where I am getting the error.
My eval script:
```bash
#!/bin/bash

MODEL_PATH="/mnt/nasfolder/imt2018072/LLMs/TinyLLaVA_logs/TinyLLaVA-1.5B-full_ft_TextCaps/"
MODEL_NAME="TinyLLaVA-1.5B-full_ft_TextCaps"
EVAL_DIR="./playground/data/eval"

python -m tinyllava.eval.model_vqa_loader \
    --model-path $MODEL_PATH \
    --question-file $EVAL_DIR/textvqa/llava_textvqa_val_v051_ocr.jsonl \
    --image-folder $EVAL_DIR/textvqa/train_images \
    --answers-file $EVAL_DIR/textvqa/answers/$MODEL_NAME.jsonl \
    --temperature 0 \
    --model-base "bczhou/TinyLLaVA-1.5B" \
    --conv-mode v1

python -m tinyllava.eval.eval_textvqa \
    --annotation-file $EVAL_DIR/textvqa/TextVQA_0.5.1_val.json \
    --result-file $EVAL_DIR/textvqa/answers/$MODEL_NAME.jsonl
```
I think it is because I am passing the `--model-base` and `--conv-mode` arguments.
If you pass `--model-base` as an argument and your model name does not contain 'LoRA', the `load_pretrained_model` function in `builder.py` will look for `mm_projector.bin`. Since I assume you did not pass `--tune_mm_mlp_adapter` as an argument during fine-tuning, you do not need to pass `model_base` during evaluation.
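To make that branching concrete, here is a simplified, illustrative sketch of the decision just described. It is not the actual `builder.py` source; the loading steps are summarised as returned strings, and the name check is written with a lowercase comparison as an assumption:

```python
import os

def describe_loading_path(model_path, model_base, model_name):
    """Illustrative sketch of the branching described for load_pretrained_model."""
    if model_base is not None and "lora" in model_name.lower():
        # LoRA checkpoint: the base model is loaded first, then the LoRA weights applied.
        return "load base model from model_base, then merge LoRA weights from model_path"
    if model_base is not None:
        # model_base given but no 'LoRA' in the name: the checkpoint is treated as
        # projector-only, so mm_projector.bin is expected inside model_path.
        if not os.path.exists(os.path.join(model_path, "mm_projector.bin")):
            return "error: mm_projector.bin not found (the failure hit above)"
        return "load LLM from model_base plus mm_projector.bin from model_path"
    # No model_base: the checkpoint is loaded as a complete model, which is the
    # right path for a full fine-tune where the LLM itself was updated.
    return "load the full model directly from model_path"

# The run above was a full fine-tune, so dropping --model-base selects the last branch:
print(describe_loading_path(
    "TinyLLaVA_logs/TinyLLaVA-1.5B-full_ft_TextCaps/",
    None,
    "TinyLLaVA-1.5B-full_ft_TextCaps",
))
```

In other words, removing `--model-base` from the eval command above should let the fully fine-tuned checkpoint load on its own.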
Yes, got it, thank you for the clarification and the prompt responses! Closing this issue. @baichuanzhou
I want to LoRA fine-tune TinyLLaVA-1.5B on a custom text-VQA dataset. Could you please clarify which script should be used for fine-tuning TinyLLaVA-1.5B with LoRA? @baichuanzhou @huangleiBuaa @eltociear @jiajunlong
Thanks!
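For reference, the script referenced at the top of this thread, scripts/tiny_llava/finetune/finetune_lora.sh, is the LoRA counterpart of the full fine-tuning command posted earlier. A rough sketch of what its LoRA-specific flags typically look like (flag names and values assumed from LLaVA-style training scripts; defer to the script in the repo for the real ones):

```bash
# Sketch only: LoRA-specific flags as they typically appear in a LLaVA-style
# finetune_lora.sh; treat these names/values as assumptions and take the
# authoritative ones from scripts/tiny_llava/finetune/finetune_lora.sh.
deepspeed tinyllava/train/train.py \
    --lora_enable True --lora_r 128 --lora_alpha 256 --mm_projector_lr 2e-5 \
    --deepspeed ./scripts/tiny_llava/zero3.json \
    --model_name_or_path bczhou/TinyLLaVA-1.5B \
    --data_path $DATA_PATH \
    --image_folder $IMAGE_PATH \
    --vision_tower bczhou/TinyLLaVA-1.5B-SigLIP \
    --output_dir $OUTPUT_DIR
    # (remaining data/optimizer arguments as in the full fine-tuning command above)
```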