mbzuai-oryx / GeoChat

[CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing
https://mbzuai-oryx.github.io/GeoChat

How to run the LoRA fine-tuned model? #35

Open dongwhfdyer opened 2 months ago

dongwhfdyer commented 2 months ago

I followed the instructions for finetune_lora.sh and obtained the trained model.

Here is my finetune_lora.sh:

#!/bin/bash

################## VICUNA ##################
PROMPT_VERSION=v1
MODEL_VERSION="vicuna-v1.5-7b"
gpu_ids=0,1,2,3
################## VICUNA ##################

deepspeed --master_port=$((RANDOM + 10000)) --include localhost:$gpu_ids geochat/train/train_mem.py \
    --deepspeed ./scripts/zero2.json \
    --lora_enable True \
    --model_name_or_path pretrained_weights/llavav1.5-7b \
    --version $PROMPT_VERSION \
    --data_path ~/datasets/GeoChat_Instruct.json \
    --image_folder ~/datasets/GeoChat_finetuning/final_images_llava  \
    --vision_tower openai/clip-vit-large-patch14-336 \
    --mm_projector_type mlp2x_gelu \
    --pretrain_mm_mlp_adapter pretrained_weights/llava-v1.5-mlp2x-336px-pretrain-vicuna-7b-v1.5/mm_projector.bin \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --image_aspect_ratio pad \
    --bf16 True \
    --output_dir /nfs/geochat_output/checkpoints_dir \
    --num_train_epochs 1 \
    --per_device_train_batch_size 6 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --evaluation_strategy "no" \
    --save_strategy "epoch" \
    --save_steps 1000 \
    --save_total_limit 5 \
    --learning_rate 2e-4 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --lazy_preprocess True \
    --dataloader_num_workers 0 \
    --report_to wandb

Here is the saved LoRA fine-tuned model:

(base) ➜  checkpoints_dir tree
.
├── adapter_config.json
├── adapter_model.bin
├── checkpoint-3217
│   ├── adapter_config.json
│   ├── adapter_model.bin
│   ├── global_step3217
│   │   ├── bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt
│   │   ├── bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt
│   │   ├── bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt
│   │   ├── bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt
│   │   └── mp_rank_00_model_states.pt
│   ├── latest
│   ├── README.md
│   ├── rng_state_0.pth
│   ├── rng_state_1.pth
│   ├── rng_state_2.pth
│   ├── rng_state_3.pth
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   ├── tokenizer.model
│   ├── trainer_state.json
│   ├── training_args.bin
│   └── zero_to_fp32.py
├── config.json
├── non_lora_trainables.bin
├── README.md
└── trainer_state.json

I don't know how to load this model; I couldn't find instructions in the README.md. Can anyone help me? Thank you!

dongwhfdyer commented 1 month ago

Now I know how to do it. Look at the LLaVA project; you will find the two-stage weight-loading method there. If anyone still doesn't know, contact me.
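
A minimal sketch of that two-stage loading, assuming GeoChat mirrors LLaVA's builder API (load_pretrained_model in geochat/model/builder.py); the paths below are the ones from the script above, and the model name passed in is hypothetical:

from geochat.model.builder import load_pretrained_model

model_path = "/nfs/geochat_output/checkpoints_dir"   # LoRA output dir from finetune_lora.sh
model_base = "pretrained_weights/llavav1.5-7b"       # base model the LoRA was trained on

# Stage 1: the builder loads the full base weights from model_base.
# Stage 2: it applies adapter_model.bin and non_lora_trainables.bin from
# model_path, then merges the LoRA weights into the base model.
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,
    model_base=model_base,
    model_name="geochat-7b-lora",  # should contain "lora" to trigger the two-stage branch
)

LLaVA also ships scripts/merge_lora_weights.py, which uses the same builder to save a single merged checkpoint that can afterwards be loaded without model_base.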

lx709 commented 1 month ago

Thanks, @dongwhfdyer, I already figured it out.

kartikey9254 commented 2 weeks ago

> Now I know how to do it. Look at the LLaVA project; you will find the two-stage weight-loading method there. If anyone still doesn't know, contact me.

Hi there, I am trying out this model and the demo worked, but when I used the lora.sh script for training it fails with: OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory /home/LLaVA/llava-v1.5-13b-lora. Can you guide me on how to train this model?
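
For context, that OSError is raised by Hugging Face transformers when from_pretrained() finds no full weight file in the given directory; a LoRA output directory holds only adapter files, so it cannot be used directly as --model_name_or_path. A quick sketch of the distinction, with hypothetical paths:

import os

base_dir = "pretrained_weights/llavav1.5-7b"   # full checkpoint: config.json + pytorch_model*.bin shards
lora_dir = "/home/LLaVA/llava-v1.5-13b-lora"   # LoRA-only: adapter_config.json, adapter_model.bin, ...

for d in (base_dir, lora_dir):
    if os.path.isdir(d):
        # from_pretrained() needs one of pytorch_model.bin, tf_model.h5,
        # model.ckpt.index, or flax_model.msgpack (or sharded *.bin files).
        print(d, sorted(os.listdir(d)))

Pointing --model_name_or_path at the LoRA-only directory produces exactly this error; it should point at the full base checkpoint (as in the script above), with the LoRA directory used only in the two-stage loading described earlier.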

732259408 commented 1 week ago

@dongwhfdyer Hi, I have an issue with --pretrain_mm_mlp_adapter path/to/llava-v1.5-mlp2x-336px-pretrain-vicuna-7b-v1.5/mm_projector.bin in finetune_lora.sh. Does the mm_projector.bin file use weights from llava-v1.5-7b? I couldn't find mm_projector.bin in GeoChat-7B.