bdytx5 / finetune_LLaVA

[Usage] missing file - ./scripts/zero2.json #1

Open mrseanryan opened 6 months ago

mrseanryan commented 6 months ago

Describe the issue

The very nice article has this script:


# Set the prompt and model versions directly in the command
deepspeed /root/LLaVA/llava/train/ \
    --deepspeed /root/LLaVA/scripts/zero2.json \
    --lora_enable True \
    --lora_r 128 \
    --lora_alpha 256 \
    --mm_projector_lr 2e-5 \
    --bits 4 \
    --model_name_or_path /root/LLaVA/llava/llava-v1.5-7b \
    --version llava_llama_2 \
    --data_path /root/dataset/train/dataset.json \
    --validation_data_path /root/dataset/validation/dataset.json \
    --image_folder /root/dataset/images/ \
    --vision_tower openai/clip-vit-large-patch14-336 \
    --mm_projector_type mlp2x_gelu \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --image_aspect_ratio pad \
    --group_by_modality_length True \
    --bf16 True \
    --output_dir /root/LLaVA/llava/checkpoints/llama-2-7b-chat-task-qlora \
    --num_train_epochs 500 \
    --per_device_train_batch_size 32 \
    --per_device_eval_batch_size 32 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "epoch" \
    --save_strategy "steps" \
    --save_steps 50000 \
    --save_total_limit 1 \
    --learning_rate 2e-4 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --lazy_preprocess True \
    --report_to wandb

However, the file it refers to, /root/LLaVA/scripts/zero2.json, is not in this repo.

The file is probably this one:

Should it be at ./scripts/zero2.json?
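
Until the original file is located, a stand-in could be generated. The following is a minimal sketch of a ZeRO stage 2 DeepSpeed config, not the file from the article or from the upstream LLaVA repo. It assumes training goes through the Hugging Face Trainer, whose DeepSpeed integration resolves the "auto" values from the command-line arguments. The helper names build_zero2_config and write_zero2_config, and the choice of options, are illustrative assumptions.

```python
import json
from pathlib import Path


def build_zero2_config() -> dict:
    """Return a minimal ZeRO stage 2 config (illustrative, not the upstream file)."""
    return {
        # "auto" defers to the Hugging Face Trainer arguments
        # (--bf16, --fp16, batch sizes, gradient accumulation).
        "fp16": {"enabled": "auto"},
        "bf16": {"enabled": "auto"},
        "train_micro_batch_size_per_gpu": "auto",
        "train_batch_size": "auto",
        "gradient_accumulation_steps": "auto",
        "zero_optimization": {
            # Stage 2 partitions optimizer state and gradients across GPUs.
            "stage": 2,
            "overlap_comm": True,
            "contiguous_gradients": True,
        },
    }


def write_zero2_config(path: str = "scripts/zero2.json") -> Path:
    """Write the config as JSON, creating the parent directory if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(build_zero2_config(), indent=4) + "\n")
    return target


if __name__ == "__main__":
    print(f"wrote {write_zero2_config()}")
```

Running it from the repo root would create ./scripts/zero2.json, which the --deepspeed flag in the command above could then point to.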

mrseanryan commented 6 months ago

Related - inspired by your article on Weights and Biases, I put together this fork that tries to include all steps and scripts to fine-tune v1.5 of LLaVA:

mrseanryan commented 6 months ago

It could be interesting to see how to fine-tune v1.6 ...

anas-zafar commented 2 months ago

Hi @mrseanryan, I am having an issue when I try to run the merge_lora_weights script.

!python /content/LLaVA/scripts/ --model-path /content/drive/MyDrive/llava_output_final_v1/adapter_model.safetensors --model-base liuhaotian/llava-v1.5-7b --save-model-path /content/drive/MyDrive/llava_output_config/output/merged_model

Traceback (most recent call last):
  File "/content/LLaVA/scripts/", line 22, in <module>
    merge_lora(args)
  File "/content/LLaVA/scripts/", line 8, in merge_lora
    tokenizer, model, image_processor, context_len = load_pretrained_model(args.model_path, args.model_base, model_name, device_map='cpu')
  File "/content/LLaVA/llava/model/", line 128, in load_pretrained_model
    model = AutoModelForCausalLM.from_pretrained(model_base, low_cpu_mem_usage=True, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/", line 569, in from_pretrained
    raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.llava.configuration_llava.LlavaConfig'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig

Could you guide me on how to fix this, please? Thanks.
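
One thing that may be worth checking: --model-path points at the adapter_model.safetensors file rather than at the checkpoint directory. The sketch below assumes, without confirmation in this thread, that LLaVA's load_pretrained_model chooses its loading branch from substrings ("llava", "lora") in the last component of the model path and otherwise falls back to AutoModelForCausalLM, which would be consistent with the traceback. The helper names are hypothetical and only mimic that assumed dispatch. The third example path is made up for illustration.

```python
def model_name_from_path(model_path: str) -> str:
    """Assumed behaviour: the model name is the last path component."""
    parts = model_path.strip("/").split("/")
    if parts[-1].startswith("checkpoint-"):
        return parts[-2] + "_" + parts[-1]
    return parts[-1]


def assumed_loader_branch(model_path: str) -> str:
    """Mimic the assumed substring dispatch; this is not LLaVA's actual code."""
    name = model_name_from_path(model_path).lower()
    if "llava" not in name:
        return "plain AutoModelForCausalLM"
    if "lora" in name:
        return "LLaVA with LoRA weights"
    return "LLaVA without LoRA merge"


if __name__ == "__main__":
    paths = [
        # Path from the command in this thread (points at a file).
        "/content/drive/MyDrive/llava_output_final_v1/adapter_model.safetensors",
        # Parent directory of that file.
        "/content/drive/MyDrive/llava_output_final_v1",
        # Hypothetical directory name containing both "llava" and "lora".
        "/content/drive/MyDrive/llava-v1.5-7b-task-lora",
    ]
    for p in paths:
        print(model_name_from_path(p), "->", assumed_loader_branch(p))
    # adapter_model.safetensors -> plain AutoModelForCausalLM
    # llava_output_final_v1 -> LLaVA without LoRA merge
    # llava-v1.5-7b-task-lora -> LLaVA with LoRA weights
```

If that assumption holds, passing the output directory (ideally one whose name contains both "llava" and "lora") as --model-path, instead of the .safetensors file, would be the first thing to try.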