Use PEFT or full-parameter training to finetune 350+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
qwen2-vl-72b LoRA finetuning: is mixed training on text-only and image instruction data unsupported? #2198
Open
Luccadoremi opened 6 days ago
Training hangs when text-only instruction data and image instruction data are mixed in one dataset. Will this be supported later?
The training script is as follows:

```shell
SIZE_FACTOR=6 MAX_PIXELS=602112 \
NPROC_PER_NODE=6 CUDA_VISIBLE_DEVICES=0,3,4,5,6,7 \
swift sft \
  --model_type qwen2-vl-72b-instruct \
  --model_id_or_path ./hf/qwen2-vl-72b-instruct \
  --output_dir output/qwen2-vl-72b-instruct-$train_type \
  --sft_type lora \
  --custom_dataset_info mydata.json \
  --dataset $data \
  --max_length $maxl \
  --dtype fp16 \
  --batch_size 1 \
  --check_dataset_strategy warning \
  --use_flash_attn True \
  --truncation_strategy delete \
  --deepspeed default-zero3 \
```
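One plausible explanation for the hang (an assumption, not confirmed in this thread): under DeepSpeed ZeRO-3, ranks whose batch contains no images skip the vision tower, so the all-gather collectives for its sharded parameters never run on those ranks while other ranks wait on them, and training deadlocks. A common mitigation is to attach a tiny placeholder image to text-only samples so every rank takes the same forward path. The sketch below assumes the `query`/`response`/`images` sample layout and a hypothetical `dummy_white_1x1.png` file you would create yourself:

```python
def pad_text_only_samples(samples, dummy_image="dummy_white_1x1.png"):
    """Ensure every sample carries at least one image so all data-parallel
    ranks execute the vision tower and its ZeRO-3 all-gathers stay in
    lockstep. `dummy_image` is a hypothetical placeholder file path."""
    padded = []
    for sample in samples:
        sample = dict(sample)  # copy so the caller's data is not mutated
        if not sample.get("images"):  # text-only sample: no image list
            sample["images"] = [dummy_image]
        padded.append(sample)
    return padded

# Mixed dataset: one image sample, one text-only sample.
mixed = [
    {"query": "Describe the picture.", "response": "A cat.", "images": ["cat.jpg"]},
    {"query": "What is 2 + 2?", "response": "4"},
]
print(pad_text_only_samples(mixed)[1]["images"])  # → ['dummy_white_1x1.png']
```

You may also need to tune the prompt of padded samples (e.g. instruct the model to ignore the blank image) so the dummy input does not degrade the text-only responses.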