modelscope / ms-swift

Use PEFT or Full-parameter to finetune 350+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
https://swift.readthedocs.io/zh-cn/latest/Instruction/index.html
Apache License 2.0
3.8k stars 325 forks

qwen2-vl-72b LoRA fine-tuning: is mixed training on text-only and image instruction data not supported? #2198

Open Luccadoremi opened 6 days ago

Luccadoremi commented 6 days ago

Training hangs when text-only instruction data and image instruction data are mixed. Will this be supported in a future release?

The training script is as follows:

SIZE_FACTOR=6 MAX_PIXELS=602112 \
NPROC_PER_NODE=6 CUDA_VISIBLE_DEVICES=0,3,4,5,6,7 \
swift sft \
    --model_type qwen2-vl-72b-instruct \
    --model_id_or_path ./hf/qwen2-vl-72b-instruct \
    --output_dir output/qwen2-vl-72b-instruct-$train_type \
    --sft_type lora \
    --custom_dataset_info mydata.json \
    --dataset $data \
    --max_length $maxl \
    --dtype fp16 \
    --batch_size 1 \
    --check_dataset_strategy warning \
    --use_flash_attn True \
    --truncation_strategy delete \
    --deepspeed default-zero3 \
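
When reproducing a hang like this, it may help to first confirm how many samples of each kind the dataset actually contains. The snippet below is a minimal sketch, not part of the original report. It assumes the data is stored as JSON Lines and that image samples carry a non-empty `images` field. Both the field name and the example records are assumptions, since the dataset files themselves are not shown in this thread.

```python
import json
from collections import Counter


def count_sample_kinds(lines, image_key="images"):
    """Count text-only vs. image samples in an iterable of JSONL lines.

    `image_key` is an assumed field name; adjust it to match the real dataset.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Treat a sample as an image sample only if the key holds a non-empty value.
        kind = "image" if record.get(image_key) else "text_only"
        counts[kind] += 1
    return counts


# Hypothetical records, for illustration only.
example = [
    '{"query": "Describe the picture.", "response": "A cat.", "images": ["cat.jpg"]}',
    '{"query": "What is 2 + 2?", "response": "4"}',
]
print(count_sample_kinds(example))
```

To run it on a real file, pass an open file handle instead of the list, for example `count_sample_kinds(open("train.jsonl", encoding="utf-8"))`, where `train.jsonl` is a placeholder path.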

Jintao-Huang commented 4 days ago

please use main branch

Wu0409 commented 3 days ago

I ran into the same problem. Have you managed to solve it?

Wu0409 commented 3 days ago

> please use main branch

The problem still persists with version 2.4.2.