Intern-VL multi-image fine-tuning and inference

modelscope / swift

ms-swift: Use PEFT or Full-parameter to finetune 250+ LLMs or 35+ MLLMs. (Qwen2, GLM4, Internlm2, Yi, Llama3, Llava, MiniCPM-V, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

This is also how we handle multiple images: https://github.com/modelscope/swift/blob/main/swift/llm/utils/template.py#L1231-L1236

Thank you for your reply. I got this error when fine-tuning on a multi-image task. My data format looks like this: `images` maps to a list, and the list contains two images:

"images": [ "/mnt/data/code/banqun.sz/intern-vl/SFT/max0619_is_syn/cspuurl/https:ççimg.alicdn.comçimgextraçi4ç6000000006629çO1CN01Amf9Ro1yq8TjEGrT5!!6000000006629-0-alihealth_ic.jpg", "/mnt/data/code/banqun.sz/intern-vl/SFT/max0619_is_syn/skudetection/https:ççimg.alicdn.comçimgextraçi2ç2113790279çTB28plhX3JkpuFjSszcXXXfsFXa!!2113790279.jpg/split_0.jpeg" ],

Is this format acceptable, or do the multiple images need to be written inside a single string element?

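For anyone debugging a similar record, here is a rough Python sketch that checks the `images` field before training. It assumes that one path per list element is the intended layout, which this thread does not confirm. The helper name `check_images_field` and the placeholder paths are made up for illustration and are not part of ms-swift. Since the error message is not shown above, this only rules out the simple causes: a field that is not a list, an element that is not a string, or a path that does not resolve to a file.

```python
import json
import os


def check_images_field(record):
    """Return a list of problems found in the record's `images` field.

    Hypothetical helper, not an ms-swift API. Assumes `images` should be
    a list with one file path per element.
    """
    images = record.get("images")
    if not isinstance(images, list):
        return ["`images` should be a list, got " + type(images).__name__]
    problems = []
    for i, path in enumerate(images):
        if not isinstance(path, str):
            problems.append(f"images[{i}] is not a string")
        elif not os.path.isfile(path):
            problems.append(f"images[{i}] does not exist on disk: {path}")
    return problems


# Placeholder record; substitute one line of your own dataset file.
record = {"images": ["/path/to/first.jpg", "/path/to/second.jpg"]}
print(json.dumps(record, ensure_ascii=False))
for problem in check_images_field(record):
    print(problem)
```

An empty result only means the field is structurally sound and every file is reachable; it says nothing about how the template pairs the images with the prompt.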
Intern-VL multi-image fine-tuning and inference #1235