modelscope / ms-swift

Use PEFT or Full-parameter to finetune 350+ LLMs or 90+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
https://swift.readthedocs.io/zh-cn/latest/Instruction/index.html
Apache License 2.0
3.66k stars, 313 forks

KeyError: 'images' when training GLM4v #1372

Closed ZhiyuYUE closed 2 months ago

ZhiyuYUE commented 2 months ago

Describe the bug

I built the dataset and the run script by following the documentation. The run script is:

    NPROC_PER_NODE=4 \
    CUDA_VISIBLE_DEVICES=0,1,2,3 swift sft \
        --model_type glm4v-9b-chat \
        --model_id_or_path /mnt/model/glm/glm-4-9b-chat \
        --dataset /mnt/code/yzy/data/train_data.jsonl \
        --ddp_find_unused_parameters true \
        --output_dir /mnt/model/glm/glm4v_ft

My dataset is built like this:

    {"query": "xx", "response": "1.存在", "images": ["/mnt/code/yzy/data/train/40_1.jpg"]}
    {"query": "yy", "response": "1.存在", "images": ["/mnt/code/yzy/data/train/44_1.jpg"]}
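Before launching a run, it can help to confirm that every record carries the keys shown in the sample records and that each image path exists on disk. The following is a minimal sketch under those assumptions; `validate_jsonl` and `REQUIRED_KEYS` are hypothetical names for illustration and are not part of ms-swift.

```python
import json
import os

# Keys taken from the sample records in this issue (an assumption about the expected layout).
REQUIRED_KEYS = ("query", "response", "images")


def validate_jsonl(path, check_files=True):
    """Return a list of (line_number, problem) tuples for a JSONL dataset file."""
    problems = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((lineno, f"invalid JSON: {exc.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((lineno, "record is not a JSON object"))
                continue
            for key in REQUIRED_KEYS:
                if key not in record:
                    problems.append((lineno, f"missing key: {key}"))
            images = record.get("images", [])
            if not isinstance(images, list):
                problems.append((lineno, "'images' should be a list of paths"))
                continue
            if check_files:
                for image in images:
                    if not os.path.isfile(image):
                        problems.append((lineno, f"image not found: {image}"))
    return problems
```

Calling `validate_jsonl("/mnt/code/yzy/data/train_data.jsonl")` returns an empty list when every record passes these checks. A clean result only rules out malformed records; it says nothing about how the template encodes them.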

The following error is raised at run time:

    rank0: Traceback (most recent call last):
    rank0:   File "/mnt/code/yzy/swift/swift/cli/sft.py", line 5, in <module>
    rank0:   File "/mnt/code/yzy/swift/swift/utils/run_utils.py", line 27, in x_main
    rank0:     result = llm_x(args, **kwargs)
    rank0:   File "/mnt/code/yzy/swift/swift/llm/sft.py", line 251, in llm_sft
    rank0:     td0, tkwargs0 = template.encode(train_dataset[0])
    rank0:   File "/mnt/code/yzy/swift/swift/llm/utils/template.py", line 1049, in encode
    rank0:     inputs['images'] = inputs2['images']
    rank0:   File "/root/miniforge-pypy3/envs/swift/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 253, in __getitem__
    rank0:     return self.data[item]
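The last frame of the traceback fails inside a dict-backed lookup (`return self.data[item]`), which suggests that the tokenizer output simply had no `images` entry. The sketch below reproduces that failure shape with a stand-in class and shows one way to surface which keys were actually present. `EncodingStandIn` and `require_field` are hypothetical names for illustration; this is not the transformers class and not how ms-swift handles the case.

```python
class EncodingStandIn:
    """Stand-in for a tokenizer output that keeps its fields in `self.data`."""

    def __init__(self, data):
        self.data = dict(data)

    def __getitem__(self, item):
        # Same lookup shape as the last frame of the traceback.
        return self.data[item]


def require_field(encoding, key):
    """Fetch `key`, raising a KeyError that lists the keys that are available."""
    try:
        return encoding[key]
    except KeyError:
        available = sorted(encoding.data)
        raise KeyError(
            f"{key!r} missing from tokenizer output; available keys: {available}"
        ) from None


if __name__ == "__main__":
    # A text-only output: no 'images' entry, so the plain lookup would raise KeyError.
    enc = EncodingStandIn({"input_ids": [1, 2, 3], "attention_mask": [1, 1, 1]})
    try:
        require_field(enc, "images")
    except KeyError as exc:
        print(exc)
```

Listing the available keys makes it easier to tell whether the tokenizer in use produces image inputs at all.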

I get the same error when I use JSON format for the input.

How can I resolve this?

tastelikefeet commented 2 months ago

I can't reproduce this. Could you try again with the main branch?

ZhiyuYUE commented 2 months ago

OK, I'll try again.

tastelikefeet commented 2 months ago

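Were you able to reproduce it again or find a solution on your side? Other users have run into this problem, but I still can't reproduce it here.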

Jintao-Huang commented 2 months ago

Use the latest version of ms-swift (>=2.2.3) or the main branch, and update to the latest modeling_chatglm.py file for glm4v.
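To check whether an environment already meets the `>=2.2.3` requirement, something like the following sketch may be enough. It assumes the package is installed under the distribution name `ms-swift`; the helper names are hypothetical, and the comparison looks only at the leading numeric parts of the version, so pre-release suffixes are ignored.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn '2.2.3' or '2.3.0.dev0' into a tuple of its leading integer parts."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def version_at_least(installed, minimum):
    """Compare two version strings numerically, padding the shorter with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(dist_name="ms-swift"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_version()
    if current is None:
        print("ms-swift is not installed in this environment")
    elif version_at_least(current, "2.2.3"):
        print(f"ms-swift {current} meets the >=2.2.3 requirement")
    else:
        print(f"ms-swift {current} is older than 2.2.3; consider upgrading")
```

This check covers only the package version. The model's modeling_chatglm.py file still has to be updated separately, as the comment above says.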