modelscope / ms-swift

Use PEFT or Full-parameter to finetune 350+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
https://swift.readthedocs.io/zh-cn/latest/Instruction/index.html
Apache License 2.0

Fine-tuning internvl-v1.5 raises KeyError: 'input_ids' #951

Closed: sunzx8 closed this issue 4 months ago

sunzx8 commented 5 months ago

Describe the bug: what the bug is and how to reproduce it, preferably with screenshots.

[screenshot: traceback ending in KeyError: 'input_ids']

Command used: CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 swift sft --model_type internvl-chat-v1_5 --model_id_or_path /dev/shm/shawn/hf_ms_model/InternVL-Chat-V1-5 --dataset /dev/shm/shawn/data/ftoy.jsonl --sft_type full

The data format is: {"query": "输出图片内容的markdown内容,如果有表格,则输出为html格式", "response": "```markdown\nAdaptive Quotient Filters\n\nConference '17, July 2017, Washington, DC, USA\n\n[34] Russell Housley, Warwick Ford, William Polk, and David Solo. 1999. Internet X.509 public key infrastructure certificate and CRL profile. Technical Report. M. Frans Kaashoek. 2002. The case for application-specific protocols. In Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP).", "images": ["/dev/shm/shawn/data/input/2405.10253v1/2405.10253v1-p16.png"]}
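For reference, a minimal sketch of writing and validating one record in this query/response/images format; the file path and field values below are placeholders, not from the original run:

```bash
# Write one sample record in the custom dataset format used above
# (query/response/images keys); all values here are placeholders.
cat > /tmp/ftoy_sample.jsonl << 'EOF'
{"query": "Output the image content as markdown", "response": "sample response", "images": ["/tmp/sample-page.png"]}
EOF

# Sanity-check each line: valid JSON carrying the three expected keys (needs jq).
jq -e 'has("query") and has("response") and has("images")' /tmp/ftoy_sample.jsonl
```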

Your hardware and system info: 8*L20 (eight NVIDIA L20 GPUs).



sunzx8 commented 5 months ago

I looked into this: the batch returns only the two image-related elements, with no input_ids.

[screenshots: batch contents showing image tensors but no input_ids key]

What could be causing this?

hjh0119 commented 5 months ago

The device map can be problematic with 8 GPUs; try 2 or 4 GPUs.
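As a sketch, the suggestion amounts to rerunning the original command restricted to four of the eight GPUs via CUDA_VISIBLE_DEVICES:

```bash
# Same sft command as above, limited to 4 GPUs to rule out the 8-GPU device map.
CUDA_VISIBLE_DEVICES=0,1,2,3 swift sft \
    --model_type internvl-chat-v1_5 \
    --model_id_or_path /dev/shm/shawn/hf_ms_model/InternVL-Chat-V1-5 \
    --dataset /dev/shm/shawn/data/ftoy.jsonl \
    --sft_type full
```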

sunzx8 commented 5 months ago

Hello, my testing shows this is a max_length issue. Why does raising max_length from 2048 to 4096 trigger this error: RuntimeError: CUDA error: unspecified launch failure. CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
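The error text itself points to the next diagnostic step; a minimal sketch (debug only, since synchronous launches slow training down):

```bash
# With CUDA_LAUNCH_BLOCKING=1, kernel launches run synchronously, so the
# stack trace points at the op that actually failed rather than a later call.
export CUDA_LAUNCH_BLOCKING=1
# Then rerun the same swift sft command with --max_length 4096.
```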

sunzx8 commented 5 months ago

One more question: how should I set things up to fine-tune across two machines with 16 GPUs in total?

hjh0119 commented 5 months ago

The CUDA error is likely OOM or a CUDA environment issue.

There is a multi-node multi-GPU example in the README.
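For reference, a sketch of a multi-node launch; the environment-variable names (NNODES, NODE_RANK, MASTER_ADDR, MASTER_PORT, NPROC_PER_NODE) and the address are assumptions to verify against the README example:

```bash
# Node 0 (master); node 1 runs the same command with NODE_RANK=1.
# MASTER_ADDR/MASTER_PORT are placeholders for the master node's address.
NNODES=2 NODE_RANK=0 MASTER_ADDR=192.168.0.1 MASTER_PORT=29500 NPROC_PER_NODE=8 \
swift sft \
    --model_type internvl-chat-v1_5 \
    --model_id_or_path /dev/shm/shawn/hf_ms_model/InternVL-Chat-V1-5 \
    --dataset /dev/shm/shawn/data/ftoy.jsonl \
    --sft_type full
```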

sunzx8 commented 5 months ago

> The CUDA error is likely OOM or a CUDA environment issue.
>
> There is a multi-node multi-GPU example in the README.

Another question: with the LoRA fine-tuning setup you provide, the parameter summary shows only a small fraction of parameters being trained, yet GPU memory consumption is identical to full-parameter training. Does this mean the model was not actually switched over to LoRA?

[screenshot: trainable parameter summary]

Actual GPU memory consumption is 241 GB, the same as full-parameter fine-tuning on coco-mini.

sunzx8 commented 5 months ago

The LoRA command for the run above:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 swift sft --model_type internvl-chat-v1_5 --model_id_or_path /dev/shm/shawn/hf_ms_model/InternVL-Chat-V1-5 --dataset coco-mini-en-2 --sft_type lora
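Worth noting on the memory question: LoRA drops the optimizer states and gradients of the frozen base weights, but the full base model and its activations still occupy GPU memory, so peak usage can stay close to full-parameter training unless activation memory is also reduced. A sketch of the same run with common memory-reducing options; the --gradient_checkpointing flag is an assumption about this CLI version, so verify with swift sft --help:

```bash
# Same LoRA run with activation memory reduced; --gradient_checkpointing
# is assumed to exist in this swift version (verify with `swift sft --help`).
CUDA_VISIBLE_DEVICES=0,1,2,3 swift sft \
    --model_type internvl-chat-v1_5 \
    --model_id_or_path /dev/shm/shawn/hf_ms_model/InternVL-Chat-V1-5 \
    --dataset coco-mini-en-2 \
    --sft_type lora \
    --max_length 2048 \
    --gradient_checkpointing true
```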

hjh0119 commented 4 months ago

The device map can be problematic with 8 GPUs; try 2 or 4 GPUs.