modelscope / ms-swift

Use PEFT or Full-parameter to finetune 350+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
https://swift.readthedocs.io/zh-cn/latest/Instruction/index.html
Apache License 2.0
4.01k stars, 355 forks

Under DDP, error: subprocess.CalledProcessError: Command '['git', '-C', '/root/.cache/modelscope/hub/_github', 'clone', 'https://github.com/haotian-liu/LLaVA.git', 'LLaVA.git']' returned non-zero exit status 128. #787

Closed zhangfan-algo closed 6 months ago

zhangfan-algo commented 6 months ago

ValueError: model_type: 'llava1d6-yi-34b-instruct' is not registered. The model_type you can choose: ['mengzi3-13b-base', 'baichuan-7b', 'baichuan-13b-chat', 'xverse-moe-a4_2b', 'xverse-7b', 'xverse-7b-chat', 'xverse-13b-256k', 'xverse-65b-chat', 'xverse-65b-v2', 'xverse-65b', 'xverse-13b', 'xverse-13b-chat', 'seqgpt-560m', 'bluelm-7b', 'bluelm-7b-32k', 'bluelm-7b-chat', 'bluelm-7b-chat-32k', 'internlm-7b', 'internlm-20b', 'grok-1', 'mamba-2.8b', 'mamba-1.4b', 'mamba-790m', 'mamba-390m', 'mamba-370m', 'mamba-130m', 'cogagent-18b-instruct', 'cogagent-18b-chat', 'cogvlm-17b-instruct', 'internlm-7b-chat', 'internlm-7b-chat-8k', 'internlm-20b-chat', 'baichuan-13b', 'baichuan2-13b', 'baichuan2-13b-chat', 'baichuan2-7b', 'baichuan2-7b-chat', 'baichuan2-7b-chat-int4', 'baichuan2-13b-chat-int4', 'codegeex2-6b', 'chatglm2-6b', 'chatglm2-6b-32k', 'chatglm3-6b-base', 'chatglm3-6b', 'chatglm3-6b-32k', 'codefuse-codegeex2-6b-chat', 'dbrx-instruct', 'dbrx-base', 'mixtral-moe-7b-instruct', 'mixtral-moe-7b', 'mistral-7b-v2', 'mistral-7b', 'mistral-7b-instruct-v2', 'mistral-7b-instruct', 'openbuddy-llama2-13b-chat', 'openbuddy-llama-65b-chat', 'openbuddy-llama2-70b-chat', 'openbuddy-mistral-7b-chat', 'openbuddy-mixtral-moe-7b-chat', 'ziya2-13b', 'ziya2-13b-chat', 'yi-6b', 'yi-9b', 'yi-6b-200k', 'yi-34b', 'yi-34b-200k', 'yi-34b-chat', 'yi-6b-chat', 'zephyr-7b-beta-chat', 'openbuddy-zephyr-7b-chat', 'sus-34b-chat', 'deepseek-7b', 'deepseek-7b-chat', 'deepseek-67b', 'deepseek-67b-chat', 'openbuddy-deepseek-67b-chat', 'deepseek-coder-33b-instruct', 'deepseek-coder-6_7b-instruct', 'deepseek-coder-1_3b-instruct', 'deepseek-coder-33b', 'deepseek-coder-6_7b', 'deepseek-coder-1_3b', 'qwen1half-moe-a2_7b', 'qwen1half-72b', 'qwen1half-32b', 'qwen1half-14b', 'qwen1half-7b', 'qwen1half-4b', 'qwen1half-1_8b', 'qwen1half-0_5b', 'deepseek-math-7b', 'deepseek-math-7b-chat', 'deepseek-math-7b-instruct', 'gemma-7b-instruct', 'gemma-2b-instruct', 'gemma-7b', 'gemma-2b', 
'mixtral-moe-7b-aqlm-2bit-1x16', 'llama2-7b-aqlm-2bit-1x16', 'qwen1half-moe-a2_7b-chat', 'qwen1half-72b-chat', 'qwen1half-32b-chat', 'qwen1half-14b-chat', 'qwen1half-7b-chat', 'qwen1half-4b-chat', 'qwen1half-1_8b-chat', 'qwen1half-0_5b-chat', 'qwen1half-72b-chat-awq', 'qwen1half-14b-chat-awq', 'qwen1half-7b-chat-awq', 'qwen1half-4b-chat-awq', 'qwen1half-1_8b-chat-awq', 'qwen1half-0_5b-chat-awq', 'qwen1half-moe-a2_7b-chat-int4', 'qwen1half-72b-chat-int8', 'qwen1half-72b-chat-int4', 'qwen1half-32b-chat-int4', 'qwen1half-14b-chat-int8', 'qwen1half-14b-chat-int4', 'qwen1half-7b-chat-int8', 'qwen1half-7b-chat-int4', 'qwen1half-4b-chat-int8', 'qwen1half-4b-chat-int4', 'qwen1half-1_8b-chat-int8', 'qwen1half-1_8b-chat-int4', 'qwen1half-0_5b-chat-int8', 'qwen1half-0_5b-chat-int4', 'internlm2-20b-base', 'internlm2-20b', 'internlm2-7b-base', 'internlm2-7b', 'internlm2-20b-chat', 'internlm2-20b-sft-chat', 'internlm2-7b-chat', 'internlm2-7b-sft-chat', 'internlm2-math-20b-chat', 'internlm2-math-7b-chat', 'internlm2-math-20b', 'internlm2-math-7b', 'internlm2-1_8b-chat', 'internlm2-1_8b-sft-chat', 'internlm2-1_8b', 'internlm-xcomposer2-7b-chat', 'deepseek-vl-1_3b-chat', 'deepseek-vl-7b-chat', 'llama2-70b-chat', 'llama2-13b-chat', 'llama2-7b-chat', 'llama2-70b', 'llama2-13b', 'llama2-7b', 'polylm-13b', 'qwen-7b', 'qwen-14b', 'tongyi-finance-14b', 'qwen-72b', 'qwen-1_8b', 'codefuse-qwen-14b-chat', 'qwen-7b-chat', 'qwen-14b-chat', 'tongyi-finance-14b-chat', 'qwen-72b-chat', 'qwen-1_8b-chat', 'qwen-vl', 'qwen-vl-chat', 'qwen-audio', 'qwen-audio-chat', 'qwen-7b-chat-int4', 'qwen-14b-chat-int4', 'qwen-7b-chat-int8', 'qwen-14b-chat-int8', 'qwen-vl-chat-int4', 'tongyi-finance-14b-chat-int4', 'qwen-72b-chat-int4', 'qwen-72b-chat-int8', 'qwen-1_8b-chat-int4', 'qwen-1_8b-chat-int8', 'skywork-13b', 'skywork-13b-chat', 'codefuse-codellama-34b-chat', 'telechat-12b', 'phi2-3b', 'telechat-7b', 'deepseek-moe-16b', 'deepseek-moe-16b-chat', 'yuan2-2b-janus-instruct', 'yuan2-102b-instruct', 
'yuan2-51b-instruct', 'yuan2-2b-instruct', 'orion-14b-chat', 'orion-14b', 'yi-vl-6b-chat', 'yi-vl-34b-chat', 'minicpm-2b-chat', 'minicpm-2b-sft-chat', 'minicpm-v-3b-chat', 'llava1d6-mistral-7b-instruct', 'tigerbot-13b-chat', 'tigerbot-13b', 'tigerbot-7b']

Jintao-Huang commented 6 months ago

Use the main branch or ms-swift==2.0.3.

zhangfan-algo commented 6 months ago

[INFO:swift] Run the command: git -C '/root/.cache/modelscope/hub/_github' clone 'https://github.com/haotian-liu/LLaVA.git' LLaVA.git
Cloning into 'LLaVA.git'...
fatal: could not create work tree dir 'LLaVA.git': File exists
fatal: could not create work tree dir 'LLaVA.git': File exists
fatal: could not create work tree dir 'LLaVA.git': File exists
fatal: could not create work tree dir 'LLaVA.git': File exists
fatal: could not create work tree dir 'LLaVA.git': File exists
fatal: could not create work tree dir 'LLaVA.git': File exists
fatal: could not create work tree dir 'LLaVA.git': File exists
Traceback (most recent call last):
  File "/mnt/pfs/zhangfan/homework_correction/swift_0423/examples/pytorch/llm/llm_sft.py", line 7, in <module>
    output = sft_main()
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/swift/utils/run_utils.py", line 31, in x_main
    result = llm_x(args, **kwargs)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/swift/llm/sft.py", line 75, in llm_sft
    model, tokenizer = get_model_tokenizer(
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/swift/llm/utils/model.py", line 3889, in get_model_tokenizer
    model, tokenizer = get_function(model_dir, torch_dtype, model_kwargs,
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/swift/llm/utils/model.py", line 3633, in get_model_tokenizer_llava
    local_repo_path = _git_clone_github(
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/swift/llm/utils/model.py", line 2397, in _git_clone_github
    subprocess_run(command)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/swift/utils/utils.py", line 178, in subprocess_run
    resp.check_returncode()
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/subprocess.py", line 457, in check_returncode
    raise CalledProcessError(self.returncode, self.args, self.stdout,
subprocess.CalledProcessError: Command '['git', '-C', '/root/.cache/modelscope/hub/_github', 'clone', 'https://github.com/haotian-liu/LLaVA.git', 'LLaVA.git']' returned non-zero exit status 128.

zhangfan-algo commented 6 months ago

Now there is a new error.

Jintao-Huang commented 6 months ago

OK, I see the problem... the git clone step was not made compatible with distributed training.

Jintao-Huang commented 6 months ago

git -C '/root/.cache/modelscope/hub/_github' clone 'https://github.com/haotian-liu/LLaVA.git' LLaVA.git

Jintao-Huang commented 6 months ago

Run this command manually once, then run the sh script again and it should work.
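The log above shows every DDP rank running the same `git clone` into one cache directory, so all but the first fail with "File exists". One common way to avoid this kind of race is to serialize the clone behind a file lock and skip it when the directory already exists. The following is only a sketch of that idea and is not the fix that was merged into ms-swift; `exclusive_lock`, `git_clone`, and `clone_once` are hypothetical names, and the lock relies on POSIX `fcntl.flock`.

```python
import fcntl
import os
import subprocess
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path (POSIX only)."""
    with open(lock_path, "w") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            yield
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


def git_clone(url, target_dir):
    """Default clone action (hypothetical helper): plain `git clone`."""
    subprocess.run(["git", "clone", url, target_dir], check=True)


def clone_once(url, target_dir, clone_fn=git_clone):
    """Clone url into target_dir unless another process already did."""
    target_dir = os.path.abspath(target_dir)
    os.makedirs(os.path.dirname(target_dir), exist_ok=True)
    # Only one process at a time gets past this lock; the rest wait here
    # and then see that the directory exists, so they skip the clone.
    with exclusive_lock(target_dir + ".lock"):
        if not os.path.isdir(target_dir):
            clone_fn(url, target_dir)
    return target_dir
```

Each rank would call `clone_once('https://github.com/haotian-liu/LLaVA.git', '/root/.cache/modelscope/hub/_github/LLaVA.git')`. The sketch does not clean up a partially cloned directory after a failed clone. An alternative is to let only rank 0 clone while the other ranks wait on `torch.distributed.barrier()`.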

Jintao-Huang commented 6 months ago

Fixed on the main branch.

zhangfan-algo commented 6 months ago

Yes, that works now. But there is another new error.

Jintao-Huang commented 6 months ago

What is the error?

zhangfan-algo commented 6 months ago

Traceback (most recent call last):
  File "/mnt/pfs/zhangfan/homework_correction/swift_0424/examples/pytorch/llm/llm_sft.py", line 7, in <module>
    output = sft_main()
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/swift/utils/run_utils.py", line 31, in x_main
    result = llm_x(args, **kwargs)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/swift/llm/sft.py", line 261, in llm_sft
    trainer.train(training_args.resume_from_checkpoint)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/swift/trainers/trainers.py", line 54, in train
    res = super().train(*args, **kwargs)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/transformers/trainer.py", line 1624, in train
    return inner_training_loop(
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/transformers/trainer.py", line 1961, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/transformers/trainer.py", line 2902, in training_step
    loss = self.compute_loss(model, inputs)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/swift/trainers/trainers.py", line 220, in compute_loss
    outputs = model(**inputs)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
    ret_val = func(*args, **kwargs)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1855, in forward
    loss = self.module(*inputs, **kwargs)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/swift/tuners/base.py", line 84, in forward
    return self.base_model(*args, **kwargs)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/swift/llm/utils/model.py", line 3652, in _new_forward
    return forward(*args, **kwargs)
  File "/root/.cache/modelscope/hub/_github/LLaVA.git/llava/model/language_model/llava_llama.py", line 81, in forward
    ) = self.prepare_inputs_labels_for_multimodal(
  File "/root/.cache/modelscope/hub/_github/LLaVA.git/llava/model/llava_arch.py", line 251, in prepare_inputs_labels_for_multimodal
    cur_input_embeds = self.get_model().embed_tokens(torch.cat(cur_input_ids_noim))
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1574, in _call_impl
    hook_result = hook(self, args, result)
  File "/apps1/zhangfan/anaconda3/envs/swift/lib/python3.10/site-packages/transformers/trainer_utils.py", line 128, in neftune_post_forward_hook
    dims = torch.tensor(output.size(1) * output.size(2))
IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 2)

zhangfan-algo commented 6 months ago

Fine-tuning internlm-xcomposer2-4khd-7b on the same data also raises an error. [image]

zhangfan-algo commented 6 months ago

The data uses the following format:

[
  {"conversations": [
    {"from": "user", "value": "img_path11111"},
    {"from": "assistant", "value": "22222"}
  ]},
  {"conversations": [
    {"from": "user", "value": "img_pathimg_path2img_path3aaaaa"},
    {"from": "assistant", "value": "bbbbb"},
    {"from": "user", "value": "img_pathccccc"},
    {"from": "assistant", "value": "ddddd"}
  ]},
  {"conversations": [
    {"from": "user", "value": "AAAAA"},
    {"from": "assistant", "value": "BBBBB"},
    {"from": "user", "value": "CCCCC"},
    {"from": "assistant", "value": "DDDDD"}
  ]}
]

Jintao-Huang commented 6 months ago

It works fine in my tests, using the custom dataset format from the best-practices documentation.

AlexJJJChen commented 6 months ago

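It works fine in my tests, using the custom dataset format from the best-practices documentation.

Do the dictionaries need to be separated by commas?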

AlexJJJChen commented 6 months ago

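It works fine in my tests, using the custom dataset format from the best-practices documentation.

Without commas between the dictionaries I get the following error: json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 213)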

Jintao-Huang commented 6 months ago

jsonl files do not need commas.
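To illustrate the difference with the standard library only: a .jsonl file holds one complete JSON object per line with no separating commas, so it has to be parsed line by line. Passing the whole text to `json.loads` fails after the first object with the same "Extra data" message quoted above. This is a minimal sketch; the sample records are shortened from the format posted earlier, and `load_jsonl` is a hypothetical helper rather than an ms-swift function.

```python
import json

# Two records, one JSON object per line, no commas between them.
jsonl_text = (
    '{"conversations": [{"from": "user", "value": "AAAAA"}, '
    '{"from": "assistant", "value": "BBBBB"}]}\n'
    '{"conversations": [{"from": "user", "value": "CCCCC"}, '
    '{"from": "assistant", "value": "DDDDD"}]}\n'
)


def load_jsonl(text):
    """Parse JSON Lines text: decode each non-empty line on its own."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


records = load_jsonl(jsonl_text)

# Treating the same text as a single JSON document fails, because a second
# top-level value follows the first one.
try:
    json.loads(jsonl_text)
    whole_file_error = None
except json.JSONDecodeError as exc:
    whole_file_error = exc.msg
```

If the file is instead saved as .json, the records must be wrapped in `[...]` and separated by commas, as in the array format shown earlier in the thread.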