modelscope / ms-swift

Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
https://swift.readthedocs.io/zh-cn/latest/Instruction/index.html
Apache License 2.0

After fine-tuning seqgpt, passing the fine-tuned ckpt as the resume_from_ckpt argument raises an error #107

Closed wickedsickk closed 1 year ago

wickedsickk commented 1 year ago

```
Traceback (most recent call last):
  File "src/llm_sft.py", line 242, in <module>
    llm_sft(args)
  File "src/llm_sft.py", line 218, in llm_sft
    trainer.train(training_args.resume_from_checkpoint)
  File "/root/anaconda3/envs/swift/lib/python3.8/site-packages/transformers/trainer.py", line 1553, in train
    return inner_training_loop(
  File "/root/anaconda3/envs/swift/lib/python3.8/site-packages/transformers/trainer.py", line 1705, in _inner_training_loop
    self._load_optimizer_and_scheduler(resume_from_checkpoint)
  File "/root/anaconda3/envs/swift/lib/python3.8/site-packages/transformers/trainer.py", line 2496, in _load_optimizer_and_scheduler
    self.optimizer.load_state_dict(
  File "/root/anaconda3/envs/swift/lib/python3.8/site-packages/accelerate/optimizer.py", line 107, in load_state_dict
    self.optimizer.load_state_dict(state_dict)
  File "/root/anaconda3/envs/swift/lib/python3.8/site-packages/torch/optim/optimizer.py", line 390, in load_state_dict
    raise ValueError("loaded state dict contains a parameter group "
ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's group
```
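The final frame is `torch.optim.Optimizer.load_state_dict`, which raises this `ValueError` whenever a parameter group in the saved optimizer state contains a different number of parameters than the corresponding group in the freshly built optimizer, e.g. when the set of trainable parameters differs between the original run and the resumed run. A minimal sketch of the mismatch (the models and sizes below are arbitrary illustrations, not taken from the issue):

```python
import torch

# Original run: optimizer over a 2-parameter model (weight + bias).
model_a = torch.nn.Linear(4, 4)
opt_a = torch.optim.AdamW(model_a.parameters())
saved_state = opt_a.state_dict()  # analogous to the optimizer state in the checkpoint

# Resumed run: the trainable-parameter set has changed (4 parameters now),
# so the single param group no longer matches the saved group's size.
model_b = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.Linear(4, 2))
opt_b = torch.optim.AdamW(model_b.parameters())

try:
    opt_b.load_state_dict(saved_state)
except ValueError as err:
    # "loaded state dict contains a parameter group that doesn't match
    #  the size of optimizer's group"
    print(err)
```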

Jintao-Huang commented 1 year ago

OK, let me test it. One moment.

wickedsickk commented 1 year ago

> OK, let me test it. One moment.

Have you been able to reproduce this issue?

Jintao-Huang commented 1 year ago

Reproduced it.

wickedsickk commented 1 year ago

> Reproduced it.

Great, thanks for following up on it.