modelscope / ms-swift

Use PEFT or Full-parameter to finetune 350+ LLMs or 90+ MLLMs. (Qwen2.5, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)
https://swift.readthedocs.io/zh-cn/latest/Instruction/index.html
Apache License 2.0

After fine-tuning seqgpt, passing the fine-tuned ckpt as the resume_from_ckpt argument raises an error #107

Closed wickedsickk closed 11 months ago

wickedsickk commented 11 months ago

Traceback (most recent call last):
  File "src/llm_sft.py", line 242, in <module>
    llm_sft(args)
  File "src/llm_sft.py", line 218, in llm_sft
    trainer.train(training_args.resume_from_checkpoint)
  File "/root/anaconda3/envs/swift/lib/python3.8/site-packages/transformers/trainer.py", line 1553, in train
    return inner_training_loop(
  File "/root/anaconda3/envs/swift/lib/python3.8/site-packages/transformers/trainer.py", line 1705, in _inner_training_loop
    self._load_optimizer_and_scheduler(resume_from_checkpoint)
  File "/root/anaconda3/envs/swift/lib/python3.8/site-packages/transformers/trainer.py", line 2496, in _load_optimizer_and_scheduler
    self.optimizer.load_state_dict(
  File "/root/anaconda3/envs/swift/lib/python3.8/site-packages/accelerate/optimizer.py", line 107, in load_state_dict
    self.optimizer.load_state_dict(state_dict)
  File "/root/anaconda3/envs/swift/lib/python3.8/site-packages/torch/optim/optimizer.py", line 390, in load_state_dict
    raise ValueError("loaded state dict contains a parameter group "
ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's group
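For context, PyTorch raises this ValueError whenever a saved optimizer state was built over a different set of trainable parameters than the optimizer doing the resume (for example, if the checkpoint was saved with a different tuning setup, so the parameter groups no longer line up). A minimal, hypothetical sketch of the same failure mode, not the actual ms-swift code path:

```python
import torch
import torch.nn as nn

# Hypothetical reproduction: the checkpoint's optimizer state covers a
# different number of trainable tensors than the optimizer at resume time.
model = nn.Linear(4, 4)

# Optimizer over ALL parameters (weight + bias): one group with 2 tensors.
opt_full = torch.optim.AdamW(model.parameters(), lr=1e-4)
state = opt_full.state_dict()

# Optimizer over only SOME parameters: one group with 1 tensor.
opt_partial = torch.optim.AdamW([model.weight], lr=1e-4)

try:
    # The group counts match (1 vs 1) but the group sizes differ (1 vs 2),
    # so load_state_dict rejects the checkpoint.
    opt_partial.load_state_dict(state)
except ValueError as e:
    print(type(e).__name__)  # prints "ValueError", as in the traceback above
```

The fix is usually to make sure the same parameters are trainable when resuming as when the checkpoint was saved.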

Jintao-Huang commented 11 months ago

OK, let me test this. One moment.

wickedsickk commented 11 months ago

> OK, let me test this. One moment.

Have you been able to reproduce this issue?

Jintao-Huang commented 11 months ago

Reproduced it.

wickedsickk commented 11 months ago

> Reproduced it.

OK, thanks for following up on this.