Closed belle9217 closed 1 year ago
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.
Closing the issue since no updates were observed. Feel free to re-open if you need any further assistance.
Required checks before submitting
Issue type
Other issue
Base model
None
Operating system
Linux
Detailed description of the problem
After training a model with DeepSpeed ZeRO-3 + Trainer, resume_from_checkpoint fails to load the optimizer state. Inspecting the saved checkpoint shows that the optimizer-state file zero_pp_rank_0_mp_rank_00_optim_states.pt is only 4.6 MB, so loading it raises an error.
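A quick way to confirm that a suspiciously small shard really lacks optimizer state is to load it with `torch.load` and inspect its contents. This is a hedged sketch: the `optimizer_state_dict` key layout is assumed from typical DeepSpeed ZeRO shards, and a throwaway file stands in for the real 4.6 MB `zero_pp_rank_0_mp_rank_00_optim_states.pt` from the report.

```python
import os
import tempfile

import torch

# Stand-in for the real shard path inside the checkpoint directory
# (assumption: same file name as in the report).
path = os.path.join(tempfile.mkdtemp(), "zero_pp_rank_0_mp_rank_00_optim_states.pt")

# Write a dummy shard whose optimizer state is empty, mimicking what a
# broken save could produce.
torch.save({"optimizer_state_dict": {"state": {}, "param_groups": []}}, path)

shard = torch.load(path, map_location="cpu")
print(list(shard.keys()))                    # top-level keys of the shard
print(shard["optimizer_state_dict"].keys())  # per-parameter state lives here

# An intact shard should hold per-parameter Adam moments; an (almost)
# empty "state" dict is consistent with the tiny file size reported above.
```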
Dependencies (required for code-related issues)
peft 0.5.0.dev0
torch 2.0.0
torchaudio 2.0.0
torchvision 0.15.0
transformers 4.32.0.dev0
Run logs or screenshots
File "run_clm_peft.py", line 647, in <module>
  main()
File "run_clm_peft.py", line 615, in main
  train_result = trainer.train(resume_from_checkpoint=checkpoint)
File "/root/miniconda3/envs/xzh_deepspeed/lib/python3.8/site-packages/transformers/trainer.py", line 1544, in train
  return inner_training_loop(
File "/root/miniconda3/envs/xzh_deepspeed/lib/python3.8/site-packages/transformers/trainer.py", line 1667, in _inner_training_loop
  model, self.optimizer, self.lr_scheduler = self.accelerator.prepare(
File "/root/miniconda3/envs/xzh_deepspeed/lib/python3.8/site-packages/accelerate/accelerator.py", line 1198, in prepare
  result = self._prepare_deepspeed(*args)
File "/root/miniconda3/envs/xzh_deepspeed/lib/python3.8/site-packages/accelerate/accelerator.py", line 1537, in _prepare_deepspeed
  engine, optimizer, _, lr_scheduler = deepspeed.initialize(**kwargs)
File "/root/miniconda3/envs/xzh_deepspeed/lib/python3.8/site-packages/deepspeed/__init__.py", line 171, in initialize
  engine = DeepSpeedEngine(args=args,
File "/root/miniconda3/envs/xzh_deepspeed/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 310, in __init__
  self._configure_optimizer(optimizer, model_parameters)
File "/root/miniconda3/envs/xzh_deepspeed/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1198, in _configure_optimizer
  basic_optimizer = self._configure_basic_optimizer(model_parameters)
File "/root/miniconda3/envs/xzh_deepspeed/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1250, in _configure_basic_optimizer
  optimizer = torch.optim.AdamW(model_parameters, **optimizer_parameters)
File "/root/miniconda3/envs/xzh_deepspeed/lib/python3.8/site-packages/torch/optim/adamw.py", line 50, in __init__
  super().__init__(params, defaults)
File "/root/miniconda3/envs/xzh_deepspeed/lib/python3.8/site-packages/torch/optim/optimizer.py", line 187, in __init__
  raise ValueError("optimizer got an empty parameter list")
ValueError: optimizer got an empty parameter list
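The error class in the last frame can be reproduced in isolation. This is a hedged sketch, not the reporter's setup: it assumes the failure occurs whenever every parameter handed to the optimizer is filtered out (for example, if at resume time the PEFT adapter weights appear frozen or ZeRO-3-partitioned), so `AdamW` receives an empty list.

```python
import torch
import torch.nn as nn

# Toy model standing in for the PEFT-wrapped model; freezing every
# parameter mimics a model whose trainable weights were filtered away.
model = nn.Linear(4, 4)
for p in model.parameters():
    p.requires_grad = False

# The usual "trainable parameters" filter now yields an empty list.
trainable = [p for p in model.parameters() if p.requires_grad]

try:
    torch.optim.AdamW(trainable, lr=1e-4)
except ValueError as err:
    print(err)  # optimizer got an empty parameter list
```

Seeing this message therefore suggests checking, before `deepspeed.initialize` runs, that the model still reports a non-zero number of trainable parameters.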