THUDM / ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型

[BUG/Help] Running train.sh fails: AttributeError: 'Seq2SeqTrainer' object has no attribute 'is_deepspeed_enabled' #216

Open harbor1981 opened 1 year ago

harbor1981 commented 1 year ago

Is there an existing issue for this?

Current Behavior

```
/home/chenjk/miniconda3/envs/chatglm2/lib/python3.9/site-packages/transformers/optimization.py:411: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  warnings.warn(
  0%|          | 0/3000 [00:00<?, ?it/s]
07/06/2023 11:20:35 - WARNING - transformers_modules.chatglm2-6b.modeling_chatglm - `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...
  0%|▌         | 10/3000 [00:47<3:54:43, 4.71s/it]
Traceback (most recent call last):
  File "/home/chenjk/study/ChatGLM2-6B/ptuning/main.py", line 411, in <module>
    main()
  File "/home/chenjk/study/ChatGLM2-6B/ptuning/main.py", line 350, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/home/chenjk/study/ChatGLM2-6B/ptuning/trainer.py", line 1635, in train
    return inner_training_loop(
  File "/home/chenjk/study/ChatGLM2-6B/ptuning/trainer.py", line 1981, in _inner_training_loop
    self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
  File "/home/chenjk/study/ChatGLM2-6B/ptuning/trainer.py", line 2220, in _maybe_log_save_evaluate
    logs["learning_rate"] = self._get_learning_rate()
  File "/home/chenjk/miniconda3/envs/chatglm2/lib/python3.9/site-packages/transformers/trainer_pt_utils.py", line 841, in _get_learning_rate
    if self.is_deepspeed_enabled:
AttributeError: 'Seq2SeqTrainer' object has no attribute 'is_deepspeed_enabled'
  0%|▌         | 10/3000 [00:47<3:58:43, 4.79s/it]
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 23216) of binary: /home/chenjk/miniconda3/envs/chatglm2/bin/python
Traceback (most recent call last):
  File "/home/chenjk/miniconda3/envs/chatglm2/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/chenjk/miniconda3/envs/chatglm2/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/home/chenjk/miniconda3/envs/chatglm2/lib/python3.9/site-packages/torch/distributed/run.py", line 794, in main
    run(args)
  File "/home/chenjk/miniconda3/envs/chatglm2/lib/python3.9/site-packages/torch/distributed/run.py", line 785, in run
    elastic_launch(
  File "/home/chenjk/miniconda3/envs/chatglm2/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/chenjk/miniconda3/envs/chatglm2/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

main.py FAILED

Failures:
```
### Expected Behavior

_No response_

### Steps To Reproduce

web.py runs fine; the error only occurs when running train.sh.

### Environment

```markdown
- OS: Ubuntu 18.04
- Python: 3.9
- Transformers: 4.30
- PyTorch: 2.0
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): True
```

### Anything else?

_No response_
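For context on the traceback: the repo's ptuning directory ships its own copy of trainer.py written against an older transformers API, while the installed transformers 4.30 `_get_learning_rate` (in `trainer_pt_utils.py`) reads `self.is_deepspeed_enabled`, an attribute that older code never sets. As a hedged workaround sketch (not from this thread), one could set the attribute on the trainer before calling `train()`; `patch_trainer` is a hypothetical helper, and `trainer` stands for the Seq2SeqTrainer built in ptuning/main.py:

```python
# Hedged workaround sketch (not from this thread): give an older bundled
# Trainer subclass the attribute that transformers 4.30's
# `_get_learning_rate` expects. `patch_trainer` is hypothetical.
def patch_trainer(trainer, deepspeed_enabled: bool = False):
    if not hasattr(trainer, "is_deepspeed_enabled"):
        trainer.is_deepspeed_enabled = deepspeed_enabled
    return trainer

# Illustration with a stand-in object instead of a real Seq2SeqTrainer:
class _DummyTrainer:
    pass

patched = patch_trainer(_DummyTrainer())
print(patched.is_deepspeed_enabled)  # False
```

This only papers over the attribute lookup; the downgrade suggested below in the thread is the cleaner fix, since other API differences between the bundled trainer.py and transformers 4.30 may surface later.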
daxian-lh commented 1 year ago

Downgrade transformers: `pip install transformers==4.27.1`
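The downgrade works because the bundled ptuning/trainer.py matches the 4.27-era API, before the refactor (present in the reporter's 4.30) whose `_get_learning_rate` reads `is_deepspeed_enabled`. A small hypothetical helper (not part of the repo) to sanity-check the installed version string before launching train.sh:

```python
# Hypothetical helper (not part of ChatGLM2-6B): check whether a
# transformers version predates 4.30, the version shown failing in this
# issue, i.e. whether it matches the bundled ptuning/trainer.py.
def transformers_matches_bundled_trainer(version: str) -> bool:
    """True for versions below 4.30 (e.g. the suggested 4.27.1)."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) < (4, 30)

print(transformers_matches_bundled_trainer("4.27.1"))  # True
print(transformers_matches_bundled_trainer("4.30.0"))  # False
```

In practice one would feed it `transformers.__version__` at the top of main.py and fail fast with a clear message instead of hitting the AttributeError mid-training.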