Is there an existing issue for this?

Current Behavior

```
/home/chenjk/miniconda3/envs/chatglm2/lib/python3.9/site-packages/transformers/optimization.py:411: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set no_deprecation_warning=True to disable this warning
  warnings.warn(
  0%|          | 0/3000 [00:00<?, ?it/s]07/06/2023 11:20:35 - WARNING - transformers_modules.chatglm2-6b.modeling_chatglm - use_cache=True is incompatible with gradient checkpointing. Setting use_cache=False...
  0%|▌         | 10/3000 [00:47<3:54:43, 4.71s/it]Traceback (most recent call last):
  File "/home/chenjk/study/ChatGLM2-6B/ptuning/main.py", line 411, in <module>
    main()
  File "/home/chenjk/study/ChatGLM2-6B/ptuning/main.py", line 350, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/home/chenjk/study/ChatGLM2-6B/ptuning/trainer.py", line 1635, in train
    return inner_training_loop(
  File "/home/chenjk/study/ChatGLM2-6B/ptuning/trainer.py", line 1981, in _inner_training_loop
    self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
  File "/home/chenjk/study/ChatGLM2-6B/ptuning/trainer.py", line 2220, in _maybe_log_save_evaluate
    logs["learning_rate"] = self._get_learning_rate()
  File "/home/chenjk/miniconda3/envs/chatglm2/lib/python3.9/site-packages/transformers/trainer_pt_utils.py", line 841, in _get_learning_rate
    if self.is_deepspeed_enabled:
AttributeError: 'Seq2SeqTrainer' object has no attribute 'is_deepspeed_enabled'
  0%|▌         | 10/3000 [00:47<3:58:43, 4.79s/it]
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 23216) of binary: /home/chenjk/miniconda3/envs/chatglm2/bin/python
Traceback (most recent call last):
  File "/home/chenjk/miniconda3/envs/chatglm2/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/home/chenjk/miniconda3/envs/chatglm2/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/home/chenjk/miniconda3/envs/chatglm2/lib/python3.9/site-packages/torch/distributed/run.py", line 794, in main
    run(args)
  File "/home/chenjk/miniconda3/envs/chatglm2/lib/python3.9/site-packages/torch/distributed/run.py", line 785, in run
    elastic_launch(
  File "/home/chenjk/miniconda3/envs/chatglm2/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/chenjk/miniconda3/envs/chatglm2/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
```
```
main.py FAILED
Failures:
```
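A likely cause (an assumption based on the traceback, not confirmed here) is a version mismatch: the repo's bundled `ptuning/trainer.py` is an older copy of the Hugging Face `Trainer`, while the installed `transformers` package's `trainer_pt_utils._get_learning_rate` reads `self.is_deepspeed_enabled`, an attribute only newer `Trainer.__init__` versions set. A minimal workaround sketch, setting the attribute before `trainer.train(...)` (the `_Dummy` class below is a hypothetical stand-in for the `Seq2SeqTrainer` built in `main.py`):

```python
# Hypothetical workaround sketch, not a confirmed fix: give the bundled
# trainer the attribute the newer installed transformers expects.
class _Dummy:
    """Stand-in for the Seq2SeqTrainer constructed in ptuning/main.py."""
    pass

trainer = _Dummy()

# In main.py this guard would run right after the trainer is constructed
# and before trainer.train(resume_from_checkpoint=checkpoint) is called.
if not hasattr(trainer, "is_deepspeed_enabled"):
    # This P-Tuning run does not use DeepSpeed, so False is the safe default.
    trainer.is_deepspeed_enabled = False
```

The other common remedy is aligning versions instead of patching: install the `transformers` release that the bundled `trainer.py` was copied from (as pinned in the repo's requirements), so the two files agree on the `Trainer` attributes.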