Error during parallel training, looking for help
Traceback (most recent call last):
  File "/home/pyh/ie2/PaddleNLP-release-2.5/model_zoo/uie/finetune.py", line 245, in <module>
    main()
  File "/home/pyh/ie2/PaddleNLP-release-2.5/model_zoo/uie/finetune.py", line 184, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/root/anaconda3/envs/pybak/lib/python3.9/site-packages/paddlenlp/trainer/trainer.py", line 716, in train
    self._maybe_log_save_evaluate(tr_loss, model, epoch, ignore_keys_for_eval, inputs=inputs)
  File "/root/anaconda3/envs/pybak/lib/python3.9/site-packages/paddlenlp/trainer/trainer.py", line 810, in _maybe_log_save_evaluate
    tr_loss_scalar = self._nested_gather(tr_loss).mean().item()
  File "/root/anaconda3/envs/pybak/lib/python3.9/site-packages/paddlenlp/trainer/trainer.py", line 1954, in _nested_gather
    tensors = distributed_concat(tensors)
  File "/root/anaconda3/envs/pybak/lib/python3.9/site-packages/paddlenlp/trainer/utils/helper.py", line 45, in distributed_concat
    concat = paddle.concat(output_tensors, axis=0)
  File "/root/anaconda3/envs/pybak/lib/python3.9/site-packages/paddle/tensor/manipulation.py", line 1121, in concat
    return _C_ops.concat(input, axis)
ValueError: (InvalidArgument) The axis is expected to be in range of [0, 0), but got 0
  [Hint: Expected axis >= -rank && axis < rank == true, but received axis >= -rank && axis < rank:0 != true:1.] (at ../paddle/phi/infermeta/multiary.cc:961)
I0811 17:28:05.926939 101147 tcp_store.cc:273] receive shutdown event and so quit from MasterDaemon run loop
LAUNCH INFO 2023-08-11 17:28:07,606 Pod failed
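For anyone reading the error message: the range [0, 0) in the ValueError is empty, which is what the quoted check `axis >= -rank && axis < rank` gives when the tensors passed to `paddle.concat` have rank 0. A plausible reading, which the traceback alone does not confirm, is that the gathered `tr_loss` is a 0-D (scalar) tensor. The sketch below only mirrors that range check in plain Python and does not call Paddle; `axis_is_valid` and `shape_rank` are hypothetical helper names made up for illustration.

```python
def axis_is_valid(axis: int, rank: int) -> bool:
    """Mirror the check quoted in the hint: axis >= -rank and axis < rank."""
    return -rank <= axis < rank


def shape_rank(shape: tuple) -> int:
    """Rank is the number of dimensions, i.e. the length of the shape."""
    return len(shape)


# Assumption: the loss is a 0-D tensor, so its shape is the empty tuple.
scalar_loss_shape = ()      # rank 0, valid axis range is [0, 0), i.e. empty
# The same value viewed as a 1-D tensor with a single element.
reshaped_loss_shape = (1,)  # rank 1, valid axis range is [-1, 1)

scalar_ok = axis_is_valid(0, shape_rank(scalar_loss_shape))
reshaped_ok = axis_is_valid(0, shape_rank(reshaped_loss_shape))
print(scalar_ok, reshaped_ok)  # False True
```

If that reading is right, concatenating along axis 0 can only work once each gathered tensor has at least one dimension. Whether the installed paddle and paddlenlp versions disagree about the shape of the loss tensor is worth checking, but that is an assumption on my part.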