Open xiaojidaner opened 1 year ago
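Hi, did you manage to solve this? I ran into the same problem. It looks like the length of the predicted characters does not match the number of characters in the target. If so, could the predicted characters be padded before computing the cross-entropy?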
This is caused by computing the cross-entropy (CE) loss on inputs and labels whose lengths do not match. It can be traced back to `Seq2SeqTrainer().predict`.
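To make the mismatch concrete, here is a minimal pure-Python sketch of what a cross-entropy loss does after the logits and labels have been flattened. It is a stand-in written for illustration, not the PyTorch implementation; the function name `cross_entropy` and the `ignore_index=-100` default are assumptions chosen to mirror the usual convention.

```python
import math


def cross_entropy(logit_rows, labels, ignore_index=-100):
    """Mean cross-entropy over positions (illustrative stand-in, not torch).

    logit_rows: one list of per-class scores for each position.
    labels:     one class index for each position.
    """
    # The loss pairs each row of logits with exactly one label, so the two
    # flattened sequences must have the same length.
    if len(logit_rows) != len(labels):
        raise ValueError(
            f"got {len(logit_rows)} rows of logits but {len(labels)} labels"
        )

    losses = []
    for row, label in zip(logit_rows, labels):
        if label == ignore_index:
            continue  # positions marked with ignore_index do not contribute
        log_z = math.log(sum(math.exp(score) for score in row))
        losses.append(log_z - row[label])  # negative log-softmax at the label

    # With no contributing position the mean is undefined.
    return sum(losses) / len(losses) if losses else float("nan")
```

Calling it with three rows of logits and two labels raises the `ValueError`, which is the same kind of failure the loss call in `modeling_chatglm.py` runs into when the flattened logits and labels disagree in length.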
In your case, the traceback goes from `trainer_seq2seq.py` to `python3.8/site-packages/transformers-4.28.0.dev0-py3.8.egg/transformers/trainer.py`, which implies that you import `Seq2SeqTrainer` from the local file `trainer_seq2seq.py`, but it ends up invoking the method `prediction_step` from the official HF script `transformers/trainer.py`.
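The call chain described above can be sketched with two toy classes. This is only an illustration of how `super()` hands control from a local subclass back to the library base class; the class bodies, the `calls` log, and the dictionary batch are all hypothetical and are not the real Hugging Face or ChatGLM code.

```python
class Trainer:
    """Stand-in for the class defined in transformers/trainer.py."""

    def __init__(self):
        self.calls = []  # records which implementation ran, in order

    def predict(self, batch):
        self.calls.append("Trainer.predict")
        # Dispatches on self, so a subclass override runs first.
        return self.prediction_step(batch)

    def prediction_step(self, batch):
        self.calls.append("Trainer.prediction_step")
        return self.compute_loss(batch)

    def compute_loss(self, batch):
        self.calls.append("Trainer.compute_loss")
        if len(batch["logits"]) != len(batch["labels"]):
            raise ValueError("logits and labels have different lengths")
        return 0.0


class Seq2SeqTrainer(Trainer):
    """Stand-in for the class defined in the local trainer_seq2seq.py."""

    def predict(self, batch):
        self.calls.append("Seq2SeqTrainer.predict")
        return super().predict(batch)

    def prediction_step(self, batch):
        self.calls.append("Seq2SeqTrainer.prediction_step")
        # Delegating here is what sends the call into the base-class loss.
        return super().prediction_step(batch)


trainer = Seq2SeqTrainer()
try:
    trainer.predict({"logits": [0.1, 0.2, 0.3], "labels": [1, 0]})
except ValueError as exc:
    error_message = str(exc)
```

After the failed call, `trainer.calls` alternates between the subclass and the base class in the same order as the frames in the reported traceback, ending in the base-class `compute_loss`.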
I think the ChatGLM model is not fully compatible with the design of the official HF Seq2Seq models, which is why ChatGLM provides the customized scripts `ptuning/trainer_seq2seq.py` and `ptuning/trainer.py` to override the official implementation.
In conclusion, put both `trainer.py` and `trainer_seq2seq.py` in your `ptuning` folder, and the error should go away. If you have already done so and still get the error, you may have forgotten to set `predict_with_generate=True` in `train_args`.
P.S. ChatGLM's customized scripts solve the problem by directly setting `loss=None` in the method `prediction_step`.
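As a rough sketch of that idea (not the actual ChatGLM script), the override below skips the loss computation during prediction and returns `loss=None` together with the generated tokens. `ToyModel`, `BaseTrainer`, `GenerateOnlyTrainer`, and their method signatures are hypothetical names invented for this example.

```python
class ToyModel:
    """Hypothetical model: the loss needs equal lengths, generation does not."""

    def loss(self, input_ids, labels):
        if len(input_ids) != len(labels):
            raise ValueError("input and label lengths differ")
        return 0.0

    def generate(self, input_ids):
        # Pretend generation appends a single end token (id 0).
        return list(input_ids) + [0]


class BaseTrainer:
    """Computes a loss on every prediction step, like a generic trainer."""

    def prediction_step(self, model, inputs):
        loss = model.loss(inputs["input_ids"], inputs["labels"])
        return loss, model.generate(inputs["input_ids"]), inputs["labels"]


class GenerateOnlyTrainer(BaseTrainer):
    """Overrides the step so that no loss is computed while predicting."""

    def prediction_step(self, model, inputs):
        generated = model.generate(inputs["input_ids"])
        loss = None  # skipped on purpose: lengths may not match the labels
        return loss, generated, inputs["labels"]


batch = {"input_ids": [5, 6, 7], "labels": [8, 9]}
loss, generated, labels = GenerateOnlyTrainer().prediction_step(ToyModel(), batch)
```

The trade-off is that no evaluation loss is reported for the prediction run; only the generated sequences and the labels are available for computing metrics.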
Is there an existing issue for this?
Current Behavior
    predict_results = trainer.predict(predict_dataset, metric_key_prefix="predict", max_length=512, do_sample=True, top_p=0.7, temperature=0.95)
  File "/root/zamlpdata/myproject/gpt_model/ChatGLM-6B-main/ptuning/trainer_seq2seq.py", line 136, in predict
    return super().predict(test_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
  File "/root/anaconda3/envs/gpt/lib/python3.8/site-packages/transformers-4.28.0.dev0-py3.8.egg/transformers/trainer.py", line 3020, in predict
    output = eval_loop(
  File "/root/anaconda3/envs/gpt/lib/python3.8/site-packages/transformers-4.28.0.dev0-py3.8.egg/transformers/trainer.py", line 3125, in evaluation_loop
    loss, logits, labels = self.prediction_step(model, inputs, prediction_loss_only, ignore_keys=ignore_keys)
  File "/root/zamlpdata/myproject/gpt_model/ChatGLM-6B-main/ptuning/trainer_seq2seq.py", line 167, in prediction_step
    return super().prediction_step(
  File "/root/anaconda3/envs/gpt/lib/python3.8/site-packages/transformers-4.28.0.dev0-py3.8.egg/transformers/trainer.py", line 3380, in prediction_step
    loss, outputs = self.compute_loss(model, inputs, return_outputs=True)
  File "/root/anaconda3/envs/gpt/lib/python3.8/site-packages/transformers-4.28.0.dev0-py3.8.egg/transformers/trainer.py", line 2689, in compute_loss
    outputs = model(**inputs)
  File "/root/anaconda3/envs/gpt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/modeling_chatglm.py", line 1185, in forward
    loss = loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
  File "/root/anaconda3/envs/gpt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/anaconda3/envs/gpt/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 1174, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "/root/anaconda3/envs/gpt/lib/python3.8/site-packages/torch/nn/functional.py", line 3029, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
Expected Behavior
No response
Steps To Reproduce
Could someone experienced help take a look at what this problem is?
Environment
Anything else?
No response