ValueError: Expected input batch_size (504) to match target batch_size (136).

xiaojidaner commented 1 year ago

Is there an existing issue for this?

[X] I have searched the existing issues

Current Behavior

    predict_results = trainer.predict(predict_dataset, metric_key_prefix="predict", max_length=512, do_sample=True, top_p=0.7, temperature=0.95)
  File "/root/zamlpdata/myproject/gpt_model/ChatGLM-6B-main/ptuning/trainer_seq2seq.py", line 136, in predict
    return super().predict(test_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
  File "/root/anaconda3/envs/gpt/lib/python3.8/site-packages/transformers-4.28.0.dev0-py3.8.egg/transformers/trainer.py", line 3020, in predict
    output = eval_loop(
  File "/root/anaconda3/envs/gpt/lib/python3.8/site-packages/transformers-4.28.0.dev0-py3.8.egg/transformers/trainer.py", line 3125, in evaluation_loop
    loss, logits, labels = self.prediction_step(model, inputs, prediction_loss_only, ignore_keys=ignore_keys)
  File "/root/zamlpdata/myproject/gpt_model/ChatGLM-6B-main/ptuning/trainer_seq2seq.py", line 167, in prediction_step
    return super().prediction_step(
  File "/root/anaconda3/envs/gpt/lib/python3.8/site-packages/transformers-4.28.0.dev0-py3.8.egg/transformers/trainer.py", line 3380, in prediction_step
    loss, outputs = self.compute_loss(model, inputs, return_outputs=True)
  File "/root/anaconda3/envs/gpt/lib/python3.8/site-packages/transformers-4.28.0.dev0-py3.8.egg/transformers/trainer.py", line 2689, in compute_loss
    outputs = model(**inputs)
  File "/root/anaconda3/envs/gpt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/modeling_chatglm.py", line 1185, in forward
    loss = loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
  File "/root/anaconda3/envs/gpt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/anaconda3/envs/gpt/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 1174, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "/root/anaconda3/envs/gpt/lib/python3.8/site-packages/torch/nn/functional.py", line 3029, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)

Expected Behavior

No response

Steps To Reproduce

Could someone experienced please take a look and tell me what this problem is?

Environment

- OS:
- Python:3.8
- Transformers:
- PyTorch:1.18
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :

Anything else?

No response

laiqinghan commented 1 year ago

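Did you manage to solve this? I'm running into the same problem. It looks like the length of the predicted sequence doesn't match the number of tokens in the target. If so, could we pad the predicted tokens and then compute the cross-entropy?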

zhaoqf123 commented 1 year ago

This happens because the cross-entropy (CE) loss is being computed on logits and labels whose lengths do not match. It can be traced back to Seq2SeqTrainer().predict.
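To make the mismatch concrete, here is a rough sketch of the shape arithmetic behind the error message. It only mirrors the shift-and-flatten step visible in the traceback (`shift_logits.view(-1, ...)` and `shift_labels.view(-1)`) in plain Python. The batch size of 8 and the lengths 64 and 18 are hypothetical values picked only because they reproduce 504 and 136; the real shapes in the original run are not known.

```python
def flattened_rows(batch_size: int, seq_len: int) -> int:
    """Rows left after dropping one position per sequence (the shift) and flattening."""
    return batch_size * (seq_len - 1)


def check_loss_shapes(batch_size: int, input_len: int, label_len: int) -> int:
    """Mimic the row-count check that cross-entropy applies to logits vs. targets."""
    n_logits = flattened_rows(batch_size, input_len)
    n_labels = flattened_rows(batch_size, label_len)
    if n_logits != n_labels:
        raise ValueError(
            f"Expected input batch_size ({n_logits}) to match "
            f"target batch_size ({n_labels})."
        )
    return n_logits


# Hypothetical shapes: 8 sequences, inputs of length 64, labels of length 18.
try:
    check_loss_shapes(batch_size=8, input_len=64, label_len=18)
except ValueError as err:
    print(err)
```

So the "batch_size" in the message is really the number of flattened token positions, which is why it can differ between input and target even when the true batch size is the same.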

In your case, the traceback goes from trainer_seq2seq.py into python3.8/site-packages/transformers-4.28.0.dev0-py3.8.egg/transformers/trainer.py, which implies that you import Seq2SeqTrainer from the local file trainer_seq2seq.py, but it invokes the prediction_step method from the official HF script transformers/trainer.py.
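If you want to check this on your own setup, a small diagnostic like the one below may help. It is only a sketch using the standard library: `method_provider` and `describe_provider` are hypothetical helper names, not part of transformers or ChatGLM. You would call `describe_provider(Seq2SeqTrainer, "prediction_step")` with whatever class your script actually imports.

```python
import inspect
from typing import Optional


def method_provider(cls: type, method_name: str) -> Optional[type]:
    """Return the first class in the MRO that defines method_name itself."""
    for klass in cls.__mro__:
        if method_name in vars(klass):
            return klass
    return None


def describe_provider(cls: type, method_name: str) -> str:
    """Report which class and which source file supply the method."""
    provider = method_provider(cls, method_name)
    if provider is None:
        return f"{cls.__name__} has no method {method_name}"
    try:
        path = inspect.getsourcefile(provider)
    except (TypeError, OSError):
        path = None  # built-in classes or classes without a source file
    return f"{method_name} comes from {provider.__module__}.{provider.__qualname__} ({path})"
```

If the reported file is under site-packages rather than your ptuning folder, the official implementation is the one being used.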

I think the ChatGLM model is not fully compatible with the design of the official HF Seq2Seq models, which is why ChatGLM provides the customized scripts ptuning/trainer_seq2seq.py and ptuning/trainer.py to override the official implementation.

In conclusion, put both trainer.py and trainer_seq2seq.py in your ptuning folder, and the error should go away. If you have already done so and still get the error, you may have forgotten to set predict_with_generate=True in train_args.

P.S. ChatGLM's customized scripts solve the problem by directly setting loss=None in the prediction_step method.
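As a toy illustration of that idea (not the actual ChatGLM or HF code), the sketch below shows a subclass that skips the loss and returns None when generating, and falls back to the loss-computing parent otherwise. The class names `LossTrainer` and `GenerateTrainer`, the list-based "tensors", and the `generate` stub are all hypothetical stand-ins.

```python
from typing import List, Optional, Tuple

Step = Tuple[Optional[float], List[int], List[int]]


class LossTrainer:
    """Stand-in for a trainer whose prediction step always computes a loss."""

    def prediction_step(self, input_ids: List[int], labels: List[int]) -> Step:
        if len(input_ids) != len(labels):
            raise ValueError(
                f"Expected input batch_size ({len(input_ids)}) to match "
                f"target batch_size ({len(labels)})."
            )
        loss = 0.0  # placeholder value; a real trainer would compute cross-entropy
        return loss, input_ids, labels


class GenerateTrainer(LossTrainer):
    """Stand-in for a customized trainer that generates instead of computing a loss."""

    def __init__(self, predict_with_generate: bool) -> None:
        self.predict_with_generate = predict_with_generate

    def generate(self, input_ids: List[int]) -> List[int]:
        # Hypothetical stub: a real model would produce new tokens here.
        return list(input_ids) + [0]

    def prediction_step(self, input_ids: List[int], labels: List[int]) -> Step:
        if not self.predict_with_generate:
            # Without generation, fall back to the loss-computing parent.
            return super().prediction_step(input_ids, labels)
        generated = self.generate(input_ids)
        return None, generated, labels  # loss is skipped, so lengths may differ
```

With predict_with_generate enabled the length mismatch never reaches the loss function, which is why missing that flag can bring the error back.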
