huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Warm-starting encoder-decoder models with EncoderDecoderModel always yields an empty string after fine-tuning #29824

Closed · kawsar-pie closed this 5 months ago

kawsar-pie commented 7 months ago

System Info

Hi @ArthurZucker and @patrickvonplaten, I am trying to train a seq2seq model using the EncoderDecoderModel class and found this blog very helpful. Thanks to @patrickvonplaten for the clear explanation. Following the blog, I fine-tuned a seq2seq model that uses BanglaBERT (an ELECTRA-style model) as the encoder and XGLM as the decoder, trained on the BanglaParaphrase dataset. However, after fine-tuning, the model always generates an empty string. I do not understand where the problem is. Please help me find the bug in my code.

Thanks.

Who can help?

@ArthurZucker @patrickvonplaten

Information

Tasks

Reproduction

Here is my notebook

Expected behavior

Actual output from my code: {'target': 'সিপিও আহত থাকায় যুদ্ধ পরিচালনার দায়িত্ব এসে পড়েছিল সেম্প্রোনিয়াসের কাঁধে।', 'pred_target': ''}

The expected output is a Bangla paraphrase of the input sentence, something like: {'target': 'সিপিও আহত থাকায় যুদ্ধ পরিচালনার দায়িত্ব এসে পড়েছিল সেম্প্রোনিয়াসের কাঁধে।', 'pred_target': 'সিপিও কর্তৃক আহত হয়ে সেমপ্রোনিয়াসের কাঁধে যুদ্ধ পরিচালনার দায়িত্ব আসে।'}

amyeroberts commented 6 months ago

Hi @kawsar-pie, thanks for raising an issue!

This is a question best placed in our forums. We try to reserve the GitHub issues for feature requests and bug reports.

kawsar-pie commented 6 months ago

Thank you @amyeroberts for your comment. I created a new topic in the forum for this issue and hope to find a solution there.

github-actions[bot] commented 5 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.