huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Strange start token in MT5 generation #9792

Closed tomdzh closed 3 years ago

tomdzh commented 3 years ago

Environment info

Who can help

Text Generation: @patrickvonplaten, @TevenLeScao
T5: @patrickvonplaten

Information

Model I am using (Bert, XLNet ...): MT5

To reproduce

Steps to reproduce the behavior:

from transformers import MT5ForConditionalGeneration, MT5Tokenizer

# Load the pretrained mT5 checkpoint and its tokenizer.
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")

text = 'summarize: Bidirectional Encoder Representations from Transformers is a Transformer-based machine learning technique for natural language processing pre-training developed by Google'
inputs = tokenizer([text], max_length=512, truncation=True, return_tensors='pt')

# Generate and decode, dropping special tokens from the output.
summary_ids = model.generate(inputs['input_ids'])
print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=True) for g in summary_ids])

The output I got is ['<extra_id_0>.']

Expected behavior

I tried a few input texts. The generated output always starts with <extra_id_0>, which doesn't happen with T5 generation. Does anyone know how to solve it?
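For reference, decoding without skip_special_tokens makes the stray token easier to inspect (a minimal sketch; the example output in the comment is illustrative, not a verified result):

# Decode the generated ids without skipping special tokens to see the raw output.
raw = tokenizer.decode(summary_ids[0], skip_special_tokens=False)
print(raw)  # e.g. '<pad> <extra_id_0>.</s>' -- illustrative; the exact string may vary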

tomdzh commented 3 years ago

One more thing: this behavior still persists after I fine-tuned the model on my own dataset.
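As a stopgap, the sentinel can be stripped from the decoded text in post-processing (a minimal sketch, assuming the sentinel survives decoding as the literal string <extra_id_N>):

import re

decoded = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
# Remove any leftover sentinel tokens such as <extra_id_0> from the decoded text.
cleaned = re.sub(r"<extra_id_\d+>", "", decoded).strip()
print(cleaned)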

patil-suraj commented 3 years ago

Hi @tomdzh

First of all, unlike the original T5, mT5 is not pre-trained on any supervised downstream task (like summarization, translation, etc.), so generation won't work without fine-tuning it.

Also, it would be hard to answer why it's happening in a fine-tuned model without looking at any code.
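For context, fine-tuning mT5 on a summarization pair follows the usual seq2seq recipe (a minimal sketch with a toy source/target pair; the target text, step count, and learning rate are placeholders, not recommendations):

import torch
from transformers import MT5ForConditionalGeneration, MT5Tokenizer

model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")

# Toy example pair; replace with a real summarization dataset.
src = "summarize: Bidirectional Encoder Representations from Transformers is a Transformer-based machine learning technique for natural language processing pre-training developed by Google"
tgt = "BERT is a Transformer-based NLP pre-training technique developed by Google."

inputs = tokenizer([src], max_length=512, truncation=True, return_tensors="pt")
labels = tokenizer([tgt], max_length=64, truncation=True, return_tensors="pt").input_ids
labels[labels == tokenizer.pad_token_id] = -100  # ignore padding positions in the loss

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model.train()
for _ in range(3):  # a few steps just to illustrate the training loop
    loss = model(input_ids=inputs.input_ids,
                 attention_mask=inputs.attention_mask,
                 labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()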

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale and closed because it has not had recent activity. Thank you for your contributions.

If you think this still needs to be addressed please comment on this thread.