huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

[Bart] when output_past=False BartForConditionalGeneration raises confusing error #3508

Closed manishiitg closed 4 years ago

manishiitg commented 4 years ago

🐛 Bug

Information

I am using BART.

Language I am using the model on (English, Chinese ...): English

The problem arises when using:

The task I am working on is:

Summarization

To reproduce

Steps to reproduce the behavior:

from transformers import BartTokenizer, BartForConditionalGeneration

torch_device = 'cpu'
# LONG_BORING_TENNIS_ARTICLE is a long article string defined elsewhere (not reproduced here).

tokenizer = BartTokenizer.from_pretrained('bart-large-mnli')
model = BartForConditionalGeneration.from_pretrained('bart-large-mnli')

article_input_ids = tokenizer.batch_encode_plus([LONG_BORING_TENNIS_ARTICLE], return_tensors='pt', max_length=1024)['input_ids'].to(torch_device)
summary_ids = model.generate(article_input_ids,
                             num_beams=4,
                             length_penalty=2.0,
                             max_length=100,
                             early_stopping=True)

print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])

I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-27-2df1c6607426> in <module>()
      4                              length_penalty=2.0,
      5                              max_length=100,
----> 6                              early_stopping=True)
      7 
      8 print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])

3 frames
/usr/local/lib/python3.6/dist-packages/transformers/modeling_bart.py in _reorder_cache(past, beam_idx)
    921     @staticmethod
    922     def _reorder_cache(past, beam_idx):
--> 923         ((enc_out, enc_mask), decoder_cached_states) = past
    924         reordered_past = []
    925         for layer_past in decoder_cached_states:

ValueError: too many values to unpack (expected 2)
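(The message itself is plain Python tuple unpacking: `past` has more elements than the two names on the left-hand side of the assignment expect, as in this standalone illustration.)

# Standalone illustration, not transformers code: unpacking a tuple with more
# than two elements into two names raises exactly this ValueError.
past = ('enc_out', 'enc_mask', 'decoder_cached_states')  # three items
(encoder_state, decoder_cached_states) = past  # ValueError: too many values to unpack (expected 2)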

This works with bart-large-cnn but raises this error with other models. Why is that?

andr-ec commented 4 years ago

@sshleifer I'm seeing this as well. It doesn't happen if num_beams=1. Might have to do with the recent generation and bart changes. Only started happening in the last week or so.
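For anyone blocked in the meantime, a minimal sketch of that greedy-decoding workaround, reusing model, tokenizer and article_input_ids from the report above (num_beams=1 never reaches _reorder_cache, at the cost of beam-search quality):

# Greedy decoding workaround sketch: num_beams=1 skips beam-search cache
# reordering, so _reorder_cache is never called.
summary_ids = model.generate(article_input_ids,
                             num_beams=1,
                             max_length=100)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))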

sshleifer commented 4 years ago

Thanks for contributing!

A few thoughts:

  1. If you pass output_past=True to BartForConditionalGeneration.from_pretrained, the code works.
  2. We only expect 'bart-large-xsum' and 'bart-large-cnn' to generate high quality summaries.
  3. The error message/traceback should be improved. Feel free to send a PR if you'd like (a rough sketch of a clearer check follows this list).
  4. Thanks for copy-pasting usable code, it made this really easy to verify :) I added "```python" at the beginning to prettify.
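To make point 3 concrete, here is a hypothetical shape of such a check (an illustration only, not an actual PR; the helper name and wording are made up):

# Hypothetical helper illustrating a clearer failure mode for point 3 above.
# It only validates the structure _reorder_cache expects before unpacking.
def check_past_for_beam_search(past):
    if past is None or len(past) != 2:
        raise ValueError(
            "`past` is not the ((enc_out, enc_mask), decoder_cached_states) pair that "
            "beam search reorders; load BartForConditionalGeneration with output_past=True "
            "so decoder states are cached and returned."
        )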

Working example

# Copy-paste LONG_BORING_TENNIS_ARTICLE (a long article string) above this block.
from transformers import BartTokenizer, BartForConditionalGeneration

model_name = 'bart-large-mnli'
torch_device = 'cpu'
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name, output_past=True)
article_input_ids = tokenizer.batch_encode_plus([LONG_BORING_TENNIS_ARTICLE], return_tensors='pt', max_length=1024)['input_ids'].to(torch_device)
summary_ids = model.generate(article_input_ids,
                             num_beams=4,
                             length_penalty=2.0,
                             max_length=100,
                             early_stopping=True)

print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])

Javier-Jimenez99 commented 3 years ago

This wasn't solved. I am using a Trainer with BART and have tried use_cache, but it still doesn't work.
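For readers on newer transformers releases, a minimal sketch of the equivalent call (assuming a version in which output_past was replaced by use_cache and the BART checkpoints moved under the facebook/ namespace; adjust to your setup):

from transformers import BartTokenizer, BartForConditionalGeneration

# Sketch for newer transformers releases: `use_cache` replaces `output_past`
# and checkpoints are namespaced under `facebook/`.
tokenizer = BartTokenizer.from_pretrained('facebook/bart-large-cnn')
model = BartForConditionalGeneration.from_pretrained('facebook/bart-large-cnn')

ARTICLE = 'Replace this with the long article you want to summarize.'
input_ids = tokenizer(ARTICLE, return_tensors='pt', max_length=1024, truncation=True).input_ids

summary_ids = model.generate(input_ids,
                             num_beams=4,
                             length_penalty=2.0,
                             max_length=100,
                             early_stopping=True,
                             use_cache=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))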