huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

[Bart] when output_past=False BartForConditionalGeneration raises confusing error #3508

Closed manishiitg closed 4 years ago

manishiitg commented 4 years ago

🐛 Bug

Information

I am using BART.

Language I am using the model on (English, Chinese ...): English

The problem arises when using:

The task I am working on is:

Summarization

To reproduce

Steps to reproduce the behavior:

from transformers import BartTokenizer, BartForConditionalGeneration

torch_device = 'cpu'
# LONG_BORING_TENNIS_ARTICLE is a long article string defined elsewhere (not reproduced here).

tokenizer = BartTokenizer.from_pretrained('bart-large-mnli')
model = BartForConditionalGeneration.from_pretrained('bart-large-mnli')

article_input_ids = tokenizer.batch_encode_plus([LONG_BORING_TENNIS_ARTICLE], return_tensors='pt', max_length=1024)['input_ids'].to(torch_device)
summary_ids = model.generate(article_input_ids,
                             num_beams=4,
                             length_penalty=2.0,
                             max_length=100,
                             early_stopping=True)

print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])

I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-27-2df1c6607426> in <module>()
      4                              length_penalty=2.0,
      5                              max_length=100,
----> 6                              early_stopping=True)
      7 
      8 print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])

3 frames
/usr/local/lib/python3.6/dist-packages/transformers/modeling_bart.py in _reorder_cache(past, beam_idx)
    921     @staticmethod
    922     def _reorder_cache(past, beam_idx):
--> 923         ((enc_out, enc_mask), decoder_cached_states) = past
    924         reordered_past = []
    925         for layer_past in decoder_cached_states:

ValueError: too many values to unpack (expected 2)
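(The message itself is plain Python tuple unpacking: `past` has more elements than the two names on the left-hand side of the assignment expect, as in this standalone illustration.)

# Standalone illustration, not transformers code: unpacking a tuple with more
# than two elements into two names raises exactly this ValueError.
past = ('enc_out', 'enc_mask', 'decoder_cached_states')  # three items
(encoder_state, decoder_cached_states) = past  # ValueError: too many values to unpack (expected 2)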

This works with bart-large-cnn but raises this error with other models. Why is that?

andr-ec commented 4 years ago

@sshleifer I'm seeing this as well. It doesn't happen if num_beams=1. Might have to do with the recent generation and bart changes. Only started happening in the last week or so.
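For anyone blocked in the meantime, a minimal sketch of that greedy-decoding workaround, reusing model, tokenizer and article_input_ids from the report above (num_beams=1 never reaches _reorder_cache, at the cost of beam-search quality):

# Greedy decoding workaround sketch: num_beams=1 skips beam-search cache
# reordering, so _reorder_cache is never called.
summary_ids = model.generate(article_input_ids,
                             num_beams=1,
                             max_length=100)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))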

sshleifer commented 4 years ago

Thanks for contributing!

A few thoughts:

  1. If you pass output_past=True to BartForConditionalGeneration.from_pretrained, the code works.
  2. We only expect 'bart-large-xsum' and 'bart-large-cnn' to generate high quality summaries.
  3. The error message/traceback should be improved. Feel free to send a PR if you'd like (a rough sketch of a clearer check follows this list).
  4. Thanks for copy-pasting usable code, it made this really easy to verify :) I added "```python" at the beginning to prettify.
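To make point 3 concrete, here is a hypothetical shape of such a check (an illustration only, not an actual PR; the helper name and wording are made up):

# Hypothetical helper illustrating a clearer failure mode for point 3 above.
# It only validates the structure _reorder_cache expects before unpacking.
def check_past_for_beam_search(past):
    if past is None or len(past) != 2:
        raise ValueError(
            "`past` is not the ((enc_out, enc_mask), decoder_cached_states) pair that "
            "beam search reorders; load BartForConditionalGeneration with output_past=True "
            "so decoder states are cached and returned."
        )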

Working example

# Copy-paste LONG_BORING_TENNIS_ARTICLE (a long article string) above this block.
from transformers import BartTokenizer, BartForConditionalGeneration

model_name = 'bart-large-mnli'
torch_device = 'cpu'
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name, output_past=True)
article_input_ids = tokenizer.batch_encode_plus([LONG_BORING_TENNIS_ARTICLE], return_tensors='pt', max_length=1024)['input_ids'].to(torch_device)
summary_ids = model.generate(article_input_ids,
                             num_beams=4,
                             length_penalty=2.0,
                             max_length=100,
                             early_stopping=True)

print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])

Javier-Jimenez99 commented 3 years ago

This wasn't solved. I am using a Trainer with BART and have tried use_cache, but it still doesn't work.
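For readers on newer transformers releases, a minimal sketch of the equivalent call (assuming a version in which output_past was replaced by use_cache and the BART checkpoints moved under the facebook/ namespace; adjust to your setup):

from transformers import BartTokenizer, BartForConditionalGeneration

# Sketch for newer transformers releases: `use_cache` replaces `output_past`
# and checkpoints are namespaced under `facebook/`.
tokenizer = BartTokenizer.from_pretrained('facebook/bart-large-cnn')
model = BartForConditionalGeneration.from_pretrained('facebook/bart-large-cnn')

ARTICLE = 'Replace this with the long article you want to summarize.'
input_ids = tokenizer(ARTICLE, return_tensors='pt', max_length=1024, truncation=True).input_ids

summary_ids = model.generate(input_ids,
                             num_beams=4,
                             length_penalty=2.0,
                             max_length=100,
                             early_stopping=True,
                             use_cache=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))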