chrisdoyleIE commented 4 years ago

🐛 Bug

I have pre-trained GPT2 on a summarisation dataset so that summarisation is framed as a language modelling task, i.e. input = concat(padded_to_max_len(body, "TL;DR:", summary)).

For some reason, the error below occurs when I try to generate via beam search using GPT2 with a language modelling head. Here is my code:

from transformers import AutoTokenizer, GPT2LMHeadModel
import torch

# define tokenizer
tokenizer_kwargs = {"bos_token": "<|startoftext|>", "eos_token": "<|endoftext|>", "pad_token": "<|pad|>"}
tokenizer = AutoTokenizer.from_pretrained("gpt2", **tokenizer_kwargs)

# define model
model = GPT2LMHeadModel.from_pretrained("gpt2", pad_token_id=tokenizer.eos_token_id)

# define input 
input_ids = torch.tensor(tokenizer.encode("some text that ends in TL;DR:")).unsqueeze(0)

# attempt to generate
y_pred_tensor = model.generate(
    input_ids=input_ids,
    num_beams=5,
    early_stopping=True,
    no_repeat_ngram_size=2,
    max_length=100,
)

File "/Users/christopherdoyle/cp_projects/scribbl-ai/Scribbl/Scribbl/summarizer/models.py", line 147, in summarize
    y_pred_tensor = self.model.generate(input_ids=input_tensor,
  File "/Users/christopherdoyle/cp_projects/scribbl-ai/Scribbl/venv/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/Users/christopherdoyle/cp_projects/scribbl-ai/Scribbl/venv/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1100, in generate
    output = self._generate_beam_search(
  File "/Users/christopherdoyle/cp_projects/scribbl-ai/Scribbl/venv/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1499, in _generate_beam_search
    (token_id % vocab_size).item() is not eos_token_id for token_id in next_tokens[batch_idx]
UnboundLocalError: local variable 'next_tokens' referenced before assignment

Model I am using: GPT2LMHeadModel

Language I am using the model on (English, Chinese ...): en

Rather strangely, it works when max_length = 1024, but not with smaller values.

To reproduce

Running on CPU python==3.8, transformers==2.11.0, torch==1.5.0

LysandreJik commented 4 years ago

@patrickvonplaten, do you want to take a look?

patrickvonplaten commented 4 years ago

Hey @chrisdoyleIE ,

Using your code example above (I corrected some typos and missing imports), I am not able to reproduce the error. If a longer input is needed to trigger it, please provide all the code necessary to reproduce it. Ideally, I should be able to copy-paste the code into a console and get the same error as you :-)

chrisdoyleIE commented 4 years ago

Hey @patrickvonplaten ,

I'll do some digging and see if I can reproduce it in a form that's easy to paste, and then I'll share that code. (I currently have a few custom packages calling each other, which is hairier than I'd like and not trivial to insert into an issue.)

MathewPerez commented 4 years ago

I found a similar error when doing summarization, and just wanted to follow up on this. I have been stuck on this for a little bit now and I just wanted to check if there was a simple user-end solution to this, maybe incorrect arguments, etc. This is a simplified notebook with the error: https://colab.research.google.com/drive/1Fj74x2NDJbhsty-oXzfhOCO185T80zw3?usp=sharing

patrickvonplaten commented 4 years ago

Thanks for the notebook @MathewPerez! Will check it now.

patrickvonplaten commented 4 years ago

Awesome, I can reproduce the error - will look at a fix now :-)

patrickvonplaten commented 4 years ago

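The problem is that max_length is not greater than cur_len (the length of the input), so the generation loop never runs and the model does not produce any text. That is also why next_tokens is never assigned. Setting max_length above the input length will fix the problem: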

outputs = model.generate(input_ids=input_ids, num_beams=3, max_length=75)
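To illustrate why a too-small max_length surfaces as an UnboundLocalError, here is a minimal sketch in plain Python. It is not the transformers implementation: toy_beam_search is a stand-in that only mimics the "loop while cur_len < max_length, then read next_tokens" shape implied by the traceback, and max_length_for is a hypothetical helper (not part of the library) that picks a max_length leaving room for new tokens. The default model_limit of 1024 is assumed to match GPT2's context size.

```python
def toy_beam_search(cur_len, max_length):
    """Toy stand-in for the generation loop (not the library's code)."""
    while cur_len < max_length:
        next_tokens = [0]  # placeholder for the tokens picked at this step
        cur_len += 1
    # Like the failing line in the traceback, next_tokens is read after the
    # loop. If the loop body never ran, the name was never assigned.
    return next_tokens


def max_length_for(prompt_len, new_tokens, model_limit=1024):
    """Hypothetical helper: a max_length that leaves room for new tokens.

    prompt_len  -- number of tokens in the input (input_ids.shape[-1])
    new_tokens  -- how many tokens we would like the model to generate
    model_limit -- assumed context size of the model (1024 for GPT2)
    """
    if prompt_len >= model_limit:
        raise ValueError("prompt already fills the model's context window")
    return min(prompt_len + new_tokens, model_limit)


if __name__ == "__main__":
    try:
        # Prompt of 120 tokens with max_length=100: the loop is skipped.
        toy_beam_search(cur_len=120, max_length=100)
    except UnboundLocalError as err:
        print("reproduced:", type(err).__name__)

    # With room for new tokens the loop runs and the function returns.
    print(toy_beam_search(cur_len=120, max_length=max_length_for(120, 30)))
```

With the real model, the same idea would be to pass something like max_length=max_length_for(input_ids.shape[-1], 30) to model.generate, so that max_length always counts the prompt tokens plus the tokens to be generated.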

huggingface / transformers

UnboundLocalError: local variable 'next_tokens' referenced before assignment when using Generate() #5118
