Hey @jerry3chen,
Can you post a fully reproducible code snippet so that I can take a look? :-)
Hi @patrickvonplaten,
I will post some more detailed code. But this is a downstream task, so it is probably not practical to include all of the code; I will just post the parts that involve the T5 model.
Here is where I initialized the T5 model:
enc2 = MT5ForConditionalGeneration.from_pretrained('google/mt5-small')
Then it is passed to a bigger model:
model = Gat2Seq(enc,enc2,vocab.word2id('<pad>'),vocab.word2id('</s>'))
class Gat2Seq(nn.Module):
    def __init__(self, encoder, encoder2, pad_idx, eos_idx, teacher_forcing=0.5):
        super().__init__()
        self.encoder = encoder
        self.encoder2 = encoder2
During training, I have:
context = self.encoder(graph, art_lengths)
outputs = self.encoder2(inputs_embeds=context, attention_mask=input_mask, labels=padded_labels)
Here context has shape [8, 50, 512] and comes from the previous encoder (8 is the batch size, 50 is the maximum sentence length, 512 is the model's default embedding size). padded_labels has shape [8, 20] (8 is the batch size, 20 is the maximum target sequence length); it is a batch of target-sentence token_ids that I want the model to generate. I want the T5 model to treat context as embedded tokens and do its own encoding/decoding for text generation.
The training step works fine and I am able to see a reasonable decrease in outputs.loss.
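For reference, here is a minimal self-contained sketch of that training call, with random tensors standing in for my graph encoder's output and the label batch (shapes as described above); it only illustrates how encoder2 is called, not the real pipeline:
import torch
from transformers import MT5ForConditionalGeneration

# Sketch only: random tensors stand in for the graph-encoder output and the target ids.
model = MT5ForConditionalGeneration.from_pretrained('google/mt5-small')
context = torch.randn(8, 50, 512)                                    # [batch, src_len, d_model]
input_mask = torch.ones(8, 50, dtype=torch.long)                     # attention mask over the 50 positions
padded_labels = torch.randint(0, model.config.vocab_size, (8, 20))   # [batch, tgt_len] target token_ids
outputs = model(inputs_embeds=context, attention_mask=input_mask, labels=padded_labels)
print(outputs.loss)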
Finally, once I have some trained models, I run this line to generate text:
outputs = self.encoder2.generate(input_ids=None, inputs_embeds=context, attention_mask=input_mask, bos_token_id=0, pad_token_id=0, eos_token_id=1)
Here context is exactly the same as the one used in training.
However, I get the following error when the program hits the generation line:
File "pred.py", line 452, in
main() File "pred.py", line 448, in main setup_predicting(model, data_loader, hps, vocab, f.split('/')[-1] + '_model_output.txt') File "pred.py", line 64, in setup_predicting run_predicting(model, data_loader, hps, vocab, save_f) File "pred.py", line 118, in run_predicting raise e File "pred.py", line 106, in run_predicting outputs = model.forward(G,lengths,labels,predicting=True) # [n_snodes, 2] File "/scratch/jerryc/jerryc/gat2seq/HeterSumGraph-master-mod-att-TV-char/HiGraphMod.py", line 470, in forward outputs = self.encoder2.generate(input_ids=None, inputs_embeds=context, attention_mask=input_mask, bos_token_id=0, pad_token_id=0, eos_token_id=1) File "/scratch/jerryc/jerryc/venv_py3.7/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context return func(*args, *kwargs) File "/scratch/jerryc/jerryc/venv_py3.7/lib/python3.7/site-packages/transformers/generation_utils.py", line 913, in generate input_ids, decoder_start_token_id=decoder_start_token_id, bos_token_id=bos_token_id File "/scratch/jerryc/jerryc/venv_py3.7/lib/python3.7/site-packages/transformers/generation_utils.py", line 422, in _prepare_decoder_input_ids_for_generation torch.ones((input_ids.shape[0], 1), dtype=torch.long, device=input_ids.device) decoder_start_token_id AttributeError: 'NoneType' object has no attribute 'shape'
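From the traceback, the failure seems to come from generate() trying to build decoder_input_ids from input_ids, which is None when only inputs_embeds is given. A minimal illustration of just that step (my own paraphrase of the failing line shown above, not the actual library code):
import torch

input_ids = None            # what generate() sees when only inputs_embeds is passed
decoder_start_token_id = 0
try:
    decoder_input_ids = (
        torch.ones((input_ids.shape[0], 1), dtype=torch.long) * decoder_start_token_id
    )
except AttributeError as err:
    print(err)              # 'NoneType' object has no attribute 'shape'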
Hope this is enough for you to diagnose the issue. Thanks, Jerry
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hello, I am facing the same problem. Could you give me any suggestions?
Hey @jerry3chen, @yuto3o,
Could you please provide a complete, but minimal reproducible code snippet, so that I can easily reproduce the bug?
Small, non-executable code snippets are not enough to efficiently debug the problem.
Thanks!
@patrickvonplaten @yuto3o @jerry3chen
Hello, I also face the same problem.
However, I found that the error doesn't occur if I pass decoder_input_ids consisting of pad_token_id to generate.
The minimal reproducible code snippets are as follows.
My environment
transformers 4.12.0
torch 1.8.0
Reproducible code for the error
from transformers import (
    T5ForConditionalGeneration,
    T5Tokenizer,
)
model = T5ForConditionalGeneration.from_pretrained("sonoisa/t5-base-japanese")
tokenizer = T5Tokenizer.from_pretrained("sonoisa/t5-base-japanese", is_fast=True)
# the example sentence is "It's sunny today" in English
tokenized_inputs = tokenizer(["今日は良い天気です"], return_tensors='pt')
# create input embedding instead of passing input_ids
inputs_embeds = model.get_input_embeddings()(tokenized_inputs["input_ids"])
output_ids = model.generate(
    inputs_embeds=inputs_embeds,
    attention_mask=tokenized_inputs["attention_mask"]
)
AttributeError Traceback (most recent call last)
<ipython-input-...> in <module>
      1 inputs_embeds = model.get_input_embeddings()(tokenized_inputs["input_ids"])
----> 2 output_ids = model.generate(
      3     inputs_embeds=inputs_embeds,
      4     attention_mask=tokenized_inputs["attention_mask"]
      5 )

~/anaconda3/envs/aitd/lib/python3.8/site-packages/torch/autograd/grad_mode.py in decorate_context(*args, **kwargs)
     25     def decorate_context(*args, **kwargs):
     26         with self.__class__():
---> 27             return func(*args, **kwargs)
     28         return cast(F, decorate_context)
     29

~/anaconda3/envs/aitd/lib/python3.8/site-packages/transformers/generation_utils.py in generate(self, input_ids, max_length, min_length, do_sample, early_stopping, num_beams, temperature, top_k, top_p, repetition_penalty, bad_words_ids, bos_token_id, pad_token_id, eos_token_id, length_penalty, no_repeat_ngram_size, encoder_no_repeat_ngram_size, num_return_sequences, max_time, max_new_tokens, decoder_start_token_id, use_cache, num_beam_groups, diversity_penalty, prefix_allowed_tokens_fn, output_attentions, output_hidden_states, output_scores, return_dict_in_generate, forced_bos_token_id, forced_eos_token_id, remove_invalid_values, synced_gpus, **model_kwargs)
    911             input_ids = model_kwargs.pop("decoder_input_ids")
    912         else:
--> 913             input_ids = self._prepare_decoder_input_ids_for_generation(
    914                 input_ids, decoder_start_token_id=decoder_start_token_id, bos_token_id=bos_token_id
    915             )

~/anaconda3/envs/aitd/lib/python3.8/site-packages/transformers/generation_utils.py in _prepare_decoder_input_ids_for_generation(self, input_ids, decoder_start_token_id, bos_token_id)
    422         decoder_start_token_id = self._get_decoder_start_token_id(decoder_start_token_id, bos_token_id)
    423         decoder_input_ids = (
--> 424             torch.ones((input_ids.shape[0], 1), dtype=torch.long, device=input_ids.device) * decoder_start_token_id
    425         )
    426         return decoder_input_ids

AttributeError: 'NoneType' object has no attribute 'shape'
How to fix it
import torch
from transformers import (
    T5ForConditionalGeneration,
    T5Tokenizer,
)
model = T5ForConditionalGeneration.from_pretrained("sonoisa/t5-base-japanese")
tokenizer = T5Tokenizer.from_pretrained("sonoisa/t5-base-japanese", is_fast=True)
tokenized_inputs = tokenizer(["今日は良い天気です"], return_tensors='pt') # It's sunny today
inputs_embeds = model.get_input_embeddings()(tokenized_inputs["input_ids"])
# NOTE: pad_token_id is used as decoder_start_token_id
dummy_decoder_input_ids = torch.tensor([[tokenizer.pad_token_id]])
output_ids = model.generate(
    inputs_embeds=inputs_embeds,
    attention_mask=tokenized_inputs["attention_mask"],
    decoder_input_ids=dummy_decoder_input_ids
)
output_ids
tensor([[ 0, 32099, 876, 4, 5, 2262, 32098, 876, 4, 2262, 1]])
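To read the generated ids as text, they can be decoded with the same tokenizer, for example:
# Decode the generated ids; skip_special_tokens drops <pad>, </s>, etc.
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True))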
When I pass input_ids to generate, I get the same result as when I pass inputs_embeds with the dummy decoder_input_ids above.
from transformers import (
    T5ForConditionalGeneration,
    T5Tokenizer,
)
model = T5ForConditionalGeneration.from_pretrained("sonoisa/t5-base-japanese")
tokenizer = T5Tokenizer.from_pretrained("sonoisa/t5-base-japanese", is_fast=True)
tokenized_inputs = tokenizer(["今日は良い天気です"], return_tensors='pt') # It's sunny today
output_ids = model.generate(
    input_ids=tokenized_inputs["input_ids"],
    attention_mask=tokenized_inputs["attention_mask"]
)
output_ids
tensor([[ 0, 32099, 876, 4, 5, 2262, 32098, 876, 4, 2262, 1]])
@ichiroex,
Thanks for the nicely reproducible code snippet - this is indeed a bug and should be fixed.
PR to fix this: #14443
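Once that fix is in a released version of transformers, a call like the one in the snippet above should work without the dummy decoder_input_ids, e.g. (untested sketch):
# Sketch, assuming a transformers version that includes the fix linked above:
output_ids = model.generate(
    inputs_embeds=inputs_embeds,
    attention_mask=tokenized_inputs["attention_mask"]
)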
@patrickvonplaten Thank you!!
Hi there,
I trained an MT5ForConditionalGeneration model. During training, I used my own embeddings for encoding (but the default embeddings for decoding). However, when I try to generate output using the generate function, I get an error. I will post the relevant code below:
Here is the code for model training:
outputs = self.encoder2(inputs_embeds=context, attention_mask=input_mask, labels=padded_labels)
Here context plays the role of a batch of token_ids, except that the entries are embeddings. The labels are target-sequence token_ids. Training works fine without any issues. And here is the line where I try to generate with the model:
outputs = self.encoder2.generate(input_ids=None, inputs_embeds=context, attention_mask=input_mask, bos_token_id=0, pad_token_id=0, eos_token_id=1)
Once the program hits the above line, I get the AttributeError shown in the traceback earlier in this thread ('NoneType' object has no attribute 'shape').
It seems the model is not handling this case properly. Any help would be appreciated. Thanks