huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Encoder Decoder Model #8261

Closed arditobryan closed 4 years ago

arditobryan commented 4 years ago

Hi,

I am following the instructions written on the HuggingFace website to use an encoder-decoder model:

from transformers import EncoderDecoderModel, BertTokenizer
import torch

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = EncoderDecoderModel.from_encoder_decoder_pretrained('bert-base-uncased', 'bert-base-uncased') # initialize Bert2Bert from pre-trained checkpoints

#model.save_pretrained('/content/drive/My Drive/NLP/'+'model_1')

# forward
input_ids = torch.tensor(tokenizer.encode("Hello, my dog is cute", add_special_tokens=True)).unsqueeze(0)  # Batch size 1
outputs = model(input_ids=input_ids, decoder_input_ids=input_ids)

# training
outputs = model(input_ids=input_ids, decoder_input_ids=input_ids, labels=input_ids, return_dict=True)
#print(type(outputs)) #Seq2SeqLMOutput
loss, logits = outputs.loss, outputs.logits

# save and load from pretrained
#model.save_pretrained("bert2bert")
#model = EncoderDecoderModel.from_pretrained("bert2bert")

# generation
generated = model.generate(input_ids, decoder_start_token_id=model.config.decoder.pad_token_id)

generated
tensor([[   0, 1012, 1010, 1010, 1010, 1010, 1010, 1010, 1010, 1010, 1010, 1010,
         1010, 1010, 1010, 1010, 1010, 1010, 1010, 1010]])

However, I have no idea how to decode the generated output. Can anybody please help? Thank you!

arditobryan commented 4 years ago

Maybe I found it out myself. Is it:

# decode each generated sequence back to text, skipping special tokens like [PAD]
for i, sample_output in enumerate(generated):
  print("{}: {}".format(i, tokenizer.decode(sample_output, skip_special_tokens=True)))

?

patrickvonplaten commented 4 years ago

You can also make use of tokenizer.batch_decode(...), which decodes a whole batch of sequences in one call.
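For example, a minimal sketch applied to the generated tensor from the snippet above (assuming the tokenizer and generated variables are still in scope):

# returns a list of strings, one per row of `generated`
decoded = tokenizer.batch_decode(generated, skip_special_tokens=True)
for i, text in enumerate(decoded):
    print("{}: {}".format(i, text))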