huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Encoder Decoder Model didn't return a reasonable result #10831

Closed. C7ABT closed this issue 3 years ago.

C7ABT commented 3 years ago

Hello, I tried the example code from the official website, as shown below.

code

```python
from transformers import EncoderDecoderModel, BertTokenizer
import torch

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = EncoderDecoderModel.from_encoder_decoder_pretrained('bert-base-uncased', 'bert-base-uncased')

input_ids = torch.tensor(tokenizer.encode("Hello, my dog is cute", add_special_tokens=True)).unsqueeze(0)

outputs = model(input_ids=input_ids, decoder_input_ids=input_ids)
outputs = model(input_ids=input_ids, decoder_input_ids=input_ids, labels=input_ids)
loss, logits = outputs.loss, outputs.logits

model.save_pretrained("bert2bert")
model = EncoderDecoderModel.from_pretrained("bert2bert")

generated = model.generate(input_ids, decoder_start_token_id=model.config.decoder.pad_token_id)

for i, sample_output in enumerate(generated):
    print("{}: {}".format(i, tokenizer.decode(sample_output, skip_special_tokens=True)))
```

output

However, it returned the following result.

```
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertLMHeadModel: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
0: as

Process finished with exit code 0
```

issue

Would you kindly help me find out why it returned the word 'as'? Many thanks! Also, as a newbie, would it be possible to use BERT as the encoder and a Transformer decoder in this EncoderDecoderModel? I would be very grateful if you could help me out!

LysandreJik commented 3 years ago

Hi! You're using two bert-base-uncased checkpoints as the encoder and the decoder. This is possible, but you'll need to train the resulting encoder-decoder model on a downstream task in order to obtain coherent results (see the sketches below).
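To make "train on a downstream task" concrete, here is a minimal sketch of a fine-tuning step for the warm-started model. It assumes a recent `transformers` version, where passing `labels` lets the model build `decoder_input_ids` internally by shifting them right; the toy source/target pair and the hyperparameters are purely illustrative, not from this thread:

```python
import torch
from transformers import BertTokenizer, EncoderDecoderModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)

# The warm-started model does not yet know how to start or pad decoding.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# Toy source/target pair standing in for a real downstream dataset.
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
labels = tokenizer("my dog is cute", return_tensors="pt").input_ids

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for step in range(3):  # a real run would iterate over many batches
    loss = model(
        input_ids=inputs.input_ids,
        attention_mask=inputs.attention_mask,
        labels=labels,
    ).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(f"step {step}: loss = {loss.item():.3f}")
```

Only after training over a real paired dataset (e.g. summarization or translation) should you expect `generate` to produce anything other than degenerate output like the single word above.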

The bert-base-uncased checkpoint is originally from an encoder-only setup.
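On the follow-up question about mixing architectures: `from_encoder_decoder_pretrained` accepts two different checkpoints, so you can pair a BERT encoder with a decoder that was pretrained for generation. A minimal sketch (the bert2gpt2 pairing is just an illustration, with the same caveat that the new weights need fine-tuning):

```python
from transformers import EncoderDecoderModel

# GPT-2 is loaded as the decoder: cross-attention layers are newly added
# and randomly initialized, so this model also needs downstream training
# before it can generate coherent text.
bert2gpt2 = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "gpt2"
)
```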

If I may recommend some notebooks/documentation:

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.