Closed: soda-lsq closed this issue 3 years ago
You can initialize an EncoderDecoderModel with any autoencoding text encoder and any autoregressive text decoder. These can be randomly initialized, or you can start from pre-trained checkpoints.
So yes, it's totally possible to instantiate an EncoderDecoderModel with a randomly initialized BERT and a pre-trained GPT-2 model, like so:
from transformers import EncoderDecoderModel, BertConfig, BertModel, GPT2LMHeadModel

# randomly initialized BERT encoder (default BertConfig)
encoder_config = BertConfig()
encoder = BertModel(encoder_config)

# pre-trained GPT-2 decoder; is_decoder=True and add_cross_attention=True add the
# cross-attention layers the decoder needs to attend to the encoder's hidden states
decoder = GPT2LMHeadModel.from_pretrained("gpt2", is_decoder=True, add_cross_attention=True)

model = EncoderDecoderModel(encoder=encoder, decoder=decoder)
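To actually run this model on a (source, target) pair, a minimal sketch could look like the following. The tokenizer checkpoints and the example strings are just assumptions for illustration; decoder_start_token_id and pad_token_id need to be set so the model can build decoder inputs from the labels (recent transformers versions do this automatically, older ones require passing decoder_input_ids yourself):

from transformers import BertTokenizerFast, GPT2TokenizerFast

enc_tok = BertTokenizerFast.from_pretrained("bert-base-uncased")  # vocab matches the default BertConfig
dec_tok = GPT2TokenizerFast.from_pretrained("gpt2")
dec_tok.pad_token = dec_tok.eos_token  # GPT-2 has no pad token by default

# tell the model how to start and pad decoder sequences
model.config.decoder_start_token_id = dec_tok.bos_token_id
model.config.pad_token_id = dec_tok.pad_token_id

src = enc_tok("an example source sequence", return_tensors="pt")
labels = dec_tok("an example target sentence", return_tensors="pt").input_ids

outputs = model(input_ids=src.input_ids, attention_mask=src.attention_mask, labels=labels)
print(outputs.loss)  # only meaningful once the model has been trained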
I see! Thank you so much for your kind reply! It means a lot to me!
Hi,
I would like to train an Rnd2GPT model, whose encoder is a randomly initialized transformer encoder and whose decoder uses the pre-trained GPT-2 model. I found that Hugging Face's EncoderDecoderModel can implement architectures such as "Bert2Bert" and "Bert2GPT". However, my source input is not a sentence that can be represented directly by a BERT model; it may be better to encode it with a randomly initialized transformer encoder.
So, I would like to know how I can build an Rnd2GPT model with the Hugging Face EncoderDecoderModel (one possible sketch follows below).
Very grateful for your help! Thanks!
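Since the source input here is not a sentence, one possible sketch, assuming the source can be expressed as a sequence of fixed-size feature vectors: project the features to the encoder's hidden size and feed them as inputs_embeds, so the randomly initialized encoder never needs a text tokenizer. The feature dimensions and the projection layer below are hypothetical, not part of the library:

import torch
import torch.nn as nn
from transformers import EncoderDecoderModel, BertConfig, BertModel, GPT2LMHeadModel, GPT2TokenizerFast

# randomly initialized encoder, pre-trained GPT-2 decoder (as in the reply above)
encoder_config = BertConfig()
encoder = BertModel(encoder_config)
decoder = GPT2LMHeadModel.from_pretrained("gpt2", is_decoder=True, add_cross_attention=True)
model = EncoderDecoderModel(encoder=encoder, decoder=decoder)

dec_tok = GPT2TokenizerFast.from_pretrained("gpt2")
model.config.decoder_start_token_id = dec_tok.bos_token_id
model.config.pad_token_id = dec_tok.eos_token_id

# hypothetical source: 1 example, 50 positions, each a 17-dim feature vector
features = torch.randn(1, 50, 17)
to_hidden = nn.Linear(17, encoder_config.hidden_size)  # hypothetical projection, trained with the model
inputs_embeds = to_hidden(features)

labels = dec_tok("an example target sentence", return_tensors="pt").input_ids
outputs = model(inputs_embeds=inputs_embeds, labels=labels)
print(outputs.loss)

The projection layer would be trained jointly with the rest of the model. If the source instead already comes as token ids over some custom vocabulary, a BertConfig with the matching vocab_size and plain input_ids would work in place of the inputs_embeds route.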