Closed qute012 closed 3 years ago
Hi @qute012
The tie_weights method ties all the weights of the encoder and decoder, including the embeddings. For this to work, the encoder and decoder need to be the same model (same class), i.e. either BERT2BERT or ROBERTA2ROBERTA, and the same size.
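At its core, tying means both modules point at one and the same parameter tensor. A minimal sketch with toy `nn.Embedding` modules (not the actual transformers classes) showing what the shared setup looks like:

```python
import torch
import torch.nn as nn

# Toy encoder/decoder embeddings with the same vocab size and hidden
# size, mimicking the BERT2BERT case where tying is possible.
vocab_size, hidden = 100, 16
encoder_emb = nn.Embedding(vocab_size, hidden)
decoder_emb = nn.Embedding(vocab_size, hidden)

# Tying: the decoder's weight is set to the encoder's Parameter object,
# so there is only one embedding matrix in memory.
decoder_emb.weight = encoder_emb.weight

assert decoder_emb.weight is encoder_emb.weight
```

Because it is the same object, any update to one side is immediately visible on the other; this is also why tying only works when the shapes (and hence vocabularies) match.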
If I want to use different tokenizers for the encoder and decoder inputs, does the tie function skip sharing the embedding, like a strict option?
No, it does not ignore sharing the embedding in that case because, as I wrote above, it expects the encoder and decoder to be the same model, and so implicitly assumes that the tokenizer will also be the same.
But if that's what you want to do, you could manually untie the embeddings, or just re-initialize both of them so they won't be shared/tied.
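Manually untying comes down to replacing the shared parameter on one side with a fresh one. A minimal sketch with toy `nn.Embedding` modules (hypothetical names, not the real `EncoderDecoderModel` attributes):

```python
import torch
import torch.nn as nn

vocab_size, hidden = 100, 16
encoder_emb = nn.Embedding(vocab_size, hidden)
decoder_emb = nn.Embedding(vocab_size, hidden)
decoder_emb.weight = encoder_emb.weight  # tied, as after tie_weights

# Untie by giving the decoder its own fresh Parameter. Here we copy the
# current values; for a different decoder vocabulary you would instead
# create a randomly initialized embedding of the right size.
decoder_emb.weight = nn.Parameter(encoder_emb.weight.detach().clone())

assert decoder_emb.weight is not encoder_emb.weight
```

After this, gradient updates to one embedding no longer affect the other.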
Thanks for the reply @patil-suraj
For example, is it right that the encoder's embedding weights can be updated by the decoder's input?
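When the embeddings are tied, this does happen: a loss computed only from decoder-side input ids still produces gradients on the one shared matrix the encoder also uses. A quick check with a toy shared embedding:

```python
import torch
import torch.nn as nn

vocab_size, hidden = 50, 8
shared = nn.Embedding(vocab_size, hidden)  # the single tied matrix

# A loss computed purely from "decoder" input ids...
decoder_input_ids = torch.tensor([1, 2, 3])
loss = shared(decoder_input_ids).sum()
loss.backward()

# ...still leaves gradients on the shared weight, so an optimizer step
# would change the embedding the encoder sees as well.
assert shared.weight.grad is not None
```

Only the rows for the ids that were actually used receive nonzero gradient, but since the matrix is shared, the encoder sees those updated rows on its next forward pass.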
Then, if I want to untie, should I remove or comment out the code below manually? https://github.com/huggingface/transformers/blob/52166f672ed337934d90cc525c226d93209e0923/src/transformers/models/encoder_decoder/modeling_encoder_decoder.py#L183
Do you have any plan to add a parameter to the EncoderDecoderModel class for choosing whether to tie the weights? I know bert2bert performs better than a randomly initialized decoder embedding, but it requires extending the class for each new experiment when the encoder and decoder use different vocabularies. If that's okay, I will open a PR.
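The idea behind such a parameter can be sketched in a few lines. This is a standalone toy illustration, not the real `EncoderDecoderModel`; the `tie_encoder_decoder` name and the list stand-ins for weight matrices are assumptions made for the sketch:

```python
# Hypothetical sketch: a constructor flag that decides whether
# tie_weights is called, so a decoder with a different vocabulary can
# keep its own (possibly differently sized) embedding.

class ToyEncoderDecoder:
    def __init__(self, tie_encoder_decoder=True):
        self.encoder_emb = [0.0] * 4   # stand-in for the encoder embedding matrix
        self.decoder_emb = [1.0] * 4   # stand-in for the decoder embedding matrix
        if tie_encoder_decoder:
            self.tie_weights()

    def tie_weights(self):
        # Share the same object, as the real tying does with Parameters.
        self.decoder_emb = self.encoder_emb


tied = ToyEncoderDecoder(tie_encoder_decoder=True)
untied = ToyEncoderDecoder(tie_encoder_decoder=False)
```

With the flag set to False, the embeddings remain independent objects and each side can be sized to its own tokenizer's vocabulary.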
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
First, thank you for the great work!
If I want to use different tokenizers for the encoder and decoder inputs, does the tie function skip sharing the embedding, like a strict option?