huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Can't load mt5 model after resizing token embedding #9055

Closed: alecoutre1 closed this issue 3 years ago

alecoutre1 commented 3 years ago

Environment info

Description

I am having trouble reloading a saved mt5 model after its token embeddings have been resized. The error does not occur with the t5 model. I get the following error:

Error(s) in loading state_dict for MT5ForConditionalGeneration: size mismatch for lm_head.weight: copying a param with shape torch.Size([250112, 768]) from checkpoint, the shape in current model is torch.Size([250102, 768]).

Is there something different between the two models that I am missing?
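
One difference that may matter here (an assumption on my part, not something confirmed in this thread): google/mt5-base is configured with untied input and output embeddings, whereas t5-base ties lm_head to the shared embedding matrix, so MT5 has a separate lm_head weight that also needs to be resized. The two configs can be compared like this:

from transformers import AutoConfig

print(AutoConfig.from_pretrained("t5-base").tie_word_embeddings)          # True  -> lm_head shares the embedding weights
print(AutoConfig.from_pretrained("google/mt5-base").tie_word_embeddings)  # False -> lm_head is a separate weight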

To reproduce:

from transformers import MT5ForConditionalGeneration, AutoTokenizer, T5ForConditionalGeneration

model_class = MT5ForConditionalGeneration #T5ForConditionalGeneration
model_path = "google/mt5-base" # "t5-base"

model = model_class.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

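# add two new tokens and resize the model's embeddings to the new vocabulary size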
tokenizer.add_tokens(['<tok1>', '<tok2>'])
model.resize_token_embeddings(len(tokenizer))

SAVING_PATH = "/tmp/test_model"

model.save_pretrained(SAVING_PATH)
tokenizer.save_pretrained(SAVING_PATH)

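# reloading the saved model is where the size mismatch on lm_head.weight is raised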
new_model = model_class.from_pretrained(SAVING_PATH)
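
A minimal diagnostic sketch (run right after the resize_token_embeddings call above; the expected shapes are taken from the error message) showing that on affected versions the input embeddings pick up the new vocabulary size while the separate lm_head does not:

print(model.get_input_embeddings().weight.shape)  # torch.Size([250102, 768]) after the resize
print(model.lm_head.weight.shape)                 # torch.Size([250112, 768]) on affected versions: lm_head not resized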
patrickvonplaten commented 3 years ago

Hey @alecoutre1, I think this was fixed very recently.

I cannot reproduce your error on master -> could you try to pip install the master version and see if the error persists?

pip install git+https://github.com/huggingface/transformers
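
After installing from source, you can confirm which version is active; a build from the master branch typically carries a .dev0 suffix:

import transformers
print(transformers.__version__)  # e.g. a version ending in .dev0 when installed from master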
github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.