huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Unable to load mT5 with MT5Model.from_pretrained("google/mt5-small") #14609

Closed. Xuanfang1121 closed this issue 2 years ago.

Xuanfang1121 commented 2 years ago

Environment info

Who can help

@LysandreJik, @patrickvonplaten, @patil-suraj

Information

Model I am using: mT5. Code:

from transformers import MT5Model
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("google/mt5-small")
model = MT5Model.from_pretrained("google/mt5-small")

When I run the above code, the following warning appears:

Some weights of the model checkpoint at google/mt5-small were not used when initializing MT5Model: ['lm_head.weight']

https://huggingface.co/docs/transformers/model_doc/mt5

How do I load the mt5-base (or mt5-small) model correctly? Thanks.

patil-suraj commented 2 years ago

Hi @Xuanfang1121!

You are loading MT5Model, which is the base model without the lm_head, so lm_head.weight is ignored when the pre-trained weights are loaded. The message is expected for this class and does not mean the load failed.

If you want to load the model for conditional generation, for either training or inference, you should use the MT5ForConditionalGeneration class, which has the lm_head.
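A minimal sketch of the difference between the two classes. To avoid downloading the checkpoint, it builds both from a tiny, randomly initialized MT5Config; the config values and variable names are illustrative assumptions, not the settings of google/mt5-small.

```python
from transformers import MT5Config, MT5ForConditionalGeneration, MT5Model

# Hypothetical tiny config (random weights, no download); these sizes are
# illustrative only and do not match google/mt5-small.
tiny_config = MT5Config(
    vocab_size=128,
    d_model=16,
    d_kv=4,
    d_ff=32,
    num_layers=1,
    num_heads=2,
)

# Base encoder-decoder stack: it returns hidden states and has no LM head.
base_model = MT5Model(tiny_config)

# Same stack plus an lm_head that projects hidden states to the vocabulary.
generation_model = MT5ForConditionalGeneration(tiny_config)

# Check whether each model has a parameter named "lm_head.weight".
base_has_lm_head = "lm_head.weight" in base_model.state_dict()
generation_has_lm_head = "lm_head.weight" in generation_model.state_dict()

print("MT5Model has lm_head.weight:", base_has_lm_head)
print("MT5ForConditionalGeneration has lm_head.weight:", generation_has_lm_head)

# With the real checkpoint, the equivalent load would be:
# model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")
```

Because the checkpoint stores lm_head.weight and MT5Model has no parameter of that name, from_pretrained reports the weight as unused when loading into MT5Model.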

Xuanfang1121 commented 2 years ago

OK, thanks.