Closed Xuanfang1121 closed 2 years ago
Hi @Xuanfang1121!
You are loading MT5Model, which loads the base model without the lm_head, so when the pre-trained weights are loaded, lm_head.weight is ignored.
If you want to load the model for conditional generation training or inference, you should use the MT5ForConditionalGeneration class, which has the lm_head.
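To illustrate the difference between the two classes, here is a minimal sketch. It builds both from a tiny, randomly initialised MT5Config so the check runs without downloading anything; the config sizes and the helper name load_for_generation are made up for illustration and are not part of the original report.

```python
from transformers import MT5Config, MT5ForConditionalGeneration, MT5Model

# Tiny config with arbitrary illustrative sizes (not the real mt5-small sizes),
# so both models can be built locally with random weights.
tiny_config = MT5Config(
    vocab_size=128,
    d_model=16,
    d_kv=4,
    d_ff=32,
    num_layers=1,
    num_decoder_layers=1,
    num_heads=2,
)

# Base encoder-decoder: returns hidden states and has no language-modeling head.
base_model = MT5Model(tiny_config)

# Same encoder-decoder plus an lm_head projecting hidden states to the vocabulary.
generation_model = MT5ForConditionalGeneration(tiny_config)

base_has_lm_head = hasattr(base_model, "lm_head")
generation_has_lm_head = hasattr(generation_model, "lm_head")


def load_for_generation(checkpoint="google/mt5-small"):
    """Hypothetical helper: load a pre-trained checkpoint with its lm_head.

    Calling this downloads the weights from the Hugging Face Hub.
    """
    return MT5ForConditionalGeneration.from_pretrained(checkpoint)
```

If only encoder and decoder hidden states are needed, MT5Model is still a valid choice and the message about the unused lm_head.weight can be ignored.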
OK, thanks.
Environment info
transformers version: 4.2.2
Who can help
@LysandreJik, @patrickvonplaten, @patil-suraj
Information
Model I am using: mt5
Code:
from transformers import MT5Model
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("google/mt5-small")
model = MT5Model.from_pretrained("google/mt5-small")
When I use the above code, the following warning appears:
Some weights of the model checkpoint at google/mt5-small were not used when initializing MT5Model: ['lm_head.weight']
https://huggingface.co/docs/transformers/model_doc/mt5
How do I load the mt5-base (or mt5-small) model correctly? Thanks.