salesforce / BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
BSD 3-Clause "New" or "Revised" License

size mismatch for bert.embeddings.word_embeddings.weight #128

Open LianghuiGuo opened 1 year ago

LianghuiGuo commented 1 year ago

Hello, I have trained a BERT model with vocab_size 21128, but I noticed that in BLIP the vocab_size should be 21130 (it includes 2 additional tokens, [DEC] and [ENC]). This difference causes a shape conflict when loading the state dict from my BERT: "size mismatch for bert.embeddings.word_embeddings.weight: copying a param with shape torch.Size([21128, 768]) from checkpoint, the shape in current model is torch.Size([21130, 768])".
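If retraining the checkpoint is not an option, one possible workaround (a minimal sketch, not something from the repo; the checkpoint path is hypothetical) is to pad the checkpoint's word-embedding matrix with two extra rows before loading, so it matches BLIP's 21130-token vocabulary:

```python
import torch

# Hypothetical checkpoint path; adjust to your own BERT weights.
state_dict = torch.load("my_bert.pth", map_location="cpu")

key = "bert.embeddings.word_embeddings.weight"
old_emb = state_dict[key]                 # shape [21128, 768]
extra = 21130 - old_emb.shape[0]          # 2 rows for the [ENC]/[DEC] tokens

if extra > 0:
    # Initialize the new token embeddings from the mean of the existing ones.
    pad = old_emb.mean(dim=0, keepdim=True).repeat(extra, 1)
    state_dict[key] = torch.cat([old_emb, pad], dim=0)

# Then load into the BLIP text model, e.g.:
# missing, unexpected = model.load_state_dict(state_dict, strict=False)
```

Any other weights tied to the vocabulary size (e.g. the decoder's output projection) would need the same padding.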

futureisatyourhand commented 8 months ago


Make sure the vocab_size in your med_config.json is 2 larger than the vocab_size in your bert_config.json. With that change I was able to train the model.
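A quick sanity check along those lines (a sketch assuming the config files sit under configs/, as in the repo's defaults):

```python
import json

# Assumed paths; adjust if your configs live elsewhere.
with open("configs/med_config.json") as f:
    med_vocab = json.load(f)["vocab_size"]
with open("configs/bert_config.json") as f:
    bert_vocab = json.load(f)["vocab_size"]

# BLIP adds two special tokens ([ENC] and [DEC]) on top of the BERT vocab.
assert med_vocab == bert_vocab + 2, (med_vocab, bert_vocab)
```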