Open Muennighoff opened 2 weeks ago
Just realized I get the warning below with Salesforce/blip-image-captioning-large. I think I already ran results for it, but they're probably random in that case; maybe someone could check the results.
Running evaluation for Salesforce/blip-image-captioning-large
Some weights of BlipForImageTextRetrieval were not initialized from the model checkpoint at Salesforce/blip-image-captioning-large and are newly initialized: ['itm_head.bias', 'itm_head.weight', 'text_encoder.embeddings.LayerNorm.bias', 'text_encoder.embeddings.LayerNorm.weight', 'text_encoder.embeddings.position_embeddings.weight', 'text_encoder.embeddings.word_embeddings.weight', 'text_encoder.encoder.layer.0.attention.output.LayerNorm.bias', 'text_encoder.encoder.layer.0.attention.output.LayerNorm.weight', 'text_encoder.encoder.layer.0.attention.output.dense.bias', 'text_encoder.encoder.layer.0.attention.output.dense.weight', 'text_encoder.encoder.layer.0.attention.self.key.bias', 'text_encoder.encoder.layer.0.attention.self.key.weight', 'text_encoder.encoder.layer.0.attention.self.query.bias', 'text_encoder.encoder.layer.0.attention.self.query.weight', 'text_encoder.encoder.layer.0.attention.self.value.bias', 'text_encoder.encoder.layer.0.attention.self.value.weight', 'text_encoder.encoder.layer.0.crossattention.output.LayerNorm.bias', 'text_encoder.encoder.layer.0.crossattention.output.LayerNorm.weight', 'text_encoder.encoder.layer.0.crossattention.output.dense.bias', 'text_encoder.encoder.layer.0.crossattention.output.dense.weight', 'text_encoder.encoder.layer.0.crossattention.self.key.bias', 'text_encoder.encoder.layer.0.crossattention.self.key.weight', 'text_encoder.encoder.layer.0.crossattention.self.query.bias', 'text_encoder.encoder.layer.0.crossattention.self.query.weight', 'text_encoder.encoder.layer.0.crossattention.self.value.bias', 'text_encoder.encoder.layer.0.crossattention.self.value.weight', 'text_encoder.encoder.layer.0.intermediate.dense.bias', 'text_encoder.encoder.layer.0.intermediate.dense.weight', 'text_encoder.encoder.layer.0.output.LayerNorm.bias', 'text_encoder.encoder.layer.0.output.LayerNorm.weight', 'text_encoder.encoder.layer.0.output.dense.bias', 'text_encoder.encoder.layer.0.output.dense.weight', 'text_encoder.encoder.layer.1.attention.output.LayerNorm.bias', 'text_encoder.encoder.layer.1.attention.output.LayerNorm.weight', 'text_encoder.encoder.layer.1.attention.output.dense.bias', 'text_encoder.encoder.layer.1.attention.output.dense.weight', 'text_encoder.encoder.layer.1.attention.self.key.bias', 'text_encoder.encoder.layer.1.attention.self.key.weight', 'text_encoder.encoder.layer.1.attention.self.query.bias', 'text_encoder.encoder.layer.1.attention.self.query.weight', 'text_encoder.encoder.layer.1.attention.self.value.bias', 'text_encoder.encoder.layer.1.attention.self.value.weight', 'text_encoder.encoder.layer.1.crossattention.output.LayerNorm.bias', 'text_encoder.encoder.layer.1.crossattention.output.LayerNorm.weight', 'text_encoder.encoder.layer.1.crossattention.output.dense.bias', 'text_encoder.encoder.layer.1.crossattention.output.dense.weight', 'text_encoder.encoder.layer.1.crossattention.self.key.bias', 'text_encoder.encoder.layer.1.crossattention.self.key.weight', 'text_encoder.encoder.layer.1.crossattention.self.query.bias', 'text_encoder.encoder.layer.1.crossattention.self.query.weight', 'text_encoder.encoder.layer.1.crossattention.self.value.bias', 'text_encoder.encoder.layer.1.crossattention.self.value.weight', 'text_encoder.encoder.layer.1.intermediate.dense.bias', 'text_encoder.encoder.layer.1.intermediate.dense.weight', 'text_encoder.encoder.layer.1.output.LayerNorm.bias', 'text_encoder.encoder.layer.1.output.LayerNorm.weight', 'text_encoder.encoder.layer.1.output.dense.bias', 'text_encoder.encoder.layer.1.output.dense.weight', 'text_encoder.encoder.layer.10.attention.output.LayerNorm.bias', 'text_encoder.encoder.layer.10.attention.output.LayerNorm.weight', 'text_encoder.encoder.layer.10.attention.output.dense.bias', 'text_encoder.encoder.layer.10.attention.output.dense.weight', 'text_encoder.encoder.layer.10.attention.self.key.bias', 'text_encoder.encoder.layer.10.attention.self.key.weight', 'text_encoder.encoder.layer.10.attention.self.query.bias', 'text_encoder.encoder.layer.10.attention.self.query.weig$
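To see at a glance how much of the model the warning covers, the listed parameter names can be grouped by their top-level module. The snippet below is only a rough sketch: `summarize_missing_keys` is a hypothetical helper (not part of transformers), the excerpt is a handful of names copied from the log above rather than the full list, and it assumes the terminal's line-wrap spaces have already been removed from the names.

```python
import re
from collections import Counter


def summarize_missing_keys(warning_text: str) -> Counter:
    """Count the quoted parameter names in the warning by top-level module.

    Hypothetical helper for illustration; it only parses text and does not
    load or inspect any model.
    """
    # Parameter names appear as single-quoted dotted identifiers.
    keys = re.findall(r"'([A-Za-z0-9_.]+)'", warning_text)
    # The part before the first dot is the top-level submodule.
    return Counter(key.split(".", 1)[0] for key in keys)


# A few names copied from the warning above (the log itself is truncated).
excerpt = (
    "newly initialized: ['itm_head.bias', 'itm_head.weight', "
    "'text_encoder.embeddings.LayerNorm.bias', "
    "'text_encoder.embeddings.LayerNorm.weight', "
    "'text_encoder.encoder.layer.0.attention.self.key.bias']"
)

summary = summarize_missing_keys(excerpt)
for module, count in summary.items():
    print(f"{module}: {count} uninitialized parameter(s)")
```

In the part of the log that is visible, every name falls under either `itm_head` or `text_encoder`, which is consistent with the worry that the retrieval scores from this checkpoint are effectively random.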
@Jamie-Stirling