Open wuhuqifeixzy opened 1 month ago
@wuhuqifeixzy Please properly initialize the corresponding large language model. BITA uses the "facebook/opt-2.7b" model. Please check if the value for opt_model in the "bita_caption_opt2.7b.yaml" file is set to: "facebook/opt-2.7b".
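For reference, a minimal sketch of what that section of `bita_caption_opt2.7b.yaml` should look like (the surrounding keys are taken from the snippet below; only the `opt_model` value is confirmed by this comment):

```yaml
model:
  arch: caption_rsicd_opt2.7b
  # The OPT backbone must match the one the checkpoint was trained with;
  # BITA uses opt-2.7b, so a smaller/default model causes shape mismatches.
  opt_model: "facebook/opt-2.7b"
```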
During inference, in the `bita_caption_opt2.7b.yaml` file I set:

```yaml
model:
  arch: caption_rsicd_opt2.7b
  load_finetuned: True
  finetuned: "RSICD\checkpoint_best.pth"
```

but loading `RSICD\checkpoint_best.pth` fails with a model size mismatch:
```
RuntimeError: Error(s) in loading state_dict for Blip2OPT:
	size mismatch for opt_model.lm_head.weight: copying a param with shape torch.Size([50272, 2560]) from checkpoint, the shape in current model is torch.Size([30522, 768]).
	size mismatch for opt_proj.weight: copying a param with shape torch.Size([2560, 768]) from checkpoint, the shape in current model is torch.Size([768, 768]).
	size mismatch for opt_proj.bias: copying a param with shape torch.Size([2560]) from checkpoint, the shape in current model is torch.Size([768]).
```
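To see at a glance which parameters conflict before calling `load_state_dict`, the checkpoint and model shapes can be diffed directly. A minimal sketch, using plain tuples as a stand-in for `torch.Size` so it runs without PyTorch; with real models you would build the dicts via `{k: tuple(v.shape) for k, v in model.state_dict().items()}`:

```python
def shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {key: (checkpoint_shape, model_shape)} for keys present in
    both dicts whose shapes differ (these would raise a size mismatch)."""
    return {
        k: (checkpoint_shapes[k], model_shapes[k])
        for k in checkpoint_shapes
        if k in model_shapes and checkpoint_shapes[k] != model_shapes[k]
    }

# Shapes taken from the error above: the current model's lm_head is
# (30522, 768), which suggests it was not built with opt-2.7b.
ckpt  = {"opt_model.lm_head.weight": (50272, 2560),
         "opt_proj.weight": (2560, 768)}
model = {"opt_model.lm_head.weight": (30522, 768),
         "opt_proj.weight": (768, 768)}
print(shape_mismatches(ckpt, model))
```

If this reports mismatches on the OPT-related keys, the model was initialized with the wrong backbone, and no amount of key filtering at load time will fix inference quality.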
With the workaround below I can load the weights (it keeps only the entries whose shapes match), but the inference results are poor, so I suspect this way of loading the weights is wrong:
```python
# Keep only checkpoint entries whose key exists in the model
# and whose tensor shape matches the model's parameter shape.
new_state_dict = {}
for key, value in state_dict.items():
    if key in self.state_dict() and value.size() == self.state_dict()[key].size():
        new_state_dict[key] = value
msg = self.load_state_dict(new_state_dict, strict=False)
```
Could anyone advise how the weight file should be loaded at this step?