mrwu-mac / DIFNet

This repository is for the paper ``DIFNet: Boosting Visual Information Flow for Image Captioning'' (CVPR 2022).
BSD 3-Clause "New" or "Revised" License

about pretrained model #1

Closed: wanboyang closed this issue 2 years ago

wanboyang commented 2 years ago

When I load the pretrained model DIFNet.pth, I get the following error: RuntimeError: Error(s) in loading state_dict for Difnet_LRP: Missing key(s) in state_dict: "decoder.layers.0.pwff.layer_norm1.weight", "decoder.layers.0.pwff.layer_norm1.bias", "decoder.layers.1.pwff.layer_norm1.weight", "decoder.layers.1.pwff.layer_norm1.bias", "decoder.layers.2.pwff.layer_norm1.weight", "decoder.layers.2.pwff.layer_norm1.bias". There seems to be a mismatch between the Difnet_LRP architecture code and the pretrained model weights.
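For anyone hitting a similar error, here is a minimal sketch of how the mismatch could be inspected by comparing key names. The helper `diff_state_dict_keys` is hypothetical (it is not part of this repository), and the key lists are illustrative: the `layer_norm1` names are taken from the error message, while the `layer_norm` key is made up for the example.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, using PyTorch's terminology.

    missing:    keys the model expects but the checkpoint does not contain
    unexpected: keys the checkpoint contains but the model does not expect
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Illustrative key names only. The layer_norm1 keys appear in the error
# message; the plain layer_norm key is a made-up placeholder.
model_keys = [
    "decoder.layers.0.pwff.layer_norm.weight",   # hypothetical
    "decoder.layers.0.pwff.layer_norm1.weight",
    "decoder.layers.0.pwff.layer_norm1.bias",
]
checkpoint_keys = [
    "decoder.layers.0.pwff.layer_norm.weight",   # hypothetical
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

With PyTorch, the two inputs would typically be `model.state_dict().keys()` and the keys of the dictionary returned by `torch.load(..., map_location="cpu")`, assuming the checkpoint stores the state dict directly rather than nested under another key. Calling `load_state_dict(..., strict=False)` also reports missing and unexpected keys, but it leaves the missing layers at their initial values, so it is a diagnostic aid rather than a fix.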

wanboyang commented 2 years ago

[Image: WeChat image 20220620142529]

I also retrained Difnet_LRP with the default settings, and the performance is lower than that reported in the paper.

mrwu-mac commented 2 years ago

We are sorry about this; we have updated the code. Note that difnet_lrp uses the same decoder as base_lrp.

zhangxuying1004 commented 2 years ago

Our code has been updated. The test results of our pre-trained difnet_lrp are as follows:

[Image: test results]

wanboyang commented 2 years ago

Thanks for replying