tjpulkl / CDGNet

size mismatch during load_state_dict() #4

Open hannwi opened 2 years ago

hannwi commented 2 years ago

Hello, I appreciate your awesome work. I want to try evaluation, but there's an error when calling load_state_dict() in evaluate.py. The error message is below:

size mismatch for layer6.conv2.0.weight: copying a param with shape torch.Size([48, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([48, 256, 3, 3]).

It seems that the dimensions of some layers in the pretrained model 'LIP_epoch_149.pth' differ from those in the model constructed in evaluate.py. Could you check this issue?

Thank you!

tjpulkl commented 2 years ago

I downloaded the project and reran the code, and I did not encounter the issue you mentioned. Please make sure that you create the snapshots folder and put the trained model in it.
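
The layout this seems to describe (the exact position of the snapshots folder relative to evaluate.py is an assumption based on this comment):

CDGNet/
├── snapshots/
│   └── LIP_epoch_149.pth
└── evaluate.py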

hannwi commented 2 years ago

Thanks for your reply! I downloaded the pretrained model from Google Drive and put it in the folder as below: [screenshot, 2022-08-17] Is there any chance that the model on Google Drive is different from the one on Baidu Drive?

tjpulkl commented 2 years ago

Hi, I understand what you have done with the code.
When you train the network, you should load the pretrained model named resnet101-imagenet.pth. That model comes from training on the ImageNet dataset for image classification. When training a network for segmentation, it is common practice to first load a pretrained classification model to improve performance. The final model trained for segmentation on LIP is LIP_epoch_149.pth. All in all, resnet101-imagenet.pth is the pretrained model for training the network, and LIP_epoch_149.pth is our final trained model for evaluation.
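
A minimal sketch of the two load paths described above; the file locations, the strict=False partial load, and the constructor arguments are assumptions rather than the repo's exact code:

import torch

# Res_Deeplab is this repo's network constructor (imported from its model definition).
model = Res_Deeplab(num_classes=20)

# Training: initialize the backbone from the ImageNet classification weights.
# strict=False is assumed here, since the segmentation head has no counterpart
# in the classification checkpoint.
model.load_state_dict(torch.load('resnet101-imagenet.pth', map_location='cpu'), strict=False)

# Evaluation: load the final LIP segmentation weights instead.
model.load_state_dict(torch.load('snapshots/LIP_epoch_149.pth', map_location='cpu'))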

olream commented 1 year ago

When I do evaluation, I run into a similar problem. Here is my code:

from copy import deepcopy

import torch

# Res_Deeplab is the network constructor from this repo's model definition.
model = Res_Deeplab(num_classes=20)
print(f'params: {sum(p.numel() for p in model.parameters() if p.requires_grad)}')

state_dict = model.state_dict().copy()
state_dict_old = torch.load('CDGNet/LIP_epoch_149.pth')

for key, nkey in zip(state_dict_old.keys(), state_dict.keys()):
    if key != nkey:
        # remove the 'module.' prefix from 'key'
        state_dict[key[7:]] = deepcopy(state_dict_old[key])
    else:
        state_dict[key] = deepcopy(state_dict_old[key])

model.load_state_dict(state_dict)

This raises "RuntimeError: OrderedDict mutated during iteration", because the loop inserts the stripped keys into state_dict while zip() is still iterating over its keys.
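
For reference, a minimal non-mutating sketch, assuming the only difference is a leading 'module.' prefix left over from nn.DataParallel: build the remapped dict in one pass instead of inserting keys into a dict that is still being iterated.

from collections import OrderedDict

import torch

state_dict_old = torch.load('CDGNet/LIP_epoch_149.pth', map_location='cpu')
# Strip an optional 'module.' prefix from each key while building a fresh dict.
state_dict_new = OrderedDict(
    (k[len('module.'):] if k.startswith('module.') else k, v)
    for k, v in state_dict_old.items()
)
model.load_state_dict(state_dict_new)  # 'model' is the Res_Deeplab instance from above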

So I modified the code:

import collections
from copy import deepcopy

import torch

model = Res_Deeplab(num_classes=20)
print(f'params: {sum(p.numel() for p in model.parameters() if p.requires_grad)}')

state_dict = model.state_dict().copy()
state_dict_new = collections.OrderedDict()
state_dict_old = torch.load('CDGNet/LIP_epoch_149.pth')

for key, nkey in zip(state_dict_old.keys(), state_dict.keys()):
    if key != nkey:
        # remove the 'module.' prefix from 'key'
        state_dict_new[key[7:]] = deepcopy(state_dict_old[key])
    else:
        state_dict_new[key] = deepcopy(state_dict_old[key])

model.load_state_dict(state_dict_new)

This raises:

RuntimeError: Error(s) in loading state_dict for ResNet:
Missing key(s) in state_dict: "layer5.stages.0.2.0.weight", "layer5.stages.0.2.0.bias", "layer5.stages.0.2.0.running_mean", "layer5.stages.0.2.0.running_var", "layer5.stages.1.2.0.weight", "layer5.stages.1.2.0.bias", "layer5.stages.1.2.0.running_mean", "layer5.stages.1.2.0.running_var", "layer5.stages.2.2.0.weight", "layer5.stages.2.2.0.bias", "layer5.stages.2.2.0.running_mean", "layer5.stages.2.2.0.running_var", "layer5.stages.3.2.0.weight", "layer5.stages.3.2.0.bias", "layer5.stages.3.2.0.running_mean", "layer5.stages.3.2.0.running_var", "layer5.bottleneck.1.0.weight", "layer5.bottleneck.1.0.bias", "layer5.bottleneck.1.0.running_mean", "layer5.bottleneck.1.0.running_var", "edge_layer.conv1.1.0.weight", "edge_layer.conv1.1.0.bias", "edge_layer.conv1.1.0.running_mean", "edge_layer.conv1.1.0.running_var", "edge_layer.conv2.1.0.weight", "edge_layer.conv2.1.0.bias", "edge_layer.conv2.1.0.running_mean", "edge_layer.conv2.1.0.running_var", "edge_layer.conv3.1.0.weight", "edge_layer.conv3.1.0.bias", "edge_layer.conv3.1.0.running_mean", "edge_layer.conv3.1.0.running_var", "layer6.conv1.1.0.weight", "layer6.conv1.1.0.bias", "layer6.conv1.1.0.running_mean", "layer6.conv1.1.0.running_var", "layer6.conv2.1.0.weight", "layer6.conv2.1.0.bias", "layer6.conv2.1.0.running_mean", "layer6.conv2.1.0.running_var", "layer6.conv3.1.0.weight", "layer6.conv3.1.0.bias", "layer6.conv3.1.0.running_mean", "layer6.conv3.1.0.running_var", "layer6.conv3.3.0.weight", "layer6.conv3.3.0.bias", "layer6.conv3.3.0.running_mean", "layer6.conv3.3.0.running_var", "layer6.addCAM.0.weight", "layer6.addCAM.1.0.weight", "layer6.addCAM.1.0.bias", "layer6.addCAM.1.0.running_mean", "layer6.addCAM.1.0.running_var", "layer7.1.0.weight", "layer7.1.0.bias", "layer7.1.0.running_mean", "layer7.1.0.running_var", "sq4.0.weight", "sq4.1.0.weight", "sq4.1.0.bias", "sq4.1.0.running_mean", "sq4.1.0.running_var", "sq5.0.weight", "sq5.1.0.weight", "sq5.1.0.bias", "sq5.1.0.running_mean", "sq5.1.0.running_var", "f9.0.weight", "f9.1.0.weight", "f9.1.0.bias", "f9.1.0.running_mean", "f9.1.0.running_var", "hwAttention.gamma", "hwAttention.beta", "hwAttention.conv_hgt1.0.weight", "hwAttention.conv_hgt1.1.weight", "hwAttention.conv_hgt1.1.bias", "hwAttention.conv_hgt1.1.running_mean", "hwAttention.conv_hgt1.1.running_var", "hwAttention.conv_hgt2.0.weight", "hwAttention.conv_hgt2.1.weight", "hwAttention.conv_hgt2.1.bias", "hwAttention.conv_hgt2.1.running_mean", "hwAttention.conv_hgt2.1.running_var", "hwAttention.conv_hwPred1.0.weight", "hwAttention.conv_hwPred1.0.bias", "hwAttention.conv_hwPred2.0.weight", "hwAttention.conv_hwPred2.0.bias", "hwAttention.conv_upDim1.0.weight", "hwAttention.conv_upDim1.0.bias", "hwAttention.conv_upDim2.0.weight", "hwAttention.conv_upDim2.0.bias", "hwAttention.cmbFea.0.weight", "hwAttention.cmbFea.1.weight", "hwAttention.cmbFea.1.bias", "hwAttention.cmbFea.1.running_mean", "hwAttention.cmbFea.1.running_var", "L.weight", "L.bias".
Unexpected key(s) in state_dict: "layer5.stages.0.2.weight", "layer5.stages.0.2.bias", "layer5.stages.0.2.running_mean", "layer5.stages.0.2.running_var", "layer5.stages.1.2.weight", "layer5.stages.1.2.bias", "layer5.stages.1.2.running_mean", "layer5.stages.1.2.running_var", "layer5.stages.2.2.weight", "layer5.stages.2.2.bias", "layer5.stages.2.2.running_mean", "layer5.stages.2.2.running_var", "layer5.stages.3.2.weight", "layer5.stages.3.2.bias", "layer5.stages.3.2.running_mean", "layer5.stages.3.2.running_var", "layer5.bottleneck.1.weight", "layer5.bottleneck.1.bias", "layer5.bottleneck.1.running_mean", "layer5.bottleneck.1.running_var", "edge_layer.conv1.1.weight", "edge_layer.conv1.1.bias", "edge_layer.conv1.1.running_mean", "edge_layer.conv1.1.running_var", "edge_layer.conv2.1.weight", "edge_layer.conv2.1.bias", "edge_layer.conv2.1.running_mean", "edge_layer.conv2.1.running_var", "edge_layer.conv3.1.weight", "edge_layer.conv3.1.bias", "edge_layer.conv3.1.running_mean", "edge_layer.conv3.1.running_var", "layer6.conv1.1.weight", "layer6.conv1.1.bias", "layer6.conv1.1.running_mean", "layer6.conv1.1.running_var", "layer6.conv2.1.weight", "layer6.conv2.1.bias", "layer6.conv2.1.running_mean", "layer6.conv2.1.running_var", "layer6.conv3.1.weight", "layer6.conv3.1.bias", "layer6.conv3.1.running_mean", "layer6.conv3.1.running_var", "layer6.conv3.3.weight", "layer6.conv3.3.bias", "layer6.conv3.3.running_mean", "layer6.conv3.3.running_var", "layer7.1.weight", "layer7.1.bias", "layer7.1.running_mean", "layer7.1.running_var".
size mismatch for layer6.conv2.0.weight: copying a param with shape torch.Size([48, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([48, 256, 3, 3]).
size mismatch for layer6.conv3.0.weight: copying a param with shape torch.Size([256, 304, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 304, 3, 3]).
size mismatch for layer7.0.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 1024, 3, 3]).
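
Two patterns stand out in that dump. The missing keys differ from the unexpected ones only by an extra ".0" (layer5.stages.0.2.0.weight vs layer5.stages.0.2.weight), which typically means the checkpoint stored BatchNorm as a single fused module (e.g. InPlaceABNSync) while the local model wraps BatchNorm inside a Sequential. And the 1x1-vs-3x3 size mismatches cannot be fixed by renaming at all, so the released checkpoint appears to come from a slightly different model definition. A hypothetical diagnostic sketch to list every divergence in one pass ('model' and the checkpoint path as in the snippets above):

import torch

ckpt = torch.load('CDGNet/LIP_epoch_149.pth', map_location='cpu')
model_sd = model.state_dict()
# Print every parameter whose presence or shape differs between the two.
for k in sorted(set(ckpt) | set(model_sd)):
    ckpt_shape = tuple(ckpt[k].shape) if k in ckpt else None
    model_shape = tuple(model_sd[k].shape) if k in model_sd else None
    if ckpt_shape != model_shape:
        print(f'{k}: checkpoint={ckpt_shape} model={model_shape}')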

birdortyedi commented 1 year ago

Same problem for me.

ashwinvaswani commented 1 year ago

@tjpulkl I am facing the same problem. What could be the reason? Please advise!