Loading State Dict - GitHub Issues

Hello, I am trying to load the state dict provided in the OneDrive link, but loading fails because the given state_dict differs from the one the model expects. Specifically, the res2net101 checkpoint is failing for me.

Can you please show me the exact error? That would make it easier for me to find the problem.
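
A compact summary is often easier to read than the full error text. The following is a minimal sketch, not something from this repository: it assumes you have already reduced both state dicts to plain name-to-shape mappings, and the helper name `diff_state_dicts` is hypothetical. The conv1 and bn1 shapes in the example are taken from the error reported in this thread; the 3x3 shapes for `conv2` and `convs.0` are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def diff_state_dicts(
    model_shapes: Dict[str, Shape],
    ckpt_shapes: Dict[str, Shape],
) -> Tuple[List[str], List[str], Dict[str, Tuple[Shape, Shape]]]:
    """Compare parameter names and shapes of a model and a checkpoint.

    Returns (missing, unexpected, mismatched):
      missing    - keys the model expects but the checkpoint lacks
      unexpected - keys the checkpoint has but the model does not
      mismatched - shared keys whose shapes differ, as (checkpoint, model)
    """
    missing = sorted(k for k in model_shapes if k not in ckpt_shapes)
    unexpected = sorted(k for k in ckpt_shapes if k not in model_shapes)
    mismatched = {
        k: (ckpt_shapes[k], model_shapes[k])
        for k in model_shapes
        if k in ckpt_shapes and ckpt_shapes[k] != model_shapes[k]
    }
    return missing, unexpected, mismatched


# With PyTorch, the two mappings could be built roughly like this
# (assumption: the checkpoint file holds the state dict directly and is
# not wrapped under another key):
#   model_shapes = {k: tuple(v.shape) for k, v in model.state_dict().items()}
#   ckpt = torch.load(path, map_location="cpu")
#   ckpt_shapes = {k: tuple(v.shape) for k, v in ckpt.items()}

# Small excerpt modelled on the error in this thread.
model_shapes = {
    "layer1.0.conv1.weight": (104, 64, 1, 1),
    "layer1.0.bn1.weight": (104,),
    "layer1.0.convs.0.weight": (26, 26, 3, 3),  # illustrative shape
}
ckpt_shapes = {
    "layer1.0.conv1.weight": (64, 64, 1, 1),
    "layer1.0.bn1.weight": (64,),
    "layer1.0.conv2.weight": (64, 64, 3, 3),  # illustrative shape
}

missing, unexpected, mismatched = diff_state_dicts(model_shapes, ckpt_shapes)
print("missing:", missing)
print("unexpected:", unexpected)
for name, (ckpt_shape, model_shape) in mismatched.items():
    print(f"size mismatch for {name}: checkpoint {ckpt_shape}, model {model_shape}")
```

Grouping the result by the last components of each name (for example `convs.0.weight` versus `conv2.weight`) usually makes it clear whether the checkpoint and the model come from the same architecture variant.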

Hi, the error is as follows:

RuntimeError: Error(s) in loading state_dict for Res2Net: Missing key(s) in state_dict: "layer1.0.convs.0.weight", "layer1.0.convs.1.weight", "layer1.0.convs.2.weight", "layer1.0.bns.0.weight", "layer1.0.bns.0.bias", "layer1.0.bns.0.running_mean", "layer1.0.bns.0.running_var", "layer1.0.bns.1.weight", "layer1.0.bns.1.bias", "layer1.0.bns.1.running_mean", "layer1.0.bns.1.running_var", "layer1.0.bns.2.weight", "layer1.0.bns.2.bias", "layer1.0.bns.2.running_mean", "layer1.0.bns.2.running_var", "layer1.1.convs.0.weight", "layer1.1.convs.1.weight", "layer1.1.convs.2.weight", "layer1.1.bns.0.weight", "layer1.1.bns.0.bias", "layer1.1.bns.0.running_mean", "layer1.1.bns.0.running_var", "layer1.1.bns.1.weight", "layer1.1.bns.1.bias", "layer1.1.bns.1.running_mean", "layer1.1.bns.1.running_var", "layer1.1.bns.2.weight", "layer1.1.bns.2.bias", "layer1.1.bns.2.running_mean", "layer1.1.bns.2.running_var", "layer1.2.convs.0.weight", "layer1.2.convs.1.weight", "layer1.2.convs.2.weight", "layer1.2.bns.0.weight", "layer1.2.bns.0.bias", "layer1.2.bns.0.running_mean", "layer1.2.bns.0.running_var", "layer1.2.bns.1.weight", "layer1.2.bns.1.bias", "layer1.2.bns.1.running_mean", "layer1.2.bns.1.running_var", "layer1.2.bns.2.weight", "layer1.2.bns.2.bias", "layer1.2.bns.2.running_mean", "layer1.2.bns.2.running_var", "layer2.0.convs.0.weight", "layer2.0.convs.1.weight", "layer2.0.convs.2.weight", "layer2.0.bns.0.weight", "layer2.0.bns.0.bias", "layer2.0.bns.0.running_mean", "layer2.0.bns.0.running_var", "layer2.0.bns.1.weight", "layer2.0.bns.1.bias", "layer2.0.bns.1.running_mean", "layer2.0.bns.1.running_var", "layer2.0.bns.2.weight", "layer2.0.bns.2.bias", "layer2.0.bns.2.running_mean", "layer2.0.bns.2.running_var", "layer2.1.convs.0.weight", "layer2.1.convs.1.weight", "layer2.1.convs.2.weight", "layer2.1.bns.0.weight", "layer2.1.bns.0.bias", "layer2.1.bns.0.running_mean", "layer2.1.bns.0.running_var", "layer2.1.bns.1.weight", "layer2.1.bns.1.bias", "layer2.1.bns.1.running_mean", 
"layer2.1.bns.1.running_var", "layer2.1.bns.2.weight", "layer2.1.bns.2.bias", "layer2.1.bns.2.running_mean", "layer2.1.bns.2.running_var", "layer2.2.convs.0.weight", "layer2.2.convs.1.weight", "layer2.2.convs.2.weight", "layer2.2.bns.0.weight", "layer2.2.bns.0.bias", "layer2.2.bns.0.running_mean", "layer2.2.bns.0.running_var", "layer2.2.bns.1.weight", "layer2.2.bns.1.bias", "layer2.2.bns.1.running_mean", "layer2.2.bns.1.running_var", "layer2.2.bns.2.weight", "layer2.2.bns.2.bias", "layer2.2.bns.2.running_mean", "layer2.2.bns.2.running_var", "layer2.3.convs.0.weight", "layer2.3.convs.1.weight", "layer2.3.convs.2.weight", "layer2.3.bns.0.weight", "layer2.3.bns.0.bias", "layer2.3.bns.0.running_mean", "layer2.3.bns.0.running_var", "layer2.3.bns.1.weight", "layer2.3.bns.1.bias", "layer2.3.bns.1.running_mean", "layer2.3.bns.1.running_var", "layer2.3.bns.2.weight", "layer2.3.bns.2.bias", "layer2.3.bns.2.running_mean", "layer2.3.bns.2.running_var", "layer3.0.convs.0.weight", "layer3.0.convs.1.weight", "layer3.0.convs.2.weight", "layer3.0.bns.0.weight", "layer3.0.bns.0.bias", "layer3.0.bns.0.running_mean", "layer3.0.bns.0.running_var", "layer3.0.bns.1.weight", "layer3.0.bns.1.bias", "layer3.0.bns.1.running_mean", "layer3.0.bns.1.running_var", "layer3.0.bns.2.weight", "layer3.0.bns.2.bias", "layer3.0.bns.2.running_mean", "layer3.0.bns.2.running_var", "layer3.1.convs.0.weight", "layer3.1.convs.1.weight", "layer3.1.convs.2.weight", "layer3.1.bns.0.weight", "layer3.1.bns.0.bias", "layer3.1.bns.0.running_mean", "layer3.1.bns.0.running_var", "layer3.1.bns.1.weight", "layer3.1.bns.1.bias", "layer3.1.bns.1.running_mean", "layer3.1.bns.1.running_var", "layer3.1.bns.2.weight", "layer3.1.bns.2.bias", "layer3.1.bns.2.running_mean", "layer3.1.bns.2.running_var", "layer3.2.convs.0.weight", "layer3.2.convs.1.weight", "layer3.2.convs.2.weight", "layer3.2.bns.0.weight", "layer3.2.bns.0.bias", "layer3.2.bns.0.running_mean", "layer3.2.bns.0.running_var", "layer3.2.bns.1.weight", 
"layer3.2.bns.1.bias", "layer3.2.bns.1.running_mean", "layer3.2.bns.1.running_var", "layer3.2.bns.2.weight", "layer3.2.bns.2.bias", "layer3.2.bns.2.running_mean", "layer3.2.bns.2.running_var", "layer3.3.convs.0.weight", "layer3.3.convs.1.weight", "layer3.3.convs.2.weight", "layer3.3.bns.0.weight", "layer3.3.bns.0.bias", "layer3.3.bns.0.running_mean", "layer3.3.bns.0.running_var", "layer3.3.bns.1.weight", "layer3.3.bns.1.bias", "layer3.3.bns.1.running_mean", "layer3.3.bns.1.running_var", "layer3.3.bns.2.weight", "layer3.3.bns.2.bias", "layer3.3.bns.2.running_mean", "layer3.3.bns.2.running_var", "layer3.4.convs.0.weight", "layer3.4.convs.1.weight", "layer3.4.convs.2.weight", "layer3.4.bns.0.weight", "layer3.4.bns.0.bias", "layer3.4.bns.0.running_mean", "layer3.4.bns.0.running_var", "layer3.4.bns.1.weight", "layer3.4.bns.1.bias", "layer3.4.bns.1.running_mean", "layer3.4.bns.1.running_var", "layer3.4.bns.2.weight", "layer3.4.bns.2.bias", "layer3.4.bns.2.running_mean", "layer3.4.bns.2.running_var", "layer3.5.convs.0.weight", "layer3.5.convs.1.weight", "layer3.5.convs.2.weight", "layer3.5.bns.0.weight", "layer3.5.bns.0.bias", "layer3.5.bns.0.running_mean", "layer3.5.bns.0.running_var", "layer3.5.bns.1.weight", "layer3.5.bns.1.bias", "layer3.5.bns.1.running_mean", "layer3.5.bns.1.running_var", "layer3.5.bns.2.weight", "layer3.5.bns.2.bias", "layer3.5.bns.2.running_mean", "layer3.5.bns.2.running_var", "layer3.6.convs.0.weight", "layer3.6.convs.1.weight", "layer3.6.convs.2.weight", "layer3.6.bns.0.weight", "layer3.6.bns.0.bias", "layer3.6.bns.0.running_mean", "layer3.6.bns.0.running_var", "layer3.6.bns.1.weight", "layer3.6.bns.1.bias", "layer3.6.bns.1.running_mean", "layer3.6.bns.1.running_var", "layer3.6.bns.2.weight", "layer3.6.bns.2.bias", "layer3.6.bns.2.running_mean", "layer3.6.bns.2.running_var", "layer3.7.convs.0.weight", "layer3.7.convs.1.weight", "layer3.7.convs.2.weight", "layer3.7.bns.0.weight", "layer3.7.bns.0.bias", "layer3.7.bns.0.running_mean", 
"layer3.7.bns.0.running_var", "layer3.7.bns.1.weight", "layer3.7.bns.1.bias", "layer3.7.bns.1.running_mean", "layer3.7.bns.1.running_var", "layer3.7.bns.2.weight", "layer3.7.bns.2.bias", "layer3.7.bns.2.running_mean", "layer3.7.bns.2.running_var", "layer3.8.convs.0.weight", "layer3.8.convs.1.weight", "layer3.8.convs.2.weight", "layer3.8.bns.0.weight", "layer3.8.bns.0.bias", "layer3.8.bns.0.running_mean", "layer3.8.bns.0.running_var", "layer3.8.bns.1.weight", "layer3.8.bns.1.bias", "layer3.8.bns.1.running_mean", "layer3.8.bns.1.running_var", "layer3.8.bns.2.weight", "layer3.8.bns.2.bias", "layer3.8.bns.2.running_mean", "layer3.8.bns.2.running_var", "layer3.9.convs.0.weight", "layer3.9.convs.1.weight", "layer3.9.convs.2.weight", "layer3.9.bns.0.weight", "layer3.9.bns.0.bias", "layer3.9.bns.0.running_mean", "layer3.9.bns.0.running_var", "layer3.9.bns.1.weight", "layer3.9.bns.1.bias", "layer3.9.bns.1.running_mean", "layer3.9.bns.1.running_var", "layer3.9.bns.2.weight", "layer3.9.bns.2.bias", "layer3.9.bns.2.running_mean", "layer3.9.bns.2.running_var", "layer3.10.convs.0.weight", "layer3.10.convs.1.weight", "layer3.10.convs.2.weight", "layer3.10.bns.0.weight", "layer3.10.bns.0.bias", "layer3.10.bns.0.running_mean", "layer3.10.bns.0.running_var", "layer3.10.bns.1.weight", "layer3.10.bns.1.bias", "layer3.10.bns.1.running_mean", "layer3.10.bns.1.running_var", "layer3.10.bns.2.weight", "layer3.10.bns.2.bias", "layer3.10.bns.2.running_mean", "layer3.10.bns.2.running_var", "layer3.11.convs.0.weight", "layer3.11.convs.1.weight", "layer3.11.convs.2.weight", "layer3.11.bns.0.weight", "layer3.11.bns.0.bias", "layer3.11.bns.0.running_mean", "layer3.11.bns.0.running_var", "layer3.11.bns.1.weight", "layer3.11.bns.1.bias", "layer3.11.bns.1.running_mean", "layer3.11.bns.1.running_var", "layer3.11.bns.2.weight", "layer3.11.bns.2.bias", "layer3.11.bns.2.running_mean", "layer3.11.bns.2.running_var", "layer3.12.convs.0.weight", "layer3.12.convs.1.weight", "layer3.12.convs.2.weight", 
"layer3.12.bns.0.weight", "layer3.12.bns.0.bias", "layer3.12.bns.0.running_mean", "layer3.12.bns.0.running_var", "layer3.12.bns.1.weight", "layer3.12.bns.1.bias", "layer3.12.bns.1.running_mean", "layer3.12.bns.1.running_var", "layer3.12.bns.2.weight", "layer3.12.bns.2.bias", "layer3.12.bns.2.running_mean", "layer3.12.bns.2.running_var", "layer3.13.convs.0.weight", "layer3.13.convs.1.weight", "layer3.13.convs.2.weight", "layer3.13.bns.0.weight", "layer3.13.bns.0.bias", "layer3.13.bns.0.running_mean", "layer3.13.bns.0.running_var", "layer3.13.bns.1.weight", "layer3.13.bns.1.bias", "layer3.13.bns.1.running_mean", "layer3.13.bns.1.running_var", "layer3.13.bns.2.weight", "layer3.13.bns.2.bias", "layer3.13.bns.2.running_mean", "layer3.13.bns.2.running_var", "layer3.14.convs.0.weight", "layer3.14.convs.1.weight", "layer3.14.convs.2.weight", "layer3.14.bns.0.weight", "layer3.14.bns.0.bias", "layer3.14.bns.0.running_mean", "layer3.14.bns.0.running_var", "layer3.14.bns.1.weight", "layer3.14.bns.1.bias", "layer3.14.bns.1.running_mean", "layer3.14.bns.1.running_var", "layer3.14.bns.2.weight", "layer3.14.bns.2.bias", "layer3.14.bns.2.running_mean", "layer3.14.bns.2.running_var", "layer3.15.convs.0.weight", "layer3.15.convs.1.weight", "layer3.15.convs.2.weight", "layer3.15.bns.0.weight", "layer3.15.bns.0.bias", "layer3.15.bns.0.running_mean", "layer3.15.bns.0.running_var", "layer3.15.bns.1.weight", "layer3.15.bns.1.bias", "layer3.15.bns.1.running_mean", "layer3.15.bns.1.running_var", "layer3.15.bns.2.weight", "layer3.15.bns.2.bias", "layer3.15.bns.2.running_mean", "layer3.15.bns.2.running_var", "layer3.16.convs.0.weight", "layer3.16.convs.1.weight", "layer3.16.convs.2.weight", "layer3.16.bns.0.weight", "layer3.16.bns.0.bias", "layer3.16.bns.0.running_mean", "layer3.16.bns.0.running_var", "layer3.16.bns.1.weight", "layer3.16.bns.1.bias", "layer3.16.bns.1.running_mean", "layer3.16.bns.1.running_var", "layer3.16.bns.2.weight", "layer3.16.bns.2.bias", "layer3.16.bns.2.running_mean", 
"layer3.16.bns.2.running_var", "layer3.17.convs.0.weight", "layer3.17.convs.1.weight", "layer3.17.convs.2.weight", "layer3.17.bns.0.weight", "layer3.17.bns.0.bias", "layer3.17.bns.0.running_mean", "layer3.17.bns.0.running_var", "layer3.17.bns.1.weight", "layer3.17.bns.1.bias", "layer3.17.bns.1.running_mean", "layer3.17.bns.1.running_var", "layer3.17.bns.2.weight", "layer3.17.bns.2.bias", "layer3.17.bns.2.running_mean", "layer3.17.bns.2.running_var", "layer3.18.convs.0.weight", "layer3.18.convs.1.weight", "layer3.18.convs.2.weight", "layer3.18.bns.0.weight", "layer3.18.bns.0.bias", "layer3.18.bns.0.running_mean", "layer3.18.bns.0.running_var", "layer3.18.bns.1.weight", "layer3.18.bns.1.bias", "layer3.18.bns.1.running_mean", "layer3.18.bns.1.running_var", "layer3.18.bns.2.weight", "layer3.18.bns.2.bias", "layer3.18.bns.2.running_mean", "layer3.18.bns.2.running_var", "layer3.19.convs.0.weight", "layer3.19.convs.1.weight", "layer3.19.convs.2.weight", "layer3.19.bns.0.weight", "layer3.19.bns.0.bias", "layer3.19.bns.0.running_mean", "layer3.19.bns.0.running_var", "layer3.19.bns.1.weight", "layer3.19.bns.1.bias", "layer3.19.bns.1.running_mean", "layer3.19.bns.1.running_var", "layer3.19.bns.2.weight", "layer3.19.bns.2.bias", "layer3.19.bns.2.running_mean", "layer3.19.bns.2.running_var", "layer3.20.convs.0.weight", "layer3.20.convs.1.weight", "layer3.20.convs.2.weight", "layer3.20.bns.0.weight", "layer3.20.bns.0.bias", "layer3.20.bns.0.running_mean", "layer3.20.bns.0.running_var", "layer3.20.bns.1.weight", "layer3.20.bns.1.bias", "layer3.20.bns.1.running_mean", "layer3.20.bns.1.running_var", "layer3.20.bns.2.weight", "layer3.20.bns.2.bias", "layer3.20.bns.2.running_mean", "layer3.20.bns.2.running_var", "layer3.21.convs.0.weight", "layer3.21.convs.1.weight", "layer3.21.convs.2.weight", "layer3.21.bns.0.weight", "layer3.21.bns.0.bias", "layer3.21.bns.0.running_mean", "layer3.21.bns.0.running_var", "layer3.21.bns.1.weight", "layer3.21.bns.1.bias", 
"layer3.21.bns.1.running_mean", "layer3.21.bns.1.running_var", "layer3.21.bns.2.weight", "layer3.21.bns.2.bias", "layer3.21.bns.2.running_mean", "layer3.21.bns.2.running_var", "layer3.22.convs.0.weight", "layer3.22.convs.1.weight", "layer3.22.convs.2.weight", "layer3.22.bns.0.weight", "layer3.22.bns.0.bias", "layer3.22.bns.0.running_mean", "layer3.22.bns.0.running_var", "layer3.22.bns.1.weight", "layer3.22.bns.1.bias", "layer3.22.bns.1.running_mean", "layer3.22.bns.1.running_var", "layer3.22.bns.2.weight", "layer3.22.bns.2.bias", "layer3.22.bns.2.running_mean", "layer3.22.bns.2.running_var", "layer4.0.convs.0.weight", "layer4.0.convs.1.weight", "layer4.0.convs.2.weight", "layer4.0.bns.0.weight", "layer4.0.bns.0.bias", "layer4.0.bns.0.running_mean", "layer4.0.bns.0.running_var", "layer4.0.bns.1.weight", "layer4.0.bns.1.bias", "layer4.0.bns.1.running_mean", "layer4.0.bns.1.running_var", "layer4.0.bns.2.weight", "layer4.0.bns.2.bias", "layer4.0.bns.2.running_mean", "layer4.0.bns.2.running_var", "layer4.1.convs.0.weight", "layer4.1.convs.1.weight", "layer4.1.convs.2.weight", "layer4.1.bns.0.weight", "layer4.1.bns.0.bias", "layer4.1.bns.0.running_mean", "layer4.1.bns.0.running_var", "layer4.1.bns.1.weight", "layer4.1.bns.1.bias", "layer4.1.bns.1.running_mean", "layer4.1.bns.1.running_var", "layer4.1.bns.2.weight", "layer4.1.bns.2.bias", "layer4.1.bns.2.running_mean", "layer4.1.bns.2.running_var", "layer4.2.convs.0.weight", "layer4.2.convs.1.weight", "layer4.2.convs.2.weight", "layer4.2.bns.0.weight", "layer4.2.bns.0.bias", "layer4.2.bns.0.running_mean", "layer4.2.bns.0.running_var", "layer4.2.bns.1.weight", "layer4.2.bns.1.bias", "layer4.2.bns.1.running_mean", "layer4.2.bns.1.running_var", "layer4.2.bns.2.weight", "layer4.2.bns.2.bias", "layer4.2.bns.2.running_mean", "layer4.2.bns.2.running_var". 
Unexpected key(s) in state_dict: "layer1.0.conv2.weight", "layer1.0.bn2.running_mean", "layer1.0.bn2.running_var", "layer1.0.bn2.weight", "layer1.0.bn2.bias", "layer1.1.conv2.weight", "layer1.1.bn2.running_mean", "layer1.1.bn2.running_var", "layer1.1.bn2.weight", "layer1.1.bn2.bias", "layer1.2.conv2.weight", "layer1.2.bn2.running_mean", "layer1.2.bn2.running_var", "layer1.2.bn2.weight", "layer1.2.bn2.bias", "layer2.0.conv2.weight", "layer2.0.bn2.running_mean", "layer2.0.bn2.running_var", "layer2.0.bn2.weight", "layer2.0.bn2.bias", "layer2.1.conv2.weight", "layer2.1.bn2.running_mean", "layer2.1.bn2.running_var", "layer2.1.bn2.weight", "layer2.1.bn2.bias", "layer2.2.conv2.weight", "layer2.2.bn2.running_mean", "layer2.2.bn2.running_var", "layer2.2.bn2.weight", "layer2.2.bn2.bias", "layer2.3.conv2.weight", "layer2.3.bn2.running_mean", "layer2.3.bn2.running_var", "layer2.3.bn2.weight", "layer2.3.bn2.bias", "layer3.0.conv2.weight", "layer3.0.bn2.running_mean", "layer3.0.bn2.running_var", "layer3.0.bn2.weight", "layer3.0.bn2.bias", "layer3.1.conv2.weight", "layer3.1.bn2.running_mean", "layer3.1.bn2.running_var", "layer3.1.bn2.weight", "layer3.1.bn2.bias", "layer3.2.conv2.weight", "layer3.2.bn2.running_mean", "layer3.2.bn2.running_var", "layer3.2.bn2.weight", "layer3.2.bn2.bias", "layer3.3.conv2.weight", "layer3.3.bn2.running_mean", "layer3.3.bn2.running_var", "layer3.3.bn2.weight", "layer3.3.bn2.bias", "layer3.4.conv2.weight", "layer3.4.bn2.running_mean", "layer3.4.bn2.running_var", "layer3.4.bn2.weight", "layer3.4.bn2.bias", "layer3.5.conv2.weight", "layer3.5.bn2.running_mean", "layer3.5.bn2.running_var", "layer3.5.bn2.weight", "layer3.5.bn2.bias", "layer3.6.conv2.weight", "layer3.6.bn2.running_mean", "layer3.6.bn2.running_var", "layer3.6.bn2.weight", "layer3.6.bn2.bias", "layer3.7.conv2.weight", "layer3.7.bn2.running_mean", "layer3.7.bn2.running_var", "layer3.7.bn2.weight", "layer3.7.bn2.bias", "layer3.8.conv2.weight", "layer3.8.bn2.running_mean", 
"layer3.8.bn2.running_var", "layer3.8.bn2.weight", "layer3.8.bn2.bias", "layer3.9.conv2.weight", "layer3.9.bn2.running_mean", "layer3.9.bn2.running_var", "layer3.9.bn2.weight", "layer3.9.bn2.bias", "layer3.10.conv2.weight", "layer3.10.bn2.running_mean", "layer3.10.bn2.running_var", "layer3.10.bn2.weight", "layer3.10.bn2.bias", "layer3.11.conv2.weight", "layer3.11.bn2.running_mean", "layer3.11.bn2.running_var", "layer3.11.bn2.weight", "layer3.11.bn2.bias", "layer3.12.conv2.weight", "layer3.12.bn2.running_mean", "layer3.12.bn2.running_var", "layer3.12.bn2.weight", "layer3.12.bn2.bias", "layer3.13.conv2.weight", "layer3.13.bn2.running_mean", "layer3.13.bn2.running_var", "layer3.13.bn2.weight", "layer3.13.bn2.bias", "layer3.14.conv2.weight", "layer3.14.bn2.running_mean", "layer3.14.bn2.running_var", "layer3.14.bn2.weight", "layer3.14.bn2.bias", "layer3.15.conv2.weight", "layer3.15.bn2.running_mean", "layer3.15.bn2.running_var", "layer3.15.bn2.weight", "layer3.15.bn2.bias", "layer3.16.conv2.weight", "layer3.16.bn2.running_mean", "layer3.16.bn2.running_var", "layer3.16.bn2.weight", "layer3.16.bn2.bias", "layer3.17.conv2.weight", "layer3.17.bn2.running_mean", "layer3.17.bn2.running_var", "layer3.17.bn2.weight", "layer3.17.bn2.bias", "layer3.18.conv2.weight", "layer3.18.bn2.running_mean", "layer3.18.bn2.running_var", "layer3.18.bn2.weight", "layer3.18.bn2.bias", "layer3.19.conv2.weight", "layer3.19.bn2.running_mean", "layer3.19.bn2.running_var", "layer3.19.bn2.weight", "layer3.19.bn2.bias", "layer3.20.conv2.weight", "layer3.20.bn2.running_mean", "layer3.20.bn2.running_var", "layer3.20.bn2.weight", "layer3.20.bn2.bias", "layer3.21.conv2.weight", "layer3.21.bn2.running_mean", "layer3.21.bn2.running_var", "layer3.21.bn2.weight", "layer3.21.bn2.bias", "layer3.22.conv2.weight", "layer3.22.bn2.running_mean", "layer3.22.bn2.running_var", "layer3.22.bn2.weight", "layer3.22.bn2.bias", "layer4.0.conv2.weight", "layer4.0.bn2.running_mean", "layer4.0.bn2.running_var", 
"layer4.0.bn2.weight", "layer4.0.bn2.bias", "layer4.1.conv2.weight", "layer4.1.bn2.running_mean", "layer4.1.bn2.running_var", "layer4.1.bn2.weight", "layer4.1.bn2.bias", "layer4.2.conv2.weight", "layer4.2.bn2.running_mean", "layer4.2.bn2.running_var", "layer4.2.bn2.weight", "layer4.2.bn2.bias". size mismatch for layer1.0.conv1.weight: copying a param with shape torch.Size([64, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([104, 64, 1, 1]). size mismatch for layer1.0.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.0.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.0.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.0.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.0.conv3.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 104, 1, 1]). size mismatch for layer1.1.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([104, 256, 1, 1]). size mismatch for layer1.1.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.1.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.1.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). 
size mismatch for layer1.1.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.1.conv3.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 104, 1, 1]). size mismatch for layer1.2.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([104, 256, 1, 1]). size mismatch for layer1.2.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.2.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.2.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.2.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.2.conv3.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 104, 1, 1]). size mismatch for layer2.0.conv1.weight: copying a param with shape torch.Size([128, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([208, 256, 1, 1]). size mismatch for layer2.0.bn1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.0.bn1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.0.bn1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). 
size mismatch for layer2.0.bn1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.0.conv3.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 208, 1, 1]). size mismatch for layer2.1.conv1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([208, 512, 1, 1]). size mismatch for layer2.1.bn1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.1.bn1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.1.bn1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.1.bn1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.1.conv3.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 208, 1, 1]). size mismatch for layer2.2.conv1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([208, 512, 1, 1]). size mismatch for layer2.2.bn1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.2.bn1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.2.bn1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). 
size mismatch for layer2.2.bn1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.2.conv3.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 208, 1, 1]). size mismatch for layer2.3.conv1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([208, 512, 1, 1]). size mismatch for layer2.3.bn1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.3.bn1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.3.bn1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.3.bn1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.3.conv3.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 208, 1, 1]). size mismatch for layer3.0.conv1.weight: copying a param with shape torch.Size([256, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 512, 1, 1]). size mismatch for layer3.0.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.0.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.0.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.0.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.0.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.1.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.1.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.1.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.1.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.1.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.1.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.2.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.2.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.2.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.2.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.2.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.2.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.3.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.3.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.3.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.3.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.3.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.3.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.4.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.4.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.4.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.4.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.4.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.4.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.5.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.5.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.5.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.5.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.5.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.5.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.6.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.6.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.6.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.6.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.6.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.6.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.7.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.7.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.7.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.7.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.7.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.7.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.8.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.8.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.8.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.8.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.8.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.8.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.9.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.9.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.9.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.9.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.9.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.9.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.10.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.10.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.10.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.10.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.10.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.10.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.11.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.11.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.11.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.11.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.11.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.11.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.12.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.12.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.12.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.12.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.12.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.12.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.13.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.13.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.13.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.13.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.13.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.13.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.14.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.14.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.14.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.14.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.14.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.14.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.15.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.15.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.15.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.15.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.15.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.15.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.16.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.16.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.16.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.16.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.16.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.16.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.17.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.17.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.17.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.17.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.17.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.17.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.18.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.18.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.18.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.18.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.18.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.18.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.19.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.19.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.19.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.19.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.19.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.19.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.20.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.20.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.20.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.20.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.20.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.20.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.21.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.21.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.21.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.21.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.21.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.21.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.22.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.22.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.22.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.22.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.22.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.22.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer4.0.conv1.weight: copying a param with shape torch.Size([512, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([832, 1024, 1, 1]). size mismatch for layer4.0.bn1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.0.bn1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.0.bn1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.0.bn1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.0.conv3.weight: copying a param with shape torch.Size([2048, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([2048, 832, 1, 1]). size mismatch for layer4.1.conv1.weight: copying a param with shape torch.Size([512, 2048, 1, 1]) from checkpoint, the shape in current model is torch.Size([832, 2048, 1, 1]). size mismatch for layer4.1.bn1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.1.bn1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.1.bn1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). 
size mismatch for layer4.1.bn1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.1.conv3.weight: copying a param with shape torch.Size([2048, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([2048, 832, 1, 1]). size mismatch for layer4.2.conv1.weight: copying a param with shape torch.Size([512, 2048, 1, 1]) from checkpoint, the shape in current model is torch.Size([832, 2048, 1, 1]). size mismatch for layer4.2.bn1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.2.bn1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.2.bn1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.2.bn1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.2.conv3.weight: copying a param with shape torch.Size([2048, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([2048, 832, 1, 1]).

I have tested the model downloaded from OneDrive, and it works fine. My testing code is:

from res2net import res2net101_26w_4s
model = res2net101_26w_4s(pretrained=True)
print(model)

I put the res2net101_26w_4s-02a759a1.pth model into the default path for pretrained models, which in my case is /home/shanghuagao/.cache/torch/checkpoints/. I suspect you may be mistakenly using the pretrained weights or the model definition of ResNet instead of Res2Net.
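One way to check this before calling load_state_dict is to compare parameter names and shapes directly. The sketch below is only an illustration: the helper name diff_state_dicts is hypothetical, and it works on plain {name: shape} mappings so it runs without PyTorch. The conv1 and bn1 shapes are copied from the error message in this thread; the shapes given for convs.0 and conv2 do not appear in the log and are assumed purely for the example. In the log, the checkpoint has 64 channels where the current model expects 104, which would be consistent with a 26-width, 4-scale Res2Net block (26 x 4 = 104) being fed ResNet-style weights.

```python
def diff_state_dicts(model_shapes, ckpt_shapes):
    """Compare two {parameter name: shape tuple} mappings.

    Returns (missing, unexpected, mismatched), mirroring the three kinds of
    problems reported in the error message:
      missing    - names the model expects but the checkpoint lacks
      unexpected - names the checkpoint has but the model does not
      mismatched - shared names whose shapes differ, as (ckpt, model) pairs
    """
    missing = sorted(k for k in model_shapes if k not in ckpt_shapes)
    unexpected = sorted(k for k in ckpt_shapes if k not in model_shapes)
    mismatched = {
        k: (tuple(ckpt_shapes[k]), tuple(model_shapes[k]))
        for k in model_shapes
        if k in ckpt_shapes and tuple(ckpt_shapes[k]) != tuple(model_shapes[k])
    }
    return missing, unexpected, mismatched


# Shapes the current (Res2Net) model expects for part of layer1.0.
model_shapes = {
    "layer1.0.conv1.weight": (104, 64, 1, 1),    # from the error message
    "layer1.0.bn1.weight": (104,),               # from the error message
    "layer1.0.convs.0.weight": (26, 26, 3, 3),   # assumed, not in the log
}

# Shapes found in the checkpoint that failed to load.
ckpt_shapes = {
    "layer1.0.conv1.weight": (64, 64, 1, 1),     # from the error message
    "layer1.0.bn1.weight": (64,),                # from the error message
    "layer1.0.conv2.weight": (64, 64, 3, 3),     # assumed, not in the log
}

missing, unexpected, mismatched = diff_state_dicts(model_shapes, ckpt_shapes)

print("missing:", missing)
print("unexpected:", unexpected)
for name, (ckpt_shape, model_shape) in sorted(mismatched.items()):
    print(f"size mismatch for {name}: checkpoint {ckpt_shape}, model {model_shape}")
```

With PyTorch installed, the two mappings could be built as {k: tuple(v.shape) for k, v in model.state_dict().items()} for the model, and in the same way from the dictionary returned by torch.load(path, map_location="cpu") for the checkpoint, assuming the file stores a plain state dict. If the checkpoint side shows conv2/bn2 names and 64-channel tensors while the model side shows convs.N/bns.N names and 104-channel tensors, the file being loaded is most likely not the Res2Net one.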

Hi, the error is as follows:

RuntimeError: Error(s) in loading state_dict for Res2Net: Missing key(s) in state_dict: "layer1.0.convs.0.weight", "layer1.0.convs.1.weight", "layer1.0.convs.2.weight", "layer1.0.bns.0.weight", "layer1.0.bns.0.bias", "layer1.0.bns.0.running_mean", "layer1.0.bns.0.running_var", "layer1.0.bns.1.weight", "layer1.0.bns.1.bias", "layer1.0.bns.1.running_mean", "layer1.0.bns.1.running_var", "layer1.0.bns.2.weight", "layer1.0.bns.2.bias", "layer1.0.bns.2.running_mean", "layer1.0.bns.2.running_var", "layer1.1.convs.0.weight", "layer1.1.convs.1.weight", "layer1.1.convs.2.weight", "layer1.1.bns.0.weight", "layer1.1.bns.0.bias", "layer1.1.bns.0.running_mean", "layer1.1.bns.0.running_var", "layer1.1.bns.1.weight", "layer1.1.bns.1.bias", "layer1.1.bns.1.running_mean", "layer1.1.bns.1.running_var", "layer1.1.bns.2.weight", "layer1.1.bns.2.bias", "layer1.1.bns.2.running_mean", "layer1.1.bns.2.running_var", "layer1.2.convs.0.weight", "layer1.2.convs.1.weight", "layer1.2.convs.2.weight", "layer1.2.bns.0.weight", "layer1.2.bns.0.bias", "layer1.2.bns.0.running_mean", "layer1.2.bns.0.running_var", "layer1.2.bns.1.weight", "layer1.2.bns.1.bias", "layer1.2.bns.1.running_mean", "layer1.2.bns.1.running_var", "layer1.2.bns.2.weight", "layer1.2.bns.2.bias", "layer1.2.bns.2.running_mean", "layer1.2.bns.2.running_var", "layer2.0.convs.0.weight", "layer2.0.convs.1.weight", "layer2.0.convs.2.weight", "layer2.0.bns.0.weight", "layer2.0.bns.0.bias", "layer2.0.bns.0.running_mean", "layer2.0.bns.0.running_var", "layer2.0.bns.1.weight", "layer2.0.bns.1.bias", "layer2.0.bns.1.running_mean", "layer2.0.bns.1.running_var", "layer2.0.bns.2.weight", "layer2.0.bns.2.bias", "layer2.0.bns.2.running_mean", "layer2.0.bns.2.running_var", "layer2.1.convs.0.weight", "layer2.1.convs.1.weight", "layer2.1.convs.2.weight", "layer2.1.bns.0.weight", "layer2.1.bns.0.bias", "layer2.1.bns.0.running_mean", "layer2.1.bns.0.running_var", "layer2.1.bns.1.weight", "layer2.1.bns.1.bias", "layer2.1.bns.1.running_mean", 
"layer2.1.bns.1.running_var", "layer2.1.bns.2.weight", "layer2.1.bns.2.bias", "layer2.1.bns.2.running_mean", "layer2.1.bns.2.running_var", "layer2.2.convs.0.weight", "layer2.2.convs.1.weight", "layer2.2.convs.2.weight", "layer2.2.bns.0.weight", "layer2.2.bns.0.bias", "layer2.2.bns.0.running_mean", "layer2.2.bns.0.running_var", "layer2.2.bns.1.weight", "layer2.2.bns.1.bias", "layer2.2.bns.1.running_mean", "layer2.2.bns.1.running_var", "layer2.2.bns.2.weight", "layer2.2.bns.2.bias", "layer2.2.bns.2.running_mean", "layer2.2.bns.2.running_var", "layer2.3.convs.0.weight", "layer2.3.convs.1.weight", "layer2.3.convs.2.weight", "layer2.3.bns.0.weight", "layer2.3.bns.0.bias", "layer2.3.bns.0.running_mean", "layer2.3.bns.0.running_var", "layer2.3.bns.1.weight", "layer2.3.bns.1.bias", "layer2.3.bns.1.running_mean", "layer2.3.bns.1.running_var", "layer2.3.bns.2.weight", "layer2.3.bns.2.bias", "layer2.3.bns.2.running_mean", "layer2.3.bns.2.running_var", "layer3.0.convs.0.weight", "layer3.0.convs.1.weight", "layer3.0.convs.2.weight", "layer3.0.bns.0.weight", "layer3.0.bns.0.bias", "layer3.0.bns.0.running_mean", "layer3.0.bns.0.running_var", "layer3.0.bns.1.weight", "layer3.0.bns.1.bias", "layer3.0.bns.1.running_mean", "layer3.0.bns.1.running_var", "layer3.0.bns.2.weight", "layer3.0.bns.2.bias", "layer3.0.bns.2.running_mean", "layer3.0.bns.2.running_var", "layer3.1.convs.0.weight", "layer3.1.convs.1.weight", "layer3.1.convs.2.weight", "layer3.1.bns.0.weight", "layer3.1.bns.0.bias", "layer3.1.bns.0.running_mean", "layer3.1.bns.0.running_var", "layer3.1.bns.1.weight", "layer3.1.bns.1.bias", "layer3.1.bns.1.running_mean", "layer3.1.bns.1.running_var", "layer3.1.bns.2.weight", "layer3.1.bns.2.bias", "layer3.1.bns.2.running_mean", "layer3.1.bns.2.running_var", "layer3.2.convs.0.weight", "layer3.2.convs.1.weight", "layer3.2.convs.2.weight", "layer3.2.bns.0.weight", "layer3.2.bns.0.bias", "layer3.2.bns.0.running_mean", "layer3.2.bns.0.running_var", "layer3.2.bns.1.weight", 
"layer3.2.bns.1.bias", "layer3.2.bns.1.running_mean", "layer3.2.bns.1.running_var", "layer3.2.bns.2.weight", "layer3.2.bns.2.bias", "layer3.2.bns.2.running_mean", "layer3.2.bns.2.running_var", "layer3.3.convs.0.weight", "layer3.3.convs.1.weight", "layer3.3.convs.2.weight", "layer3.3.bns.0.weight", "layer3.3.bns.0.bias", "layer3.3.bns.0.running_mean", "layer3.3.bns.0.running_var", "layer3.3.bns.1.weight", "layer3.3.bns.1.bias", "layer3.3.bns.1.running_mean", "layer3.3.bns.1.running_var", "layer3.3.bns.2.weight", "layer3.3.bns.2.bias", "layer3.3.bns.2.running_mean", "layer3.3.bns.2.running_var", "layer3.4.convs.0.weight", "layer3.4.convs.1.weight", "layer3.4.convs.2.weight", "layer3.4.bns.0.weight", "layer3.4.bns.0.bias", "layer3.4.bns.0.running_mean", "layer3.4.bns.0.running_var", "layer3.4.bns.1.weight", "layer3.4.bns.1.bias", "layer3.4.bns.1.running_mean", "layer3.4.bns.1.running_var", "layer3.4.bns.2.weight", "layer3.4.bns.2.bias", "layer3.4.bns.2.running_mean", "layer3.4.bns.2.running_var", "layer3.5.convs.0.weight", "layer3.5.convs.1.weight", "layer3.5.convs.2.weight", "layer3.5.bns.0.weight", "layer3.5.bns.0.bias", "layer3.5.bns.0.running_mean", "layer3.5.bns.0.running_var", "layer3.5.bns.1.weight", "layer3.5.bns.1.bias", "layer3.5.bns.1.running_mean", "layer3.5.bns.1.running_var", "layer3.5.bns.2.weight", "layer3.5.bns.2.bias", "layer3.5.bns.2.running_mean", "layer3.5.bns.2.running_var", "layer3.6.convs.0.weight", "layer3.6.convs.1.weight", "layer3.6.convs.2.weight", "layer3.6.bns.0.weight", "layer3.6.bns.0.bias", "layer3.6.bns.0.running_mean", "layer3.6.bns.0.running_var", "layer3.6.bns.1.weight", "layer3.6.bns.1.bias", "layer3.6.bns.1.running_mean", "layer3.6.bns.1.running_var", "layer3.6.bns.2.weight", "layer3.6.bns.2.bias", "layer3.6.bns.2.running_mean", "layer3.6.bns.2.running_var", "layer3.7.convs.0.weight", "layer3.7.convs.1.weight", "layer3.7.convs.2.weight", "layer3.7.bns.0.weight", "layer3.7.bns.0.bias", "layer3.7.bns.0.running_mean", 
"layer3.7.bns.0.running_var", "layer3.7.bns.1.weight", "layer3.7.bns.1.bias", "layer3.7.bns.1.running_mean", "layer3.7.bns.1.running_var", "layer3.7.bns.2.weight", "layer3.7.bns.2.bias", "layer3.7.bns.2.running_mean", "layer3.7.bns.2.running_var", "layer3.8.convs.0.weight", "layer3.8.convs.1.weight", "layer3.8.convs.2.weight", "layer3.8.bns.0.weight", "layer3.8.bns.0.bias", "layer3.8.bns.0.running_mean", "layer3.8.bns.0.running_var", "layer3.8.bns.1.weight", "layer3.8.bns.1.bias", "layer3.8.bns.1.running_mean", "layer3.8.bns.1.running_var", "layer3.8.bns.2.weight", "layer3.8.bns.2.bias", "layer3.8.bns.2.running_mean", "layer3.8.bns.2.running_var", "layer3.9.convs.0.weight", "layer3.9.convs.1.weight", "layer3.9.convs.2.weight", "layer3.9.bns.0.weight", "layer3.9.bns.0.bias", "layer3.9.bns.0.running_mean", "layer3.9.bns.0.running_var", "layer3.9.bns.1.weight", "layer3.9.bns.1.bias", "layer3.9.bns.1.running_mean", "layer3.9.bns.1.running_var", "layer3.9.bns.2.weight", "layer3.9.bns.2.bias", "layer3.9.bns.2.running_mean", "layer3.9.bns.2.running_var", "layer3.10.convs.0.weight", "layer3.10.convs.1.weight", "layer3.10.convs.2.weight", "layer3.10.bns.0.weight", "layer3.10.bns.0.bias", "layer3.10.bns.0.running_mean", "layer3.10.bns.0.running_var", "layer3.10.bns.1.weight", "layer3.10.bns.1.bias", "layer3.10.bns.1.running_mean", "layer3.10.bns.1.running_var", "layer3.10.bns.2.weight", "layer3.10.bns.2.bias", "layer3.10.bns.2.running_mean", "layer3.10.bns.2.running_var", "layer3.11.convs.0.weight", "layer3.11.convs.1.weight", "layer3.11.convs.2.weight", "layer3.11.bns.0.weight", "layer3.11.bns.0.bias", "layer3.11.bns.0.running_mean", "layer3.11.bns.0.running_var", "layer3.11.bns.1.weight", "layer3.11.bns.1.bias", "layer3.11.bns.1.running_mean", "layer3.11.bns.1.running_var", "layer3.11.bns.2.weight", "layer3.11.bns.2.bias", "layer3.11.bns.2.running_mean", "layer3.11.bns.2.running_var", "layer3.12.convs.0.weight", "layer3.12.convs.1.weight", "layer3.12.convs.2.weight", 
"layer3.12.bns.0.weight", "layer3.12.bns.0.bias", "layer3.12.bns.0.running_mean", "layer3.12.bns.0.running_var", "layer3.12.bns.1.weight", "layer3.12.bns.1.bias", "layer3.12.bns.1.running_mean", "layer3.12.bns.1.running_var", "layer3.12.bns.2.weight", "layer3.12.bns.2.bias", "layer3.12.bns.2.running_mean", "layer3.12.bns.2.running_var", "layer3.13.convs.0.weight", "layer3.13.convs.1.weight", "layer3.13.convs.2.weight", "layer3.13.bns.0.weight", "layer3.13.bns.0.bias", "layer3.13.bns.0.running_mean", "layer3.13.bns.0.running_var", "layer3.13.bns.1.weight", "layer3.13.bns.1.bias", "layer3.13.bns.1.running_mean", "layer3.13.bns.1.running_var", "layer3.13.bns.2.weight", "layer3.13.bns.2.bias", "layer3.13.bns.2.running_mean", "layer3.13.bns.2.running_var", "layer3.14.convs.0.weight", "layer3.14.convs.1.weight", "layer3.14.convs.2.weight", "layer3.14.bns.0.weight", "layer3.14.bns.0.bias", "layer3.14.bns.0.running_mean", "layer3.14.bns.0.running_var", "layer3.14.bns.1.weight", "layer3.14.bns.1.bias", "layer3.14.bns.1.running_mean", "layer3.14.bns.1.running_var", "layer3.14.bns.2.weight", "layer3.14.bns.2.bias", "layer3.14.bns.2.running_mean", "layer3.14.bns.2.running_var", "layer3.15.convs.0.weight", "layer3.15.convs.1.weight", "layer3.15.convs.2.weight", "layer3.15.bns.0.weight", "layer3.15.bns.0.bias", "layer3.15.bns.0.running_mean", "layer3.15.bns.0.running_var", "layer3.15.bns.1.weight", "layer3.15.bns.1.bias", "layer3.15.bns.1.running_mean", "layer3.15.bns.1.running_var", "layer3.15.bns.2.weight", "layer3.15.bns.2.bias", "layer3.15.bns.2.running_mean", "layer3.15.bns.2.running_var", "layer3.16.convs.0.weight", "layer3.16.convs.1.weight", "layer3.16.convs.2.weight", "layer3.16.bns.0.weight", "layer3.16.bns.0.bias", "layer3.16.bns.0.running_mean", "layer3.16.bns.0.running_var", "layer3.16.bns.1.weight", "layer3.16.bns.1.bias", "layer3.16.bns.1.running_mean", "layer3.16.bns.1.running_var", "layer3.16.bns.2.weight", "layer3.16.bns.2.bias", "layer3.16.bns.2.running_mean", 
"layer3.16.bns.2.running_var", "layer3.17.convs.0.weight", "layer3.17.convs.1.weight", "layer3.17.convs.2.weight", "layer3.17.bns.0.weight", "layer3.17.bns.0.bias", "layer3.17.bns.0.running_mean", "layer3.17.bns.0.running_var", "layer3.17.bns.1.weight", "layer3.17.bns.1.bias", "layer3.17.bns.1.running_mean", "layer3.17.bns.1.running_var", "layer3.17.bns.2.weight", "layer3.17.bns.2.bias", "layer3.17.bns.2.running_mean", "layer3.17.bns.2.running_var", "layer3.18.convs.0.weight", "layer3.18.convs.1.weight", "layer3.18.convs.2.weight", "layer3.18.bns.0.weight", "layer3.18.bns.0.bias", "layer3.18.bns.0.running_mean", "layer3.18.bns.0.running_var", "layer3.18.bns.1.weight", "layer3.18.bns.1.bias", "layer3.18.bns.1.running_mean", "layer3.18.bns.1.running_var", "layer3.18.bns.2.weight", "layer3.18.bns.2.bias", "layer3.18.bns.2.running_mean", "layer3.18.bns.2.running_var", "layer3.19.convs.0.weight", "layer3.19.convs.1.weight", "layer3.19.convs.2.weight", "layer3.19.bns.0.weight", "layer3.19.bns.0.bias", "layer3.19.bns.0.running_mean", "layer3.19.bns.0.running_var", "layer3.19.bns.1.weight", "layer3.19.bns.1.bias", "layer3.19.bns.1.running_mean", "layer3.19.bns.1.running_var", "layer3.19.bns.2.weight", "layer3.19.bns.2.bias", "layer3.19.bns.2.running_mean", "layer3.19.bns.2.running_var", "layer3.20.convs.0.weight", "layer3.20.convs.1.weight", "layer3.20.convs.2.weight", "layer3.20.bns.0.weight", "layer3.20.bns.0.bias", "layer3.20.bns.0.running_mean", "layer3.20.bns.0.running_var", "layer3.20.bns.1.weight", "layer3.20.bns.1.bias", "layer3.20.bns.1.running_mean", "layer3.20.bns.1.running_var", "layer3.20.bns.2.weight", "layer3.20.bns.2.bias", "layer3.20.bns.2.running_mean", "layer3.20.bns.2.running_var", "layer3.21.convs.0.weight", "layer3.21.convs.1.weight", "layer3.21.convs.2.weight", "layer3.21.bns.0.weight", "layer3.21.bns.0.bias", "layer3.21.bns.0.running_mean", "layer3.21.bns.0.running_var", "layer3.21.bns.1.weight", "layer3.21.bns.1.bias", 
"layer3.21.bns.1.running_mean", "layer3.21.bns.1.running_var", "layer3.21.bns.2.weight", "layer3.21.bns.2.bias", "layer3.21.bns.2.running_mean", "layer3.21.bns.2.running_var", "layer3.22.convs.0.weight", "layer3.22.convs.1.weight", "layer3.22.convs.2.weight", "layer3.22.bns.0.weight", "layer3.22.bns.0.bias", "layer3.22.bns.0.running_mean", "layer3.22.bns.0.running_var", "layer3.22.bns.1.weight", "layer3.22.bns.1.bias", "layer3.22.bns.1.running_mean", "layer3.22.bns.1.running_var", "layer3.22.bns.2.weight", "layer3.22.bns.2.bias", "layer3.22.bns.2.running_mean", "layer3.22.bns.2.running_var", "layer4.0.convs.0.weight", "layer4.0.convs.1.weight", "layer4.0.convs.2.weight", "layer4.0.bns.0.weight", "layer4.0.bns.0.bias", "layer4.0.bns.0.running_mean", "layer4.0.bns.0.running_var", "layer4.0.bns.1.weight", "layer4.0.bns.1.bias", "layer4.0.bns.1.running_mean", "layer4.0.bns.1.running_var", "layer4.0.bns.2.weight", "layer4.0.bns.2.bias", "layer4.0.bns.2.running_mean", "layer4.0.bns.2.running_var", "layer4.1.convs.0.weight", "layer4.1.convs.1.weight", "layer4.1.convs.2.weight", "layer4.1.bns.0.weight", "layer4.1.bns.0.bias", "layer4.1.bns.0.running_mean", "layer4.1.bns.0.running_var", "layer4.1.bns.1.weight", "layer4.1.bns.1.bias", "layer4.1.bns.1.running_mean", "layer4.1.bns.1.running_var", "layer4.1.bns.2.weight", "layer4.1.bns.2.bias", "layer4.1.bns.2.running_mean", "layer4.1.bns.2.running_var", "layer4.2.convs.0.weight", "layer4.2.convs.1.weight", "layer4.2.convs.2.weight", "layer4.2.bns.0.weight", "layer4.2.bns.0.bias", "layer4.2.bns.0.running_mean", "layer4.2.bns.0.running_var", "layer4.2.bns.1.weight", "layer4.2.bns.1.bias", "layer4.2.bns.1.running_mean", "layer4.2.bns.1.running_var", "layer4.2.bns.2.weight", "layer4.2.bns.2.bias", "layer4.2.bns.2.running_mean", "layer4.2.bns.2.running_var". 
Unexpected key(s) in state_dict: "layer1.0.conv2.weight", "layer1.0.bn2.running_mean", "layer1.0.bn2.running_var", "layer1.0.bn2.weight", "layer1.0.bn2.bias", "layer1.1.conv2.weight", "layer1.1.bn2.running_mean", "layer1.1.bn2.running_var", "layer1.1.bn2.weight", "layer1.1.bn2.bias", "layer1.2.conv2.weight", "layer1.2.bn2.running_mean", "layer1.2.bn2.running_var", "layer1.2.bn2.weight", "layer1.2.bn2.bias", "layer2.0.conv2.weight", "layer2.0.bn2.running_mean", "layer2.0.bn2.running_var", "layer2.0.bn2.weight", "layer2.0.bn2.bias", "layer2.1.conv2.weight", "layer2.1.bn2.running_mean", "layer2.1.bn2.running_var", "layer2.1.bn2.weight", "layer2.1.bn2.bias", "layer2.2.conv2.weight", "layer2.2.bn2.running_mean", "layer2.2.bn2.running_var", "layer2.2.bn2.weight", "layer2.2.bn2.bias", "layer2.3.conv2.weight", "layer2.3.bn2.running_mean", "layer2.3.bn2.running_var", "layer2.3.bn2.weight", "layer2.3.bn2.bias", "layer3.0.conv2.weight", "layer3.0.bn2.running_mean", "layer3.0.bn2.running_var", "layer3.0.bn2.weight", "layer3.0.bn2.bias", "layer3.1.conv2.weight", "layer3.1.bn2.running_mean", "layer3.1.bn2.running_var", "layer3.1.bn2.weight", "layer3.1.bn2.bias", "layer3.2.conv2.weight", "layer3.2.bn2.running_mean", "layer3.2.bn2.running_var", "layer3.2.bn2.weight", "layer3.2.bn2.bias", "layer3.3.conv2.weight", "layer3.3.bn2.running_mean", "layer3.3.bn2.running_var", "layer3.3.bn2.weight", "layer3.3.bn2.bias", "layer3.4.conv2.weight", "layer3.4.bn2.running_mean", "layer3.4.bn2.running_var", "layer3.4.bn2.weight", "layer3.4.bn2.bias", "layer3.5.conv2.weight", "layer3.5.bn2.running_mean", "layer3.5.bn2.running_var", "layer3.5.bn2.weight", "layer3.5.bn2.bias", "layer3.6.conv2.weight", "layer3.6.bn2.running_mean", "layer3.6.bn2.running_var", "layer3.6.bn2.weight", "layer3.6.bn2.bias", "layer3.7.conv2.weight", "layer3.7.bn2.running_mean", "layer3.7.bn2.running_var", "layer3.7.bn2.weight", "layer3.7.bn2.bias", "layer3.8.conv2.weight", "layer3.8.bn2.running_mean", 
"layer3.8.bn2.running_var", "layer3.8.bn2.weight", "layer3.8.bn2.bias", "layer3.9.conv2.weight", "layer3.9.bn2.running_mean", "layer3.9.bn2.running_var", "layer3.9.bn2.weight", "layer3.9.bn2.bias", "layer3.10.conv2.weight", "layer3.10.bn2.running_mean", "layer3.10.bn2.running_var", "layer3.10.bn2.weight", "layer3.10.bn2.bias", "layer3.11.conv2.weight", "layer3.11.bn2.running_mean", "layer3.11.bn2.running_var", "layer3.11.bn2.weight", "layer3.11.bn2.bias", "layer3.12.conv2.weight", "layer3.12.bn2.running_mean", "layer3.12.bn2.running_var", "layer3.12.bn2.weight", "layer3.12.bn2.bias", "layer3.13.conv2.weight", "layer3.13.bn2.running_mean", "layer3.13.bn2.running_var", "layer3.13.bn2.weight", "layer3.13.bn2.bias", "layer3.14.conv2.weight", "layer3.14.bn2.running_mean", "layer3.14.bn2.running_var", "layer3.14.bn2.weight", "layer3.14.bn2.bias", "layer3.15.conv2.weight", "layer3.15.bn2.running_mean", "layer3.15.bn2.running_var", "layer3.15.bn2.weight", "layer3.15.bn2.bias", "layer3.16.conv2.weight", "layer3.16.bn2.running_mean", "layer3.16.bn2.running_var", "layer3.16.bn2.weight", "layer3.16.bn2.bias", "layer3.17.conv2.weight", "layer3.17.bn2.running_mean", "layer3.17.bn2.running_var", "layer3.17.bn2.weight", "layer3.17.bn2.bias", "layer3.18.conv2.weight", "layer3.18.bn2.running_mean", "layer3.18.bn2.running_var", "layer3.18.bn2.weight", "layer3.18.bn2.bias", "layer3.19.conv2.weight", "layer3.19.bn2.running_mean", "layer3.19.bn2.running_var", "layer3.19.bn2.weight", "layer3.19.bn2.bias", "layer3.20.conv2.weight", "layer3.20.bn2.running_mean", "layer3.20.bn2.running_var", "layer3.20.bn2.weight", "layer3.20.bn2.bias", "layer3.21.conv2.weight", "layer3.21.bn2.running_mean", "layer3.21.bn2.running_var", "layer3.21.bn2.weight", "layer3.21.bn2.bias", "layer3.22.conv2.weight", "layer3.22.bn2.running_mean", "layer3.22.bn2.running_var", "layer3.22.bn2.weight", "layer3.22.bn2.bias", "layer4.0.conv2.weight", "layer4.0.bn2.running_mean", "layer4.0.bn2.running_var", 
"layer4.0.bn2.weight", "layer4.0.bn2.bias", "layer4.1.conv2.weight", "layer4.1.bn2.running_mean", "layer4.1.bn2.running_var", "layer4.1.bn2.weight", "layer4.1.bn2.bias", "layer4.2.conv2.weight", "layer4.2.bn2.running_mean", "layer4.2.bn2.running_var", "layer4.2.bn2.weight", "layer4.2.bn2.bias". size mismatch for layer1.0.conv1.weight: copying a param with shape torch.Size([64, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([104, 64, 1, 1]). size mismatch for layer1.0.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.0.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.0.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.0.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.0.conv3.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 104, 1, 1]). size mismatch for layer1.1.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([104, 256, 1, 1]). size mismatch for layer1.1.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.1.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.1.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). 
size mismatch for layer1.1.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.1.conv3.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 104, 1, 1]). size mismatch for layer1.2.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([104, 256, 1, 1]). size mismatch for layer1.2.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.2.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.2.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.2.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.2.conv3.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 104, 1, 1]). size mismatch for layer2.0.conv1.weight: copying a param with shape torch.Size([128, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([208, 256, 1, 1]). size mismatch for layer2.0.bn1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.0.bn1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.0.bn1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). 
size mismatch for layer2.0.bn1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.0.conv3.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 208, 1, 1]). size mismatch for layer2.1.conv1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([208, 512, 1, 1]). size mismatch for layer2.1.bn1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.1.bn1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.1.bn1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.1.bn1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.1.conv3.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 208, 1, 1]). size mismatch for layer2.2.conv1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([208, 512, 1, 1]). size mismatch for layer2.2.bn1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.2.bn1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.2.bn1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). 
size mismatch for layer2.2.bn1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.2.conv3.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 208, 1, 1]). size mismatch for layer2.3.conv1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([208, 512, 1, 1]). size mismatch for layer2.3.bn1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.3.bn1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.3.bn1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.3.bn1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.3.conv3.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 208, 1, 1]). size mismatch for layer3.0.conv1.weight: copying a param with shape torch.Size([256, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 512, 1, 1]). size mismatch for layer3.0.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.0.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.0.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.0.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.0.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.1.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.1.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.1.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.1.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.1.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.1.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.2.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.2.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.2.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.2.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.2.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.2.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.3.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.3.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.3.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.3.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.3.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.3.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.4.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.4.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.4.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.4.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.4.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.4.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.5.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.5.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.5.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.5.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.5.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.5.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.6.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.6.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.6.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.6.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.6.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.6.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.7.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.7.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.7.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.7.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.7.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.7.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.8.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.8.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.8.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.8.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.8.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.8.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.9.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.9.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.9.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.9.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.9.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.9.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.10.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.10.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.10.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.10.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.10.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.10.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.11.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.11.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.11.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.11.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.11.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.11.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.12.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.12.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.12.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.12.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.12.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.12.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.13.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.13.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.13.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.13.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.13.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.13.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.14.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.14.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.14.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.14.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.14.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.14.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.15.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.15.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.15.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.15.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.15.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.15.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.16.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.16.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.16.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.16.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.16.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.16.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.17.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.17.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.17.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.17.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.17.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.17.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.18.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.18.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.18.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.18.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.18.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.18.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.19.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.19.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.19.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.19.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.19.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.19.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.20.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.20.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.20.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.20.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.20.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.20.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.21.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.21.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.21.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.21.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.21.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.21.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.22.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.22.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.22.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.22.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.22.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.22.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer4.0.conv1.weight: copying a param with shape torch.Size([512, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([832, 1024, 1, 1]). size mismatch for layer4.0.bn1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.0.bn1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.0.bn1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.0.bn1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.0.conv3.weight: copying a param with shape torch.Size([2048, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([2048, 832, 1, 1]). size mismatch for layer4.1.conv1.weight: copying a param with shape torch.Size([512, 2048, 1, 1]) from checkpoint, the shape in current model is torch.Size([832, 2048, 1, 1]). size mismatch for layer4.1.bn1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.1.bn1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.1.bn1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). 
size mismatch for layer4.1.bn1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.1.conv3.weight: copying a param with shape torch.Size([2048, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([2048, 832, 1, 1]). size mismatch for layer4.2.conv1.weight: copying a param with shape torch.Size([512, 2048, 1, 1]) from checkpoint, the shape in current model is torch.Size([832, 2048, 1, 1]). size mismatch for layer4.2.bn1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.2.bn1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.2.bn1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.2.bn1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.2.conv3.weight: copying a param with shape torch.Size([2048, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([2048, 832, 1, 1]).

Hi. I have the same issue. Can you resolve it?

I had the same problem, but I have solved it: don't change the .yaml file, not even one word, and it works. When I added TEST_PERIOD: 10000, the model stopped working. I am still thinking about why. If you solve it, please contact me.
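
For anyone debugging this: in the error above, the checkpoint has `conv2`/`bn2` keys and a 64-channel `conv1`, while the current model expects `convs.*`/`bns.*` keys and 104 channels. That suggests the checkpoint and the constructed model are two different architectures, so it may be worth checking which backbone the config builds. Below is a minimal sketch, not code from this repository, that groups the differences the same way the error message does. The helper name `diff_state_dicts` is hypothetical, and it works on plain name-to-shape mappings so it runs without PyTorch. With real tensors, the mappings could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
from typing import Dict, List, Mapping, Tuple

Shape = Tuple[int, ...]


def diff_state_dicts(
    checkpoint_shapes: Mapping[str, Shape],
    model_shapes: Mapping[str, Shape],
) -> Dict[str, List]:
    """Compare two name -> shape mappings (hypothetical helper).

    Returns keys the model expects but the checkpoint lacks ("missing"),
    keys only the checkpoint has ("unexpected"), and shared keys whose
    shapes differ ("size_mismatch").
    """
    missing = sorted(k for k in model_shapes if k not in checkpoint_shapes)
    unexpected = sorted(k for k in checkpoint_shapes if k not in model_shapes)
    size_mismatch = sorted(
        (k, checkpoint_shapes[k], model_shapes[k])
        for k in checkpoint_shapes
        if k in model_shapes and checkpoint_shapes[k] != model_shapes[k]
    )
    return {
        "missing": missing,
        "unexpected": unexpected,
        "size_mismatch": size_mismatch,
    }


# A few entries for one block. The conv1/conv3 shapes are copied from the
# error above; the other two shapes are placeholders (the log does not
# show them) and are assumptions for illustration only.
checkpoint = {
    "layer1.0.conv1.weight": (64, 64, 1, 1),
    "layer1.0.conv2.weight": (64, 64, 3, 3),    # placeholder shape
    "layer1.0.conv3.weight": (256, 64, 1, 1),
}
model = {
    "layer1.0.conv1.weight": (104, 64, 1, 1),
    "layer1.0.convs.0.weight": (26, 26, 3, 3),  # placeholder shape
    "layer1.0.conv3.weight": (256, 104, 1, 1),
}

report = diff_state_dicts(checkpoint, model)
for kind, entries in report.items():
    print(kind, entries)
```

If every block shows the same pattern, as it does in the log, loading with `strict=False` would not help much: the keys that do match still have different shapes, so the model definition (or the checkpoint file) is probably what needs to change.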

Hi, the error is as follows: RuntimeError: Error(s) in loading state_dict for Res2Net: Missing key(s) in state_dict: "layer1.0.convs.0.weight", "layer1.0.convs.1.weight", "layer1.0.convs.2.weight", "layer1.0.bns.0.weight", "layer1.0.bns.0.bias", "layer1.0.bns.0.running_mean", "layer1.0.bns.0.running_var", "layer1.0.bns.1.weight", "layer1.0.bns.1.bias", "layer1.0.bns.1.running_mean", "layer1.0.bns.1.running_var", "layer1.0.bns.2.weight", "layer1.0.bns.2.bias", "layer1.0.bns.2.running_mean", "layer1.0.bns.2.running_var", "layer1.1.convs.0.weight", "layer1.1.convs.1.weight", "layer1.1.convs.2.weight", "layer1.1.bns.0.weight", "layer1.1.bns.0.bias", "layer1.1.bns.0.running_mean", "layer1.1.bns.0.running_var", "layer1.1.bns.1.weight", "layer1.1.bns.1.bias", "layer1.1.bns.1.running_mean", "layer1.1.bns.1.running_var", "layer1.1.bns.2.weight", "layer1.1.bns.2.bias", "layer1.1.bns.2.running_mean", "layer1.1.bns.2.running_var", "layer1.2.convs.0.weight", "layer1.2.convs.1.weight", "layer1.2.convs.2.weight", "layer1.2.bns.0.weight", "layer1.2.bns.0.bias", "layer1.2.bns.0.running_mean", "layer1.2.bns.0.running_var", "layer1.2.bns.1.weight", "layer1.2.bns.1.bias", "layer1.2.bns.1.running_mean", "layer1.2.bns.1.running_var", "layer1.2.bns.2.weight", "layer1.2.bns.2.bias", "layer1.2.bns.2.running_mean", "layer1.2.bns.2.running_var", "layer2.0.convs.0.weight", "layer2.0.convs.1.weight", "layer2.0.convs.2.weight", "layer2.0.bns.0.weight", "layer2.0.bns.0.bias", "layer2.0.bns.0.running_mean", "layer2.0.bns.0.running_var", "layer2.0.bns.1.weight", "layer2.0.bns.1.bias", "layer2.0.bns.1.running_mean", "layer2.0.bns.1.running_var", "layer2.0.bns.2.weight", "layer2.0.bns.2.bias", "layer2.0.bns.2.running_mean", "layer2.0.bns.2.running_var", "layer2.1.convs.0.weight", "layer2.1.convs.1.weight", "layer2.1.convs.2.weight", "layer2.1.bns.0.weight", "layer2.1.bns.0.bias", "layer2.1.bns.0.running_mean", "layer2.1.bns.0.running_var", "layer2.1.bns.1.weight", "layer2.1.bns.1.bias", 
"layer2.1.bns.1.running_mean", "layer2.1.bns.1.running_var", "layer2.1.bns.2.weight", "layer2.1.bns.2.bias", "layer2.1.bns.2.running_mean", "layer2.1.bns.2.running_var", "layer2.2.convs.0.weight", "layer2.2.convs.1.weight", "layer2.2.convs.2.weight", "layer2.2.bns.0.weight", "layer2.2.bns.0.bias", "layer2.2.bns.0.running_mean", "layer2.2.bns.0.running_var", "layer2.2.bns.1.weight", "layer2.2.bns.1.bias", "layer2.2.bns.1.running_mean", "layer2.2.bns.1.running_var", "layer2.2.bns.2.weight", "layer2.2.bns.2.bias", "layer2.2.bns.2.running_mean", "layer2.2.bns.2.running_var", "layer2.3.convs.0.weight", "layer2.3.convs.1.weight", "layer2.3.convs.2.weight", "layer2.3.bns.0.weight", "layer2.3.bns.0.bias", "layer2.3.bns.0.running_mean", "layer2.3.bns.0.running_var", "layer2.3.bns.1.weight", "layer2.3.bns.1.bias", "layer2.3.bns.1.running_mean", "layer2.3.bns.1.running_var", "layer2.3.bns.2.weight", "layer2.3.bns.2.bias", "layer2.3.bns.2.running_mean", "layer2.3.bns.2.running_var", "layer3.0.convs.0.weight", "layer3.0.convs.1.weight", "layer3.0.convs.2.weight", "layer3.0.bns.0.weight", "layer3.0.bns.0.bias", "layer3.0.bns.0.running_mean", "layer3.0.bns.0.running_var", "layer3.0.bns.1.weight", "layer3.0.bns.1.bias", "layer3.0.bns.1.running_mean", "layer3.0.bns.1.running_var", "layer3.0.bns.2.weight", "layer3.0.bns.2.bias", "layer3.0.bns.2.running_mean", "layer3.0.bns.2.running_var", "layer3.1.convs.0.weight", "layer3.1.convs.1.weight", "layer3.1.convs.2.weight", "layer3.1.bns.0.weight", "layer3.1.bns.0.bias", "layer3.1.bns.0.running_mean", "layer3.1.bns.0.running_var", "layer3.1.bns.1.weight", "layer3.1.bns.1.bias", "layer3.1.bns.1.running_mean", "layer3.1.bns.1.running_var", "layer3.1.bns.2.weight", "layer3.1.bns.2.bias", "layer3.1.bns.2.running_mean", "layer3.1.bns.2.running_var", "layer3.2.convs.0.weight", "layer3.2.convs.1.weight", "layer3.2.convs.2.weight", "layer3.2.bns.0.weight", "layer3.2.bns.0.bias", "layer3.2.bns.0.running_mean", "layer3.2.bns.0.running_var", 
"layer3.2.bns.1.weight", "layer3.2.bns.1.bias", "layer3.2.bns.1.running_mean", "layer3.2.bns.1.running_var", "layer3.2.bns.2.weight", "layer3.2.bns.2.bias", "layer3.2.bns.2.running_mean", "layer3.2.bns.2.running_var", "layer3.3.convs.0.weight", "layer3.3.convs.1.weight", "layer3.3.convs.2.weight", "layer3.3.bns.0.weight", "layer3.3.bns.0.bias", "layer3.3.bns.0.running_mean", "layer3.3.bns.0.running_var", "layer3.3.bns.1.weight", "layer3.3.bns.1.bias", "layer3.3.bns.1.running_mean", "layer3.3.bns.1.running_var", "layer3.3.bns.2.weight", "layer3.3.bns.2.bias", "layer3.3.bns.2.running_mean", "layer3.3.bns.2.running_var", "layer3.4.convs.0.weight", "layer3.4.convs.1.weight", "layer3.4.convs.2.weight", "layer3.4.bns.0.weight", "layer3.4.bns.0.bias", "layer3.4.bns.0.running_mean", "layer3.4.bns.0.running_var", "layer3.4.bns.1.weight", "layer3.4.bns.1.bias", "layer3.4.bns.1.running_mean", "layer3.4.bns.1.running_var", "layer3.4.bns.2.weight", "layer3.4.bns.2.bias", "layer3.4.bns.2.running_mean", "layer3.4.bns.2.running_var", "layer3.5.convs.0.weight", "layer3.5.convs.1.weight", "layer3.5.convs.2.weight", "layer3.5.bns.0.weight", "layer3.5.bns.0.bias", "layer3.5.bns.0.running_mean", "layer3.5.bns.0.running_var", "layer3.5.bns.1.weight", "layer3.5.bns.1.bias", "layer3.5.bns.1.running_mean", "layer3.5.bns.1.running_var", "layer3.5.bns.2.weight", "layer3.5.bns.2.bias", "layer3.5.bns.2.running_mean", "layer3.5.bns.2.running_var", "layer3.6.convs.0.weight", "layer3.6.convs.1.weight", "layer3.6.convs.2.weight", "layer3.6.bns.0.weight", "layer3.6.bns.0.bias", "layer3.6.bns.0.running_mean", "layer3.6.bns.0.running_var", "layer3.6.bns.1.weight", "layer3.6.bns.1.bias", "layer3.6.bns.1.running_mean", "layer3.6.bns.1.running_var", "layer3.6.bns.2.weight", "layer3.6.bns.2.bias", "layer3.6.bns.2.running_mean", "layer3.6.bns.2.running_var", "layer3.7.convs.0.weight", "layer3.7.convs.1.weight", "layer3.7.convs.2.weight", "layer3.7.bns.0.weight", "layer3.7.bns.0.bias", 
"layer3.7.bns.0.running_mean", "layer3.7.bns.0.running_var", "layer3.7.bns.1.weight", "layer3.7.bns.1.bias", "layer3.7.bns.1.running_mean", "layer3.7.bns.1.running_var", "layer3.7.bns.2.weight", "layer3.7.bns.2.bias", "layer3.7.bns.2.running_mean", "layer3.7.bns.2.running_var", "layer3.8.convs.0.weight", "layer3.8.convs.1.weight", "layer3.8.convs.2.weight", "layer3.8.bns.0.weight", "layer3.8.bns.0.bias", "layer3.8.bns.0.running_mean", "layer3.8.bns.0.running_var", "layer3.8.bns.1.weight", "layer3.8.bns.1.bias", "layer3.8.bns.1.running_mean", "layer3.8.bns.1.running_var", "layer3.8.bns.2.weight", "layer3.8.bns.2.bias", "layer3.8.bns.2.running_mean", "layer3.8.bns.2.running_var", "layer3.9.convs.0.weight", "layer3.9.convs.1.weight", "layer3.9.convs.2.weight", "layer3.9.bns.0.weight", "layer3.9.bns.0.bias", "layer3.9.bns.0.running_mean", "layer3.9.bns.0.running_var", "layer3.9.bns.1.weight", "layer3.9.bns.1.bias", "layer3.9.bns.1.running_mean", "layer3.9.bns.1.running_var", "layer3.9.bns.2.weight", "layer3.9.bns.2.bias", "layer3.9.bns.2.running_mean", "layer3.9.bns.2.running_var", "layer3.10.convs.0.weight", "layer3.10.convs.1.weight", "layer3.10.convs.2.weight", "layer3.10.bns.0.weight", "layer3.10.bns.0.bias", "layer3.10.bns.0.running_mean", "layer3.10.bns.0.running_var", "layer3.10.bns.1.weight", "layer3.10.bns.1.bias", "layer3.10.bns.1.running_mean", "layer3.10.bns.1.running_var", "layer3.10.bns.2.weight", "layer3.10.bns.2.bias", "layer3.10.bns.2.running_mean", "layer3.10.bns.2.running_var", "layer3.11.convs.0.weight", "layer3.11.convs.1.weight", "layer3.11.convs.2.weight", "layer3.11.bns.0.weight", "layer3.11.bns.0.bias", "layer3.11.bns.0.running_mean", "layer3.11.bns.0.running_var", "layer3.11.bns.1.weight", "layer3.11.bns.1.bias", "layer3.11.bns.1.running_mean", "layer3.11.bns.1.running_var", "layer3.11.bns.2.weight", "layer3.11.bns.2.bias", "layer3.11.bns.2.running_mean", "layer3.11.bns.2.running_var", "layer3.12.convs.0.weight", "layer3.12.convs.1.weight", 
"layer3.12.convs.2.weight", "layer3.12.bns.0.weight", "layer3.12.bns.0.bias", "layer3.12.bns.0.running_mean", "layer3.12.bns.0.running_var", "layer3.12.bns.1.weight", "layer3.12.bns.1.bias", "layer3.12.bns.1.running_mean", "layer3.12.bns.1.running_var", "layer3.12.bns.2.weight", "layer3.12.bns.2.bias", "layer3.12.bns.2.running_mean", "layer3.12.bns.2.running_var", "layer3.13.convs.0.weight", "layer3.13.convs.1.weight", "layer3.13.convs.2.weight", "layer3.13.bns.0.weight", "layer3.13.bns.0.bias", "layer3.13.bns.0.running_mean", "layer3.13.bns.0.running_var", "layer3.13.bns.1.weight", "layer3.13.bns.1.bias", "layer3.13.bns.1.running_mean", "layer3.13.bns.1.running_var", "layer3.13.bns.2.weight", "layer3.13.bns.2.bias", "layer3.13.bns.2.running_mean", "layer3.13.bns.2.running_var", "layer3.14.convs.0.weight", "layer3.14.convs.1.weight", "layer3.14.convs.2.weight", "layer3.14.bns.0.weight", "layer3.14.bns.0.bias", "layer3.14.bns.0.running_mean", "layer3.14.bns.0.running_var", "layer3.14.bns.1.weight", "layer3.14.bns.1.bias", "layer3.14.bns.1.running_mean", "layer3.14.bns.1.running_var", "layer3.14.bns.2.weight", "layer3.14.bns.2.bias", "layer3.14.bns.2.running_mean", "layer3.14.bns.2.running_var", "layer3.15.convs.0.weight", "layer3.15.convs.1.weight", "layer3.15.convs.2.weight", "layer3.15.bns.0.weight", "layer3.15.bns.0.bias", "layer3.15.bns.0.running_mean", "layer3.15.bns.0.running_var", "layer3.15.bns.1.weight", "layer3.15.bns.1.bias", "layer3.15.bns.1.running_mean", "layer3.15.bns.1.running_var", "layer3.15.bns.2.weight", "layer3.15.bns.2.bias", "layer3.15.bns.2.running_mean", "layer3.15.bns.2.running_var", "layer3.16.convs.0.weight", "layer3.16.convs.1.weight", "layer3.16.convs.2.weight", "layer3.16.bns.0.weight", "layer3.16.bns.0.bias", "layer3.16.bns.0.running_mean", "layer3.16.bns.0.running_var", "layer3.16.bns.1.weight", "layer3.16.bns.1.bias", "layer3.16.bns.1.running_mean", "layer3.16.bns.1.running_var", "layer3.16.bns.2.weight", "layer3.16.bns.2.bias", 
"layer3.16.bns.2.running_mean", "layer3.16.bns.2.running_var", "layer3.17.convs.0.weight", "layer3.17.convs.1.weight", "layer3.17.convs.2.weight", "layer3.17.bns.0.weight", "layer3.17.bns.0.bias", "layer3.17.bns.0.running_mean", "layer3.17.bns.0.running_var", "layer3.17.bns.1.weight", "layer3.17.bns.1.bias", "layer3.17.bns.1.running_mean", "layer3.17.bns.1.running_var", "layer3.17.bns.2.weight", "layer3.17.bns.2.bias", "layer3.17.bns.2.running_mean", "layer3.17.bns.2.running_var", "layer3.18.convs.0.weight", "layer3.18.convs.1.weight", "layer3.18.convs.2.weight", "layer3.18.bns.0.weight", "layer3.18.bns.0.bias", "layer3.18.bns.0.running_mean", "layer3.18.bns.0.running_var", "layer3.18.bns.1.weight", "layer3.18.bns.1.bias", "layer3.18.bns.1.running_mean", "layer3.18.bns.1.running_var", "layer3.18.bns.2.weight", "layer3.18.bns.2.bias", "layer3.18.bns.2.running_mean", "layer3.18.bns.2.running_var", "layer3.19.convs.0.weight", "layer3.19.convs.1.weight", "layer3.19.convs.2.weight", "layer3.19.bns.0.weight", "layer3.19.bns.0.bias", "layer3.19.bns.0.running_mean", "layer3.19.bns.0.running_var", "layer3.19.bns.1.weight", "layer3.19.bns.1.bias", "layer3.19.bns.1.running_mean", "layer3.19.bns.1.running_var", "layer3.19.bns.2.weight", "layer3.19.bns.2.bias", "layer3.19.bns.2.running_mean", "layer3.19.bns.2.running_var", "layer3.20.convs.0.weight", "layer3.20.convs.1.weight", "layer3.20.convs.2.weight", "layer3.20.bns.0.weight", "layer3.20.bns.0.bias", "layer3.20.bns.0.running_mean", "layer3.20.bns.0.running_var", "layer3.20.bns.1.weight", "layer3.20.bns.1.bias", "layer3.20.bns.1.running_mean", "layer3.20.bns.1.running_var", "layer3.20.bns.2.weight", "layer3.20.bns.2.bias", "layer3.20.bns.2.running_mean", "layer3.20.bns.2.running_var", "layer3.21.convs.0.weight", "layer3.21.convs.1.weight", "layer3.21.convs.2.weight", "layer3.21.bns.0.weight", "layer3.21.bns.0.bias", "layer3.21.bns.0.running_mean", "layer3.21.bns.0.running_var", "layer3.21.bns.1.weight", 
"layer3.21.bns.1.bias", "layer3.21.bns.1.running_mean", "layer3.21.bns.1.running_var", "layer3.21.bns.2.weight", "layer3.21.bns.2.bias", "layer3.21.bns.2.running_mean", "layer3.21.bns.2.running_var", "layer3.22.convs.0.weight", "layer3.22.convs.1.weight", "layer3.22.convs.2.weight", "layer3.22.bns.0.weight", "layer3.22.bns.0.bias", "layer3.22.bns.0.running_mean", "layer3.22.bns.0.running_var", "layer3.22.bns.1.weight", "layer3.22.bns.1.bias", "layer3.22.bns.1.running_mean", "layer3.22.bns.1.running_var", "layer3.22.bns.2.weight", "layer3.22.bns.2.bias", "layer3.22.bns.2.running_mean", "layer3.22.bns.2.running_var", "layer4.0.convs.0.weight", "layer4.0.convs.1.weight", "layer4.0.convs.2.weight", "layer4.0.bns.0.weight", "layer4.0.bns.0.bias", "layer4.0.bns.0.running_mean", "layer4.0.bns.0.running_var", "layer4.0.bns.1.weight", "layer4.0.bns.1.bias", "layer4.0.bns.1.running_mean", "layer4.0.bns.1.running_var", "layer4.0.bns.2.weight", "layer4.0.bns.2.bias", "layer4.0.bns.2.running_mean", "layer4.0.bns.2.running_var", "layer4.1.convs.0.weight", "layer4.1.convs.1.weight", "layer4.1.convs.2.weight", "layer4.1.bns.0.weight", "layer4.1.bns.0.bias", "layer4.1.bns.0.running_mean", "layer4.1.bns.0.running_var", "layer4.1.bns.1.weight", "layer4.1.bns.1.bias", "layer4.1.bns.1.running_mean", "layer4.1.bns.1.running_var", "layer4.1.bns.2.weight", "layer4.1.bns.2.bias", "layer4.1.bns.2.running_mean", "layer4.1.bns.2.running_var", "layer4.2.convs.0.weight", "layer4.2.convs.1.weight", "layer4.2.convs.2.weight", "layer4.2.bns.0.weight", "layer4.2.bns.0.bias", "layer4.2.bns.0.running_mean", "layer4.2.bns.0.running_var", "layer4.2.bns.1.weight", "layer4.2.bns.1.bias", "layer4.2.bns.1.running_mean", "layer4.2.bns.1.running_var", "layer4.2.bns.2.weight", "layer4.2.bns.2.bias", "layer4.2.bns.2.running_mean", "layer4.2.bns.2.running_var". 
Unexpected key(s) in state_dict: "layer1.0.conv2.weight", "layer1.0.bn2.running_mean", "layer1.0.bn2.running_var", "layer1.0.bn2.weight", "layer1.0.bn2.bias", "layer1.1.conv2.weight", "layer1.1.bn2.running_mean", "layer1.1.bn2.running_var", "layer1.1.bn2.weight", "layer1.1.bn2.bias", "layer1.2.conv2.weight", "layer1.2.bn2.running_mean", "layer1.2.bn2.running_var", "layer1.2.bn2.weight", "layer1.2.bn2.bias", "layer2.0.conv2.weight", "layer2.0.bn2.running_mean", "layer2.0.bn2.running_var", "layer2.0.bn2.weight", "layer2.0.bn2.bias", "layer2.1.conv2.weight", "layer2.1.bn2.running_mean", "layer2.1.bn2.running_var", "layer2.1.bn2.weight", "layer2.1.bn2.bias", "layer2.2.conv2.weight", "layer2.2.bn2.running_mean", "layer2.2.bn2.running_var", "layer2.2.bn2.weight", "layer2.2.bn2.bias", "layer2.3.conv2.weight", "layer2.3.bn2.running_mean", "layer2.3.bn2.running_var", "layer2.3.bn2.weight", "layer2.3.bn2.bias", "layer3.0.conv2.weight", "layer3.0.bn2.running_mean", "layer3.0.bn2.running_var", "layer3.0.bn2.weight", "layer3.0.bn2.bias", "layer3.1.conv2.weight", "layer3.1.bn2.running_mean", "layer3.1.bn2.running_var", "layer3.1.bn2.weight", "layer3.1.bn2.bias", "layer3.2.conv2.weight", "layer3.2.bn2.running_mean", "layer3.2.bn2.running_var", "layer3.2.bn2.weight", "layer3.2.bn2.bias", "layer3.3.conv2.weight", "layer3.3.bn2.running_mean", "layer3.3.bn2.running_var", "layer3.3.bn2.weight", "layer3.3.bn2.bias", "layer3.4.conv2.weight", "layer3.4.bn2.running_mean", "layer3.4.bn2.running_var", "layer3.4.bn2.weight", "layer3.4.bn2.bias", "layer3.5.conv2.weight", "layer3.5.bn2.running_mean", "layer3.5.bn2.running_var", "layer3.5.bn2.weight", "layer3.5.bn2.bias", "layer3.6.conv2.weight", "layer3.6.bn2.running_mean", "layer3.6.bn2.running_var", "layer3.6.bn2.weight", "layer3.6.bn2.bias", "layer3.7.conv2.weight", "layer3.7.bn2.running_mean", "layer3.7.bn2.running_var", "layer3.7.bn2.weight", "layer3.7.bn2.bias", "layer3.8.conv2.weight", "layer3.8.bn2.running_mean", 
"layer3.8.bn2.running_var", "layer3.8.bn2.weight", "layer3.8.bn2.bias", "layer3.9.conv2.weight", "layer3.9.bn2.running_mean", "layer3.9.bn2.running_var", "layer3.9.bn2.weight", "layer3.9.bn2.bias", "layer3.10.conv2.weight", "layer3.10.bn2.running_mean", "layer3.10.bn2.running_var", "layer3.10.bn2.weight", "layer3.10.bn2.bias", "layer3.11.conv2.weight", "layer3.11.bn2.running_mean", "layer3.11.bn2.running_var", "layer3.11.bn2.weight", "layer3.11.bn2.bias", "layer3.12.conv2.weight", "layer3.12.bn2.running_mean", "layer3.12.bn2.running_var", "layer3.12.bn2.weight", "layer3.12.bn2.bias", "layer3.13.conv2.weight", "layer3.13.bn2.running_mean", "layer3.13.bn2.running_var", "layer3.13.bn2.weight", "layer3.13.bn2.bias", "layer3.14.conv2.weight", "layer3.14.bn2.running_mean", "layer3.14.bn2.running_var", "layer3.14.bn2.weight", "layer3.14.bn2.bias", "layer3.15.conv2.weight", "layer3.15.bn2.running_mean", "layer3.15.bn2.running_var", "layer3.15.bn2.weight", "layer3.15.bn2.bias", "layer3.16.conv2.weight", "layer3.16.bn2.running_mean", "layer3.16.bn2.running_var", "layer3.16.bn2.weight", "layer3.16.bn2.bias", "layer3.17.conv2.weight", "layer3.17.bn2.running_mean", "layer3.17.bn2.running_var", "layer3.17.bn2.weight", "layer3.17.bn2.bias", "layer3.18.conv2.weight", "layer3.18.bn2.running_mean", "layer3.18.bn2.running_var", "layer3.18.bn2.weight", "layer3.18.bn2.bias", "layer3.19.conv2.weight", "layer3.19.bn2.running_mean", "layer3.19.bn2.running_var", "layer3.19.bn2.weight", "layer3.19.bn2.bias", "layer3.20.conv2.weight", "layer3.20.bn2.running_mean", "layer3.20.bn2.running_var", "layer3.20.bn2.weight", "layer3.20.bn2.bias", "layer3.21.conv2.weight", "layer3.21.bn2.running_mean", "layer3.21.bn2.running_var", "layer3.21.bn2.weight", "layer3.21.bn2.bias", "layer3.22.conv2.weight", "layer3.22.bn2.running_mean", "layer3.22.bn2.running_var", "layer3.22.bn2.weight", "layer3.22.bn2.bias", "layer4.0.conv2.weight", "layer4.0.bn2.running_mean", "layer4.0.bn2.running_var", 
"layer4.0.bn2.weight", "layer4.0.bn2.bias", "layer4.1.conv2.weight", "layer4.1.bn2.running_mean", "layer4.1.bn2.running_var", "layer4.1.bn2.weight", "layer4.1.bn2.bias", "layer4.2.conv2.weight", "layer4.2.bn2.running_mean", "layer4.2.bn2.running_var", "layer4.2.bn2.weight", "layer4.2.bn2.bias". size mismatch for layer1.0.conv1.weight: copying a param with shape torch.Size([64, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([104, 64, 1, 1]). size mismatch for layer1.0.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.0.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.0.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.0.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.0.conv3.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 104, 1, 1]). size mismatch for layer1.1.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([104, 256, 1, 1]). size mismatch for layer1.1.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.1.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.1.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). 
size mismatch for layer1.1.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.1.conv3.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 104, 1, 1]). size mismatch for layer1.2.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([104, 256, 1, 1]). size mismatch for layer1.2.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.2.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.2.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.2.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([104]). size mismatch for layer1.2.conv3.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 104, 1, 1]). size mismatch for layer2.0.conv1.weight: copying a param with shape torch.Size([128, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([208, 256, 1, 1]). size mismatch for layer2.0.bn1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.0.bn1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.0.bn1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). 
size mismatch for layer2.0.bn1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.0.conv3.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 208, 1, 1]). size mismatch for layer2.1.conv1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([208, 512, 1, 1]). size mismatch for layer2.1.bn1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.1.bn1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.1.bn1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.1.bn1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.1.conv3.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 208, 1, 1]). size mismatch for layer2.2.conv1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([208, 512, 1, 1]). size mismatch for layer2.2.bn1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.2.bn1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.2.bn1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). 
size mismatch for layer2.2.bn1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.2.conv3.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 208, 1, 1]). size mismatch for layer2.3.conv1.weight: copying a param with shape torch.Size([128, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([208, 512, 1, 1]). size mismatch for layer2.3.bn1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.3.bn1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.3.bn1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.3.bn1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([208]). size mismatch for layer2.3.conv3.weight: copying a param with shape torch.Size([512, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 208, 1, 1]). size mismatch for layer3.0.conv1.weight: copying a param with shape torch.Size([256, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 512, 1, 1]). size mismatch for layer3.0.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.0.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.0.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.0.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.0.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.1.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.1.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.1.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.1.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.1.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.1.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.2.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.2.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.2.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.2.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.2.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.2.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.3.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.3.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.3.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.3.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.3.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.3.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.4.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.4.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.4.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.4.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.4.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.4.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.5.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.5.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.5.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.5.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.5.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.5.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.6.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.6.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.6.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.6.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.6.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.6.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.7.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.7.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.7.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.7.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.7.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.7.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.8.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.8.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.8.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.8.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.8.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.8.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.9.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.9.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.9.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.9.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.9.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.9.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.10.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.10.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.10.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.10.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.10.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.10.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.11.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.11.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.11.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.11.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.11.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.11.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.12.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.12.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.12.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.12.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.12.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.12.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.13.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.13.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.13.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.13.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.13.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.13.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.14.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.14.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.14.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.14.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.14.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.14.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.15.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.15.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.15.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.15.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.15.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.15.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.16.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.16.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.16.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.16.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.16.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.16.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.17.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.17.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.17.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.17.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.17.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.17.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.18.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.18.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.18.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.18.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.18.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.18.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.19.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.19.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.19.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.19.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.19.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.19.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.20.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.20.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.20.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.20.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.20.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.20.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.21.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.21.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.21.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.21.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.21.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.21.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer3.22.conv1.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([416, 1024, 1, 1]). size mismatch for layer3.22.bn1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.22.bn1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.22.bn1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). 
size mismatch for layer3.22.bn1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([416]). size mismatch for layer3.22.conv3.weight: copying a param with shape torch.Size([1024, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 416, 1, 1]). size mismatch for layer4.0.conv1.weight: copying a param with shape torch.Size([512, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([832, 1024, 1, 1]). size mismatch for layer4.0.bn1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.0.bn1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.0.bn1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.0.bn1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.0.conv3.weight: copying a param with shape torch.Size([2048, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([2048, 832, 1, 1]). size mismatch for layer4.1.conv1.weight: copying a param with shape torch.Size([512, 2048, 1, 1]) from checkpoint, the shape in current model is torch.Size([832, 2048, 1, 1]). size mismatch for layer4.1.bn1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.1.bn1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.1.bn1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). 
size mismatch for layer4.1.bn1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.1.conv3.weight: copying a param with shape torch.Size([2048, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([2048, 832, 1, 1]). size mismatch for layer4.2.conv1.weight: copying a param with shape torch.Size([512, 2048, 1, 1]) from checkpoint, the shape in current model is torch.Size([832, 2048, 1, 1]). size mismatch for layer4.2.bn1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.2.bn1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.2.bn1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.2.bn1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([832]). size mismatch for layer4.2.conv3.weight: copying a param with shape torch.Size([2048, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([2048, 832, 1, 1]).
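
The 256 vs. 416 and 512 vs. 832 pairs in this log are consistent with a plain ResNet-style checkpoint being loaded into a 26w x 4s Res2Net model. Here is a minimal sketch of the arithmetic, assuming the width formula from the reference Res2Net bottleneck block (the first 1x1 conv outputs `floor(planes * baseWidth / 64) * scale` channels). The function name and the stage table are hypothetical and only for illustration.

```python
import math


def res2net_conv1_channels(planes, base_width=26, scale=4):
    """Output channels of the first 1x1 conv in a Res2Net bottleneck block.

    Assumes width = floor(planes * base_width / 64), repeated `scale` times.
    """
    width = int(math.floor(planes * (base_width / 64.0)))
    return width * scale


# A plain ResNet bottleneck uses `planes` channels here; compare both per stage.
for stage, planes in [("layer1", 64), ("layer2", 128), ("layer3", 256), ("layer4", 512)]:
    print(stage, "ResNet:", planes, "Res2Net 26w x 4s:", res2net_conv1_channels(planes))
```

Under that assumption, layer3 gives 416 and layer4 gives 832, which are the "current model" shapes reported above, while 256 and 512 are the channel counts a standard ResNet bottleneck would store.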

Hi, I have the same issue. Can you resolve it?

Please make sure that you are using Res2Net instead of ResNet. If you provide the code that reproduces this error, it will help me solve the problem.
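
One way to narrow a report like this down before posting a reproduction is to compare parameter names and shapes directly. The sketch below is not from this repo. `compare_shapes` is a hypothetical helper that works on plain `{name: shape}` dictionaries, so it runs without PyTorch. The `convs.0` shape in the toy data is illustrative rather than taken from the log.

```python
def compare_shapes(model_shapes, ckpt_shapes):
    """Return (missing, unexpected, mismatched) for two {name: shape} dicts."""
    missing = sorted(k for k in model_shapes if k not in ckpt_shapes)
    unexpected = sorted(k for k in ckpt_shapes if k not in model_shapes)
    mismatched = {
        k: (tuple(ckpt_shapes[k]), tuple(model_shapes[k]))
        for k in model_shapes
        if k in ckpt_shapes and tuple(ckpt_shapes[k]) != tuple(model_shapes[k])
    }
    return missing, unexpected, mismatched


# Toy data: the conv3 shapes come from the error log, the convs.0 shape is illustrative.
model_shapes = {
    "layer3.2.conv3.weight": (1024, 416, 1, 1),
    "layer1.0.convs.0.weight": (26, 26, 3, 3),
}
ckpt_shapes = {"layer3.2.conv3.weight": (1024, 256, 1, 1)}

missing, unexpected, mismatched = compare_shapes(model_shapes, ckpt_shapes)
print("missing:", missing)
print("unexpected:", unexpected)
print("mismatched (checkpoint, model):", mismatched)

# With PyTorch, the two dicts could be built roughly like this (not run here):
#   model_shapes = {k: tuple(v.shape) for k, v in model.state_dict().items()}
#   ckpt_shapes = {k: tuple(v.shape) for k, v in torch.load(path, map_location="cpu").items()}
```

If most keys are missing or mismatched, the model definition and the checkpoint probably belong to different architectures, which is what the reply above suggests checking.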

I had the same problem, but I have solved it. If you don't change the .yaml file, in a word, it works. I added TEST_PERIOD: 10000 to the model config, and it doesn't work. I am still thinking about this problem. If you solve it, please contact me.

I am confused. We don't have any yaml file in this repo.

https://github.com/Res2Net/Res2Net-maskrcnn/blob/master/configs/pytorch_mask_rcnn_R2_50_s4_FPN_2x.yaml

That is the Res2Net-maskrcnn repo. What do you mean by "i add TEST_PERIOD: 10000 model.it don't work"? I don't quite understand your problem. You can contact me by email in Chinese if you like.

I eventually found that I had made a very silly mistake:

MODEL:
  META_ARCHITECTURE: "GeneralizedRCNN"
  WEIGHT: "/root/data/res2net50_26w_4s-06e79181.pth"
  BACKBONE:
    CONV_BODY: "R2-50-FPN"
  RESNETS:
    BACKBONE_OUT_CHANNELS: 256
    WIDTH_PER_GROUP: 26
    SCALE: 8
    TRANS_FUNC: "Bottle2neckWithFixedBatchNorm"

Here, the width and scale must match res2net50_26w_4s-06e79181.pth. However, when I test res2net50_v1b_26w_4s-3cf99910.pth and change the width and scale, the following error occurs:

Traceback (most recent call last):
  File "tools/train_net.py", line 178, in <module>
    main()
  File "tools/train_net.py", line 171, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "tools/train_net.py", line 56, in train
    extra_checkpoint_data = checkpointer.load(cfg.MODEL.WEIGHT)
  File "/root/data/Res2Net-maskrcnn/maskrcnn_benchmark/utils/checkpoint.py", line 62, in load
    self._load_model(checkpoint)
  File "/root/data/Res2Net-maskrcnn/maskrcnn_benchmark/utils/checkpoint.py", line 98, in _load_model
    load_state_dict(self.model, checkpoint.pop("model"))
  File "/root/data/Res2Net-maskrcnn/maskrcnn_benchmark/utils/model_serialization.py", line 80, in load_state_dict
    model.load_state_dict(model_state_dict)
  File "/root/anaconda3/envs/maskrcnn/lib/python3.7/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for GeneralizedRCNN:
    size mismatch for backbone.body.layer1.0.downsample.1.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([256]).
    size mismatch for backbone.body.layer2.0.downsample.1.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for backbone.body.layer3.0.downsample.1.weight: copying a param with shape torch.Size([1024, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024]).
    size mismatch for backbone.body.layer4.0.downsample.1.weight: copying a param with shape torch.Size([2048, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([2048]).
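
In this traceback the checkpoint stores a 4-D convolution weight at `downsample.1`, while the model expects a 1-D normalization weight at the same index. That pattern would arise if the v1b checkpoint's shortcut branch is laid out as pooling, conv, norm while the model being built uses conv, norm. This reading is an assumption and is not confirmed anywhere in this thread. A small sketch that guesses the layout from the shape alone follows; `downsample_layout` and its `prefix` argument are hypothetical names.

```python
def downsample_layout(shapes, prefix="backbone.body."):
    """Guess the shortcut layout from the shape of layer1.0.downsample.1.weight.

    `shapes` is a {parameter_name: shape_tuple} dict.
    """
    shape = shapes.get(prefix + "layer1.0.downsample.1.weight")
    if shape is None:
        return "unknown"
    # 4-D weight: index 1 is a convolution, so another layer sits at index 0.
    # 1-D weight: index 1 is a norm layer that directly follows a conv at index 0.
    return "pool-conv-norm" if len(shape) == 4 else "conv-norm"


# Shapes copied from the error message above.
checkpoint_side = {"backbone.body.layer1.0.downsample.1.weight": (256, 64, 1, 1)}
model_side = {"backbone.body.layer1.0.downsample.1.weight": (256,)}

print("checkpoint:", downsample_layout(checkpoint_side))
print("model:", downsample_layout(model_side))
```

If the two sides disagree, changing only the width and scale in the config would not be enough, because the backbone definition itself would also need the matching shortcut layout.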

Res2Net / Res2Net-PretrainedModels

Loading State Dict #4