zhanghang1989 / PyTorch-Encoding

A CV toolkit for my papers.
https://hangzhang.org/PyTorch-Encoding/
MIT License

train_dist #401

Open soroushmbk opened 3 years ago

soroushmbk commented 3 years ago

Hi, I trained the atten model using the train_dist file. When I try to test with the trained atten model, the following errors occur. Can you help me?

RuntimeError: Error(s) in loading state_dict for ATTEN: Missing key(s) in state_dict: "pretrained.layer1.0.conv2.weight", "pretrained.layer1.0.bn2.weight", "pretrained.layer1.0.bn2.bias", "pretrained.layer1.0.bn2.running_mean", "pretrained.layer1.0.bn2.running_var", "pretrained.layer1.0.downsample.0.weight", "pretrained.layer1.0.downsample.1.bias", "pretrained.layer1.0.downsample.1.running_mean", "pretrained.layer1.0.downsample.1.running_var", "pretrained.layer1.1.conv2.weight", "pretrained.layer1.1.bn2.weight", "pretrained.layer1.1.bn2.bias", "pretrained.layer1.1.bn2.running_mean", "pretrained.layer1.1.bn2.running_var", "pretrained.layer1.2.conv2.weight", "pretrained.layer1.2.bn2.weight", "pretrained.layer1.2.bn2.bias", "pretrained.layer1.2.bn2.running_mean", "pretrained.layer1.2.bn2.running_var", "pretrained.layer2.0.conv2.weight", "pretrained.layer2.0.bn2.weight", "pretrained.layer2.0.bn2.bias", "pretrained.layer2.0.bn2.running_mean", "pretrained.layer2.0.bn2.running_var", "pretrained.layer2.0.downsample.0.weight", "pretrained.layer2.0.downsample.1.bias", "pretrained.layer2.0.downsample.1.running_mean", "pretrained.layer2.0.downsample.1.running_var", "pretrained.layer2.1.conv2.weight", "pretrained.layer2.1.bn2.weight", "pretrained.layer2.1.bn2.bias", "pretrained.layer2.1.bn2.running_mean", "pretrained.layer2.1.bn2.running_var", "pretrained.layer2.2.conv2.weight", "pretrained.layer2.2.bn2.weight", "pretrained.layer2.2.bn2.bias", "pretrained.layer2.2.bn2.running_mean", "pretrained.layer2.2.bn2.running_var", "pretrained.layer2.3.conv2.weight", "pretrained.layer2.3.bn2.weight", "pretrained.layer2.3.bn2.bias", "pretrained.layer2.3.bn2.running_mean", "pretrained.layer2.3.bn2.running_var", "pretrained.layer3.0.conv2.weight", "pretrained.layer3.0.bn2.weight", "pretrained.layer3.0.bn2.bias", 
"pretrained.layer3.0.bn2.running_mean", "pretrained.layer3.0.bn2.running_var", "pretrained.layer3.0.downsample.0.weight", "pretrained.layer3.0.downsample.1.bias", "pretrained.layer3.0.downsample.1.running_mean", "pretrained.layer3.0.downsample.1.running_var", "pretrained.layer3.1.conv2.weight", "pretrained.layer3.1.bn2.weight", "pretrained.layer3.1.bn2.bias", "pretrained.layer3.1.bn2.running_mean", "pretrained.layer3.1.bn2.running_var", "pretrained.layer3.2.conv2.weight", "pretrained.layer3.2.bn2.weight", "pretrained.layer3.2.bn2.bias", "pretrained.layer3.2.bn2.running_mean", "pretrained.layer3.2.bn2.running_var", "pretrained.layer3.3.conv2.weight", "pretrained.layer3.3.bn2.weight", "pretrained.layer3.3.bn2.bias", "pretrained.layer3.3.bn2.running_mean", "pretrained.layer3.3.bn2.running_var", "pretrained.layer3.4.conv2.weight", "pretrained.layer3.4.bn2.weight", "pretrained.layer3.4.bn2.bias", "pretrained.layer3.4.bn2.running_mean", "pretrained.layer3.4.bn2.running_var", "pretrained.layer3.5.conv2.weight", "pretrained.layer3.5.bn2.weight", "pretrained.layer3.5.bn2.bias", "pretrained.layer3.5.bn2.running_mean", "pretrained.layer3.5.bn2.running_var", "pretrained.layer4.0.conv2.weight", "pretrained.layer4.0.bn2.weight", "pretrained.layer4.0.bn2.bias", "pretrained.layer4.0.bn2.running_mean", "pretrained.layer4.0.bn2.running_var", "pretrained.layer4.0.downsample.0.weight", "pretrained.layer4.0.downsample.1.bias", "pretrained.layer4.0.downsample.1.running_mean", "pretrained.layer4.0.downsample.1.running_var", "pretrained.layer4.1.conv2.weight", "pretrained.layer4.1.bn2.weight", "pretrained.layer4.1.bn2.bias", "pretrained.layer4.1.bn2.running_mean", "pretrained.layer4.1.bn2.running_var", "pretrained.layer4.2.conv2.weight", "pretrained.layer4.2.bn2.weight", "pretrained.layer4.2.bn2.bias", "pretrained.layer4.2.bn2.running_mean", "pretrained.layer4.2.bn2.running_var". 
Unexpected key(s) in state_dict: "pretrained.layer1.0.conv2.conv.weight", "pretrained.layer1.0.conv2.bn0.weight", "pretrained.layer1.0.conv2.bn0.bias", "pretrained.layer1.0.conv2.bn0.running_mean", "pretrained.layer1.0.conv2.bn0.running_var", "pretrained.layer1.0.conv2.bn0.num_batches_tracked", "pretrained.layer1.0.conv2.fc1.weight", "pretrained.layer1.0.conv2.fc1.bias", "pretrained.layer1.0.conv2.bn1.weight", "pretrained.layer1.0.conv2.bn1.bias", "pretrained.layer1.0.conv2.bn1.running_mean", "pretrained.layer1.0.conv2.bn1.running_var", "pretrained.layer1.0.conv2.bn1.num_batches_tracked", "pretrained.layer1.0.conv2.fc2.weight", "pretrained.layer1.0.conv2.fc2.bias", "pretrained.layer1.0.downsample.2.weight", "pretrained.layer1.0.downsample.2.bias", "pretrained.layer1.0.downsample.2.running_mean", "pretrained.layer1.0.downsample.2.running_var", "pretrained.layer1.0.downsample.2.num_batches_tracked", "pretrained.layer1.1.conv2.conv.weight", "pretrained.layer1.1.conv2.bn0.weight", "pretrained.layer1.1.conv2.bn0.bias", "pretrained.layer1.1.conv2.bn0.running_mean", "pretrained.layer1.1.conv2.bn0.running_var", "pretrained.layer1.1.conv2.bn0.num_batches_tracked", "pretrained.layer1.1.conv2.fc1.weight", "pretrained.layer1.1.conv2.fc1.bias", "pretrained.layer1.1.conv2.bn1.weight", "pretrained.layer1.1.conv2.bn1.bias", "pretrained.layer1.1.conv2.bn1.running_mean", "pretrained.layer1.1.conv2.bn1.running_var", "pretrained.layer1.1.conv2.bn1.num_batches_tracked", "pretrained.layer1.1.conv2.fc2.weight", "pretrained.layer1.1.conv2.fc2.bias", "pretrained.layer1.2.conv2.conv.weight", "pretrained.layer1.2.conv2.bn0.weight", "pretrained.layer1.2.conv2.bn0.bias", "pretrained.layer1.2.conv2.bn0.running_mean", "pretrained.layer1.2.conv2.bn0.running_var", "pretrained.layer1.2.conv2.bn0.num_batches_tracked", "pretrained.layer1.2.conv2.fc1.weight", "pretrained.layer1.2.conv2.fc1.bias", "pretrained.layer1.2.conv2.bn1.weight", "pretrained.layer1.2.conv2.bn1.bias", 
"pretrained.layer1.2.conv2.bn1.running_mean", "pretrained.layer1.2.conv2.bn1.running_var", "pretrained.layer1.2.conv2.bn1.num_batches_tracked", "pretrained.layer1.2.conv2.fc2.weight", "pretrained.layer1.2.conv2.fc2.bias", "pretrained.layer2.0.conv2.conv.weight", "pretrained.layer2.0.conv2.bn0.weight", "pretrained.layer2.0.conv2.bn0.bias", "pretrained.layer2.0.conv2.bn0.running_mean", "pretrained.layer2.0.conv2.bn0.running_var", "pretrained.layer2.0.conv2.bn0.num_batches_tracked", "pretrained.layer2.0.conv2.fc1.weight", "pretrained.layer2.0.conv2.fc1.bias", "pretrained.layer2.0.conv2.bn1.weight", "pretrained.layer2.0.conv2.bn1.bias", "pretrained.layer2.0.conv2.bn1.running_mean", "pretrained.layer2.0.conv2.bn1.running_var", "pretrained.layer2.0.conv2.bn1.num_batches_tracked", "pretrained.layer2.0.conv2.fc2.weight", "pretrained.layer2.0.conv2.fc2.bias", "pretrained.layer2.0.downsample.2.weight", "pretrained.layer2.0.downsample.2.bias", "pretrained.layer2.0.downsample.2.running_mean", "pretrained.layer2.0.downsample.2.running_var", "pretrained.layer2.0.downsample.2.num_batches_tracked", "pretrained.layer2.1.conv2.conv.weight", "pretrained.layer2.1.conv2.bn0.weight", "pretrained.layer2.1.conv2.bn0.bias", "pretrained.layer2.1.conv2.bn0.running_mean", "pretrained.layer2.1.conv2.bn0.running_var", "pretrained.layer2.1.conv2.bn0.num_batches_tracked", "pretrained.layer2.1.conv2.fc1.weight", "pretrained.layer2.1.conv2.fc1.bias", "pretrained.layer2.1.conv2.bn1.weight", "pretrained.layer2.1.conv2.bn1.bias", "pretrained.layer2.1.conv2.bn1.running_mean", "pretrained.layer2.1.conv2.bn1.running_var", "pretrained.layer2.1.conv2.bn1.num_batches_tracked", "pretrained.layer2.1.conv2.fc2.weight", "pretrained.layer2.1.conv2.fc2.bias", "pretrained.layer2.2.conv2.conv.weight", "pretrained.layer2.2.conv2.bn0.weight", "pretrained.layer2.2.conv2.bn0.bias", "pretrained.layer2.2.conv2.bn0.running_mean", "pretrained.layer2.2.conv2.bn0.running_var", 
"pretrained.layer2.2.conv2.bn0.num_batches_tracked", "pretrained.layer2.2.conv2.fc1.weight", "pretrained.layer2.2.conv2.fc1.bias", "pretrained.layer2.2.conv2.bn1.weight", "pretrained.layer2.2.conv2.bn1.bias", "pretrained.layer2.2.conv2.bn1.running_mean", "pretrained.layer2.2.conv2.bn1.running_var", "pretrained.layer2.2.conv2.bn1.num_batches_tracked", "pretrained.layer2.2.conv2.fc2.weight", "pretrained.layer2.2.conv2.fc2.bias", "pretrained.layer2.3.conv2.conv.weight", "pretrained.layer2.3.conv2.bn0.weight", "pretrained.layer2.3.conv2.bn0.bias", "pretrained.layer2.3.conv2.bn0.running_mean", "pretrained.layer2.3.conv2.bn0.running_var", "pretrained.layer2.3.conv2.bn0.num_batches_tracked", "pretrained.layer2.3.conv2.fc1.weight", "pretrained.layer2.3.conv2.fc1.bias", "pretrained.layer2.3.conv2.bn1.weight", "pretrained.layer2.3.conv2.bn1.bias", "pretrained.layer2.3.conv2.bn1.running_mean", "pretrained.layer2.3.conv2.bn1.running_var", "pretrained.layer2.3.conv2.bn1.num_batches_tracked", "pretrained.layer2.3.conv2.fc2.weight", "pretrained.layer2.3.conv2.fc2.bias", "pretrained.layer3.0.conv2.conv.weight", "pretrained.layer3.0.conv2.bn0.weight", "pretrained.layer3.0.conv2.bn0.bias", "pretrained.layer3.0.conv2.bn0.running_mean", "pretrained.layer3.0.conv2.bn0.running_var", "pretrained.layer3.0.conv2.bn0.num_batches_tracked", "pretrained.layer3.0.conv2.fc1.weight", "pretrained.layer3.0.conv2.fc1.bias", "pretrained.layer3.0.conv2.bn1.weight", "pretrained.layer3.0.conv2.bn1.bias", "pretrained.layer3.0.conv2.bn1.running_mean", "pretrained.layer3.0.conv2.bn1.running_var", "pretrained.layer3.0.conv2.bn1.num_batches_tracked", "pretrained.layer3.0.conv2.fc2.weight", "pretrained.layer3.0.conv2.fc2.bias", "pretrained.layer3.0.downsample.2.weight", "pretrained.layer3.0.downsample.2.bias", "pretrained.layer3.0.downsample.2.running_mean", "pretrained.layer3.0.downsample.2.running_var", "pretrained.layer3.0.downsample.2.num_batches_tracked", "pretrained.layer3.1.conv2.conv.weight", 
"pretrained.layer3.1.conv2.bn0.weight", "pretrained.layer3.1.conv2.bn0.bias", "pretrained.layer3.1.conv2.bn0.running_mean", "pretrained.layer3.1.conv2.bn0.running_var", "pretrained.layer3.1.conv2.bn0.num_batches_tracked", "pretrained.layer3.1.conv2.fc1.weight", "pretrained.layer3.1.conv2.fc1.bias", "pretrained.layer3.1.conv2.bn1.weight", "pretrained.layer3.1.conv2.bn1.bias", "pretrained.layer3.1.conv2.bn1.running_mean", "pretrained.layer3.1.conv2.bn1.running_var", "pretrained.layer3.1.conv2.bn1.num_batches_tracked", "pretrained.layer3.1.conv2.fc2.weight", "pretrained.layer3.1.conv2.fc2.bias", "pretrained.layer3.2.conv2.conv.weight", "pretrained.layer3.2.conv2.bn0.weight", "pretrained.layer3.2.conv2.bn0.bias", "pretrained.layer3.2.conv2.bn0.running_mean", "pretrained.layer3.2.conv2.bn0.running_var", "pretrained.layer3.2.conv2.bn0.num_batches_tracked", "pretrained.layer3.2.conv2.fc1.weight", "pretrained.layer3.2.conv2.fc1.bias", "pretrained.layer3.2.conv2.bn1.weight", "pretrained.layer3.2.conv2.bn1.bias", "pretrained.layer3.2.conv2.bn1.running_mean", "pretrained.layer3.2.conv2.bn1.running_var", "pretrained.layer3.2.conv2.bn1.num_batches_tracked", "pretrained.layer3.2.conv2.fc2.weight", "pretrained.layer3.2.conv2.fc2.bias", "pretrained.layer3.3.conv2.conv.weight", "pretrained.layer3.3.conv2.bn0.weight", "pretrained.layer3.3.conv2.bn0.bias", "pretrained.layer3.3.conv2.bn0.running_mean", "pretrained.layer3.3.conv2.bn0.running_var", "pretrained.layer3.3.conv2.bn0.num_batches_tracked", "pretrained.layer3.3.conv2.fc1.weight", "pretrained.layer3.3.conv2.fc1.bias", "pretrained.layer3.3.conv2.bn1.weight", "pretrained.layer3.3.conv2.bn1.bias", "pretrained.layer3.3.conv2.bn1.running_mean", "pretrained.layer3.3.conv2.bn1.running_var", "pretrained.layer3.3.conv2.bn1.num_batches_tracked", "pretrained.layer3.3.conv2.fc2.weight", "pretrained.layer3.3.conv2.fc2.bias", "pretrained.layer3.4.conv2.conv.weight", "pretrained.layer3.4.conv2.bn0.weight", 
"pretrained.layer3.4.conv2.bn0.bias", "pretrained.layer3.4.conv2.bn0.running_mean", "pretrained.layer3.4.conv2.bn0.running_var", "pretrained.layer3.4.conv2.bn0.num_batches_tracked", "pretrained.layer3.4.conv2.fc1.weight", "pretrained.layer3.4.conv2.fc1.bias", "pretrained.layer3.4.conv2.bn1.weight", "pretrained.layer3.4.conv2.bn1.bias", "pretrained.layer3.4.conv2.bn1.running_mean", "pretrained.layer3.4.conv2.bn1.running_var", "pretrained.layer3.4.conv2.bn1.num_batches_tracked", "pretrained.layer3.4.conv2.fc2.weight", "pretrained.layer3.4.conv2.fc2.bias", "pretrained.layer3.5.conv2.conv.weight", "pretrained.layer3.5.conv2.bn0.weight", "pretrained.layer3.5.conv2.bn0.bias", "pretrained.layer3.5.conv2.bn0.running_mean", "pretrained.layer3.5.conv2.bn0.running_var", "pretrained.layer3.5.conv2.bn0.num_batches_tracked", "pretrained.layer3.5.conv2.fc1.weight", "pretrained.layer3.5.conv2.fc1.bias", "pretrained.layer3.5.conv2.bn1.weight", "pretrained.layer3.5.conv2.bn1.bias", "pretrained.layer3.5.conv2.bn1.running_mean", "pretrained.layer3.5.conv2.bn1.running_var", "pretrained.layer3.5.conv2.bn1.num_batches_tracked", "pretrained.layer3.5.conv2.fc2.weight", "pretrained.layer3.5.conv2.fc2.bias", "pretrained.layer4.0.conv2.conv.weight", "pretrained.layer4.0.conv2.bn0.weight", "pretrained.layer4.0.conv2.bn0.bias", "pretrained.layer4.0.conv2.bn0.running_mean", "pretrained.layer4.0.conv2.bn0.running_var", "pretrained.layer4.0.conv2.bn0.num_batches_tracked", "pretrained.layer4.0.conv2.fc1.weight", "pretrained.layer4.0.conv2.fc1.bias", "pretrained.layer4.0.conv2.bn1.weight", "pretrained.layer4.0.conv2.bn1.bias", "pretrained.layer4.0.conv2.bn1.running_mean", "pretrained.layer4.0.conv2.bn1.running_var", "pretrained.layer4.0.conv2.bn1.num_batches_tracked", "pretrained.layer4.0.conv2.fc2.weight", "pretrained.layer4.0.conv2.fc2.bias", "pretrained.layer4.0.downsample.2.weight", "pretrained.layer4.0.downsample.2.bias", "pretrained.layer4.0.downsample.2.running_mean", 
"pretrained.layer4.0.downsample.2.running_var", "pretrained.layer4.0.downsample.2.num_batches_tracked", "pretrained.layer4.1.conv2.conv.weight", "pretrained.layer4.1.conv2.bn0.weight", "pretrained.layer4.1.conv2.bn0.bias", "pretrained.layer4.1.conv2.bn0.running_mean", "pretrained.layer4.1.conv2.bn0.running_var", "pretrained.layer4.1.conv2.bn0.num_batches_tracked", "pretrained.layer4.1.conv2.fc1.weight", "pretrained.layer4.1.conv2.fc1.bias", "pretrained.layer4.1.conv2.bn1.weight", "pretrained.layer4.1.conv2.bn1.bias", "pretrained.layer4.1.conv2.bn1.running_mean", "pretrained.layer4.1.conv2.bn1.running_var", "pretrained.layer4.1.conv2.bn1.num_batches_tracked", "pretrained.layer4.1.conv2.fc2.weight", "pretrained.layer4.1.conv2.fc2.bias", "pretrained.layer4.2.conv2.conv.weight", "pretrained.layer4.2.conv2.bn0.weight", "pretrained.layer4.2.conv2.bn0.bias", "pretrained.layer4.2.conv2.bn0.running_mean", "pretrained.layer4.2.conv2.bn0.running_var", "pretrained.layer4.2.conv2.bn0.num_batches_tracked", "pretrained.layer4.2.conv2.fc1.weight", "pretrained.layer4.2.conv2.fc1.bias", "pretrained.layer4.2.conv2.bn1.weight", "pretrained.layer4.2.conv2.bn1.bias", "pretrained.layer4.2.conv2.bn1.running_mean", "pretrained.layer4.2.conv2.bn1.running_var", "pretrained.layer4.2.conv2.bn1.num_batches_tracked", "pretrained.layer4.2.conv2.fc2.weight", "pretrained.layer4.2.conv2.fc2.bias". size mismatch for pretrained.conv1.0.weight: copying a param with shape torch.Size([32, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 3, 3, 3]). size mismatch for pretrained.conv1.1.weight: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for pretrained.conv1.1.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([64]). 
size mismatch for pretrained.conv1.1.running_mean: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for pretrained.conv1.1.running_var: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for pretrained.conv1.3.weight: copying a param with shape torch.Size([32, 32, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for pretrained.conv1.4.weight: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for pretrained.conv1.4.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for pretrained.conv1.4.running_mean: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for pretrained.conv1.4.running_var: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for pretrained.conv1.6.weight: copying a param with shape torch.Size([64, 32, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 64, 3, 3]). size mismatch for pretrained.bn1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for pretrained.bn1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for pretrained.bn1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for pretrained.bn1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]). 
size mismatch for pretrained.layer1.0.conv1.weight: copying a param with shape torch.Size([64, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 128, 1, 1]). size mismatch for pretrained.layer1.0.downsample.1.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for pretrained.layer2.0.downsample.1.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for pretrained.layer3.0.downsample.1.weight: copying a param with shape torch.Size([1024, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for pretrained.layer4.0.downsample.1.weight: copying a param with shape torch.Size([2048, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([2048]).

zhanghang1989 commented 3 years ago

Did you use --resume PATH/TO/THE/MODEL?
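
Reading the error above, the checkpoint seems to come from a different backbone than the model built at test time: the checkpoint has keys such as conv2.conv, conv2.fc1 and conv2.bn0 with a 32-channel stem, while the current model expects plain conv2/bn2 keys and a 64-channel stem. A minimal sketch for condensing such a report before calling load_state_dict is shown below. The helper name diff_state_dicts and the toy dictionaries are hypothetical and not part of PyTorch-Encoding; on real objects the shape maps could be built with {k: tuple(v.shape) for k, v in model.state_dict().items()}, and how the weights are stored inside the checkpoint file is something to verify for your own run.

```python
from typing import Dict, List, Mapping, Tuple

Shape = Tuple[int, ...]

# Placeholder: the error prints no shapes for missing/unexpected keys,
# so this value is NOT taken from the log.
UNKNOWN: Shape = ()


def diff_state_dicts(
    checkpoint: Mapping[str, Shape], model: Mapping[str, Shape]
) -> Dict[str, List[str]]:
    """Group key differences into the three categories load_state_dict reports."""
    ckpt_keys, model_keys = set(checkpoint), set(model)
    return {
        # Keys the model expects but the checkpoint lacks.
        "missing": sorted(model_keys - ckpt_keys),
        # Keys the checkpoint has but the model does not define.
        "unexpected": sorted(ckpt_keys - model_keys),
        # Keys present on both sides whose shapes differ.
        "size_mismatch": sorted(
            k for k in ckpt_keys & model_keys
            if tuple(checkpoint[k]) != tuple(model[k])
        ),
    }


# Toy excerpt built from a few names and shapes quoted in the error above.
checkpoint_shapes = {
    "pretrained.conv1.0.weight": (32, 3, 3, 3),
    "pretrained.layer1.0.conv1.weight": (64, 64, 1, 1),
    "pretrained.layer1.0.conv2.conv.weight": UNKNOWN,
}
model_shapes = {
    "pretrained.conv1.0.weight": (64, 3, 3, 3),
    "pretrained.layer1.0.conv1.weight": (64, 128, 1, 1),
    "pretrained.layer1.0.conv2.weight": UNKNOWN,
}

report = diff_state_dicts(checkpoint_shapes, model_shapes)
for category, keys in report.items():
    print(f"{category}: {len(keys)}")
    for key in keys:
        print(f"  {key}")
```

If all three categories are non-empty, as here, the likely cause is that the test script was launched with different model or backbone arguments than the training run, so matching those arguments is worth checking before anything else.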