size mismatch for layer1.0.conv1.weight: copying a param with shape torch.Size([64, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 128, 1, 1]).
size mismatch for layer1.0.downsample.0.weight: copying a param with shape torch.Size([64, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 128, 1, 1]).
size mismatch for layer1.0.downsample.1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer1.0.downsample.1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer1.0.downsample.1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer1.0.downsample.1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for layer1.1.conv1.weight: copying a param with shape torch.Size([64, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 256, 1, 1]).
size mismatch for layer2.0.conv1.weight: copying a param with shape torch.Size([128, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 256, 1, 1]).
size mismatch for layer2.0.downsample.0.weight: copying a param with shape torch.Size([128, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 256, 1, 1]).
size mismatch for layer2.0.downsample.1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer2.0.downsample.1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer2.0.downsample.1.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer2.0.downsample.1.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for layer2.1.conv1.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 512, 1, 1]).
size mismatch for layer3.0.conv1.weight: copying a param with shape torch.Size([256, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 512, 1, 1]).
size mismatch for layer3.0.downsample.0.weight: copying a param with shape torch.Size([256, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 512, 1, 1]).
size mismatch for layer3.0.downsample.1.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for layer3.0.downsample.1.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for layer3.0.downsample.1.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for layer3.0.downsample.1.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for layer3.1.conv1.weight: copying a param with shape torch.Size([256, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 1024, 1, 1]).
size mismatch for layer4.0.conv1.weight: copying a param with shape torch.Size([512, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 1024, 1, 1]).
size mismatch for layer4.0.downsample.0.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([2048, 1024, 1, 1]).
size mismatch for layer4.0.downsample.1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
size mismatch for layer4.0.downsample.1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
size mismatch for layer4.0.downsample.1.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
size mismatch for layer4.0.downsample.1.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([2048]).
size mismatch for layer4.1.conv1.weight: copying a param with shape torch.Size([512, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 2048, 1, 1]).
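As far as I can tell, every mismatch above follows the same pattern: the checkpoint stores 3x3 `conv1` weights with no channel expansion (the BasicBlock layout used by ResNet-18/34), while the model being constructed expects 1x1 `conv1` weights with 4x channel expansion (the Bottleneck layout used by ResNet-50 and deeper). Below is a minimal sketch for diffing the checkpoint against the model; `checkpoint.pth` is a placeholder path and torchvision's `resnet18` only stands in for whatever constructor the training code actually calls:

```python
import torch
from torchvision.models import resnet18  # stand-in for the model the training code builds

# "checkpoint.pth" is a placeholder for the actual checkpoint file.
ckpt = torch.load("checkpoint.pth", map_location="cpu")
ckpt_state = ckpt.get("state_dict", ckpt)  # unwrap if the weights are nested under "state_dict"

model_state = resnet18().state_dict()

# Keys the model does not define -> "Unexpected key(s)";
# keys whose tensors disagree in shape -> "size mismatch for ...".
for name, tensor in ckpt_state.items():
    if name not in model_state:
        print("unexpected key:", name)
    elif tensor.shape != model_state[name].shape:
        print("size mismatch:", name, tuple(tensor.shape), "->", tuple(model_state[name].shape))
```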
I want to train with ResNet-18, so why does this error occur? In addition to the size mismatches above, I also get:

Unexpected key(s) in state_dict: "bn1.num_batches_tracked", "bn1._tmp_running_mean", "bn1._tmp_running_var", "bn1._running_iter", "bn2.num_batches_tracked", "bn2._tmp_running_mean", "bn2._tmp_running_var", "bn2._running_iter", "bn3.num_batches_tracked", "bn3._tmp_running_mean", "bn3._tmp_running_var", "bn3._running_iter", "layer1.0.bn1.num_batches_tracked", "layer1.0.bn1._tmp_running_mean", "layer1.0.bn1._tmp_running_var", "layer1.0.bn1._running_iter", "layer1.0.bn2.num_batches_tracked", "layer1.0.bn2._tmp_running_mean", "layer1.0.bn2._tmp_running_var", "layer1.0.bn2._running_iter", "layer1.0.downsample.1.num_batches_tracked", "layer1.0.downsample.1._tmp_running_mean", "layer1.0.downsample.1._tmp_running_var", "layer1.0.downsample.1._running_iter", "layer1.1.bn1.num_batches_tracked", "layer1.1.bn1._tmp_running_mean", "layer1.1.bn1._tmp_running_var", "layer1.1.bn1._running_iter", "layer1.1.bn2.num_batches_tracked", "layer1.1.bn2._tmp_running_mean", "layer1.1.bn2._tmp_running_var", "layer1.1.bn2._running_iter", "layer2.0.bn1.num_batches_tracked", "layer2.0.bn1._tmp_running_mean", "layer2.0.bn1._tmp_running_var", "layer2.0.bn1._running_iter", "layer2.0.bn2.num_batches_tracked", "layer2.0.bn2._tmp_running_mean", "layer2.0.bn2._tmp_running_var", "layer2.0.bn2._running_iter", "layer2.0.downsample.1.num_batches_tracked", "layer2.0.downsample.1._tmp_running_mean", "layer2.0.downsample.1._tmp_running_var", "layer2.0.downsample.1._running_iter", "layer2.1.bn1.num_batches_tracked", "layer2.1.bn1._tmp_running_mean", "layer2.1.bn1._tmp_running_var", "layer2.1.bn1._running_iter", "layer2.1.bn2.num_batches_tracked", "layer2.1.bn2._tmp_running_mean", "layer2.1.bn2._tmp_running_var", "layer2.1.bn2._running_iter", "layer3.0.bn1.num_batches_tracked", "layer3.0.bn1._tmp_running_mean", "layer3.0.bn1._tmp_running_var", "layer3.0.bn1._running_iter", "layer3.0.bn2.num_batches_tracked", "layer3.0.bn2._tmp_running_mean", "layer3.0.bn2._tmp_running_var", "layer3.0.bn2._running_iter", "layer3.0.downsample.1.num_batches_tracked", "layer3.0.downsample.1._tmp_running_mean", "layer3.0.downsample.1._tmp_running_var", "layer3.0.downsample.1._running_iter", "layer3.1.bn1.num_batches_tracked", "layer3.1.bn1._tmp_running_mean", "layer3.1.bn1._tmp_running_var", "layer3.1.bn1._running_iter", "layer3.1.bn2.num_batches_tracked", "layer3.1.bn2._tmp_running_mean", "layer3.1.bn2._tmp_running_var", "layer3.1.bn2._running_iter", "layer4.0.bn1.num_batches_tracked", "layer4.0.bn1._tmp_running_mean", "layer4.0.bn1._tmp_running_var", "layer4.0.bn1._running_iter", "layer4.0.bn2.num_batches_tracked", "layer4.0.bn2._tmp_running_mean", "layer4.0.bn2._tmp_running_var", "layer4.0.bn2._running_iter", "layer4.0.downsample.1.num_batches_tracked", "layer4.0.downsample.1._tmp_running_mean", "layer4.0.downsample.1._tmp_running_var", "layer4.0.downsample.1._running_iter", "layer4.1.bn1.num_batches_tracked", "layer4.1.bn1._tmp_running_mean", "layer4.1.bn1._tmp_running_var", "layer4.1.bn1._running_iter", "layer4.1.bn2.num_batches_tracked", "layer4.1.bn2._tmp_running_mean", "layer4.1.bn2._tmp_running_var", "layer4.1.bn2._running_iter".
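About those unexpected keys: `num_batches_tracked` is a standard BatchNorm buffer, but `_tmp_running_mean`, `_tmp_running_var` and `_running_iter` are not attributes of `torch.nn.BatchNorm2d`, so they presumably come from a customized BatchNorm in whatever code produced the checkpoint. Here is a minimal sketch for dropping those extra buffers and loading non-strictly (again, `checkpoint.pth` is a placeholder and `resnet18` stands in for the real model). Note that `strict=False` only skips missing/unexpected keys; it does not silence the size mismatches above, so the block layout still has to match:

```python
import torch
from torchvision.models import resnet18  # stand-in for the actual model

ckpt = torch.load("checkpoint.pth", map_location="cpu")  # placeholder path
ckpt_state = ckpt.get("state_dict", ckpt)

# Drop the buffers that only exist in the customized BatchNorm used when the
# checkpoint was saved. num_batches_tracked is standard, so it is kept; with
# strict=False it is merely reported as unexpected if the target layers lack it.
extra = ("_tmp_running_mean", "_tmp_running_var", "_running_iter")
filtered = {k: v for k, v in ckpt_state.items() if not k.endswith(extra)}

model = resnet18()
missing, unexpected = model.load_state_dict(filtered, strict=False)
print("missing keys   :", missing)
print("unexpected keys:", unexpected)
```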