I get errors when trying to load the pretrained models. In particular, it looks like the state_dict provided for the small model was actually saved from the large model.
For the large model:
RuntimeError: Error(s) in loading state_dict for MobileNetV3:
Missing key(s) in state_dict: "classifier.0.weight", "classifier.0.bias", "classifier.3.weight", "classifier.3.bias".
Unexpected key(s) in state_dict: "classifier.1.weight", "classifier.1.bias", "classifier.5.weight", "classifier.5.bias".
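For the large model, only the classifier indices differ (checkpoint `classifier.1`/`classifier.5` versus model `classifier.0`/`classifier.3`), so the weights may simply be stored under shifted indices. A minimal sketch for testing that guess follows. The index mapping is an assumption read off the error message, not something confirmed by the repository, and `remap_classifier_keys` is a hypothetical helper name.

```python
from collections import OrderedDict

# ASSUMPTION: mapping inferred only from the error message above
# (checkpoint prefix -> prefix expected by the current model).
CLASSIFIER_PREFIX_MAP = {
    "classifier.1.": "classifier.0.",
    "classifier.5.": "classifier.3.",
}


def remap_classifier_keys(state_dict, prefix_map=CLASSIFIER_PREFIX_MAP):
    """Return a copy of state_dict with matching key prefixes renamed.

    Keys that match no prefix are copied unchanged; values are not touched.
    """
    remapped = OrderedDict()
    for key, value in state_dict.items():
        for old_prefix, new_prefix in prefix_map.items():
            if key.startswith(old_prefix):
                key = new_prefix + key[len(old_prefix):]
                break  # rename at most once per key
        remapped[key] = value
    return remapped
```

With PyTorch this would be used as `model.load_state_dict(remap_classifier_keys(torch.load(path)))`, where `path` stands for the checkpoint file. If the tensor shapes under the renamed keys do not match the model, `load_state_dict` will still raise, which would rule out the index-shift explanation.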
For the small model:
RuntimeError: Error(s) in loading state_dict for MobileNetV3:
Missing key(s) in state_dict: "features.1.conv.3.fc.0.weight", "features.1.conv.3.fc.0.bias", "features.1.conv.3.fc.2.weight", "features.1.conv.3.fc.2.bias", "features.7.conv.5.fc.0.weight", "features.7.conv.5.fc.0.bias", "features.7.conv.5.fc.2.weight", "features.7.conv.5.fc.2.bias", "features.8.conv.5.fc.0.weight", "features.8.conv.5.fc.0.bias", "features.8.conv.5.fc.2.weight", "features.8.conv.5.fc.2.bias", "features.9.conv.5.fc.0.weight", "features.9.conv.5.fc.0.bias", "features.9.conv.5.fc.2.weight", "features.9.conv.5.fc.2.bias", "features.10.conv.5.fc.0.weight", "features.10.conv.5.fc.0.bias", "features.10.conv.5.fc.2.weight", "features.10.conv.5.fc.2.bias", "conv.1.fc.0.weight", "conv.1.fc.0.bias", "conv.1.fc.2.weight", "conv.1.fc.2.bias", "classifier.0.weight", "classifier.0.bias", "classifier.1.running_mean", "classifier.1.running_var", "classifier.3.weight", "classifier.3.bias", "classifier.4.weight", "classifier.4.bias", "classifier.4.running_mean", "classifier.4.running_var".
Unexpected key(s) in state_dict: "features.12.conv.0.weight", "features.12.conv.1.weight", "features.12.conv.1.bias", "features.12.conv.1.running_mean", "features.12.conv.1.running_var", "features.12.conv.1.num_batches_tracked", "features.12.conv.3.weight", "features.12.conv.4.weight", "features.12.conv.4.bias", "features.12.conv.4.running_mean", "features.12.conv.4.running_var", "features.12.conv.4.num_batches_tracked", "features.12.conv.5.fc.0.weight", "features.12.conv.5.fc.0.bias", "features.12.conv.5.fc.2.weight", "features.12.conv.5.fc.2.bias", "features.12.conv.7.weight", "features.12.conv.8.weight", "features.12.conv.8.bias", "features.12.conv.8.running_mean", "features.12.conv.8.running_var", "features.12.conv.8.num_batches_tracked", "features.13.conv.0.weight", "features.13.conv.1.weight", "features.13.conv.1.bias", "features.13.conv.1.running_mean", "features.13.conv.1.running_var", "features.13.conv.1.num_batches_tracked", "features.13.conv.3.weight", "features.13.conv.4.weight", "features.13.conv.4.bias", "features.13.conv.4.running_mean", "features.13.conv.4.running_var", "features.13.conv.4.num_batches_tracked", "features.13.conv.5.fc.0.weight", "features.13.conv.5.fc.0.bias", "features.13.conv.5.fc.2.weight", "features.13.conv.5.fc.2.bias", "features.13.conv.7.weight", "features.13.conv.8.weight", "features.13.conv.8.bias", "features.13.conv.8.running_mean", "features.13.conv.8.running_var", "features.13.conv.8.num_batches_tracked", "features.14.conv.0.weight", "features.14.conv.1.weight", "features.14.conv.1.bias", "features.14.conv.1.running_mean", "features.14.conv.1.running_var", "features.14.conv.1.num_batches_tracked", "features.14.conv.3.weight", "features.14.conv.4.weight", "features.14.conv.4.bias", "features.14.conv.4.running_mean", "features.14.conv.4.running_var", "features.14.conv.4.num_batches_tracked", "features.14.conv.5.fc.0.weight", "features.14.conv.5.fc.0.bias", "features.14.conv.5.fc.2.weight", "features.14.conv.5.fc.2.bias", 
"features.14.conv.7.weight", "features.14.conv.8.weight", "features.14.conv.8.bias", "features.14.conv.8.running_mean", "features.14.conv.8.running_var", "features.14.conv.8.num_batches_tracked", "features.15.conv.0.weight", "features.15.conv.1.weight", "features.15.conv.1.bias", "features.15.conv.1.running_mean", "features.15.conv.1.running_var", "features.15.conv.1.num_batches_tracked", "features.15.conv.3.weight", "features.15.conv.4.weight", "features.15.conv.4.bias", "features.15.conv.4.running_mean", "features.15.conv.4.running_var", "features.15.conv.4.num_batches_tracked", "features.15.conv.5.fc.0.weight", "features.15.conv.5.fc.0.bias", "features.15.conv.5.fc.2.weight", "features.15.conv.5.fc.2.bias", "features.15.conv.7.weight", "features.15.conv.8.weight", "features.15.conv.8.bias", "features.15.conv.8.running_mean", "features.15.conv.8.running_var", "features.15.conv.8.num_batches_tracked", "classifier.5.weight", "classifier.5.bias".
size mismatch for features.2.conv.0.weight: copying a param with shape torch.Size([64, 16, 1, 1]) from checkpoint, the shape in current model is torch.Size([72, 16, 1, 1]).
size mismatch for features.2.conv.1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([72]).
size mismatch for features.2.conv.1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([72]).
size mismatch for features.2.conv.1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([72]).
size mismatch for features.2.conv.1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([72]).
size mismatch for features.2.conv.3.weight: copying a param with shape torch.Size([64, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([72, 1, 3, 3]).
size mismatch for features.2.conv.4.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([72]).
size mismatch for features.2.conv.4.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([72]).
size mismatch for features.2.conv.4.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([72]).
size mismatch for features.2.conv.4.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([72]).
size mismatch for features.2.conv.7.weight: copying a param with shape torch.Size([24, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([24, 72, 1, 1]).
size mismatch for features.3.conv.0.weight: copying a param with shape torch.Size([72, 24, 1, 1]) from checkpoint, the shape in current model is torch.Size([88, 24, 1, 1]).
size mismatch for features.3.conv.1.weight: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([88]).
size mismatch for features.3.conv.1.bias: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([88]).
size mismatch for features.3.conv.1.running_mean: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([88]).
size mismatch for features.3.conv.1.running_var: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([88]).
size mismatch for features.3.conv.3.weight: copying a param with shape torch.Size([72, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([88, 1, 3, 3]).
size mismatch for features.3.conv.4.weight: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([88]).
size mismatch for features.3.conv.4.bias: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([88]).
size mismatch for features.3.conv.4.running_mean: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([88]).
size mismatch for features.3.conv.4.running_var: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([88]).
size mismatch for features.3.conv.7.weight: copying a param with shape torch.Size([24, 72, 1, 1]) from checkpoint, the shape in current model is torch.Size([24, 88, 1, 1]).
size mismatch for features.4.conv.0.weight: copying a param with shape torch.Size([72, 24, 1, 1]) from checkpoint, the shape in current model is torch.Size([96, 24, 1, 1]).
size mismatch for features.4.conv.1.weight: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.4.conv.1.bias: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.4.conv.1.running_mean: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.4.conv.1.running_var: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.4.conv.3.weight: copying a param with shape torch.Size([72, 1, 5, 5]) from checkpoint, the shape in current model is torch.Size([96, 1, 5, 5]).
size mismatch for features.4.conv.4.weight: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.4.conv.4.bias: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.4.conv.4.running_mean: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.4.conv.4.running_var: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.4.conv.5.fc.0.weight: copying a param with shape torch.Size([18, 72]) from checkpoint, the shape in current model is torch.Size([24, 96]).
size mismatch for features.4.conv.5.fc.0.bias: copying a param with shape torch.Size([18]) from checkpoint, the shape in current model is torch.Size([24]).
size mismatch for features.4.conv.5.fc.2.weight: copying a param with shape torch.Size([72, 18]) from checkpoint, the shape in current model is torch.Size([96, 24]).
size mismatch for features.4.conv.5.fc.2.bias: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.4.conv.7.weight: copying a param with shape torch.Size([40, 72, 1, 1]) from checkpoint, the shape in current model is torch.Size([40, 96, 1, 1]).
size mismatch for features.5.conv.0.weight: copying a param with shape torch.Size([120, 40, 1, 1]) from checkpoint, the shape in current model is torch.Size([240, 40, 1, 1]).
size mismatch for features.5.conv.1.weight: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.5.conv.1.bias: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.5.conv.1.running_mean: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.5.conv.1.running_var: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.5.conv.3.weight: copying a param with shape torch.Size([120, 1, 5, 5]) from checkpoint, the shape in current model is torch.Size([240, 1, 5, 5]).
size mismatch for features.5.conv.4.weight: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.5.conv.4.bias: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.5.conv.4.running_mean: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.5.conv.4.running_var: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.5.conv.5.fc.0.weight: copying a param with shape torch.Size([30, 120]) from checkpoint, the shape in current model is torch.Size([60, 240]).
size mismatch for features.5.conv.5.fc.0.bias: copying a param with shape torch.Size([30]) from checkpoint, the shape in current model is torch.Size([60]).
size mismatch for features.5.conv.5.fc.2.weight: copying a param with shape torch.Size([120, 30]) from checkpoint, the shape in current model is torch.Size([240, 60]).
size mismatch for features.5.conv.5.fc.2.bias: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.5.conv.7.weight: copying a param with shape torch.Size([40, 120, 1, 1]) from checkpoint, the shape in current model is torch.Size([40, 240, 1, 1]).
size mismatch for features.6.conv.0.weight: copying a param with shape torch.Size([120, 40, 1, 1]) from checkpoint, the shape in current model is torch.Size([240, 40, 1, 1]).
size mismatch for features.6.conv.1.weight: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.6.conv.1.bias: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.6.conv.1.running_mean: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.6.conv.1.running_var: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.6.conv.3.weight: copying a param with shape torch.Size([120, 1, 5, 5]) from checkpoint, the shape in current model is torch.Size([240, 1, 5, 5]).
size mismatch for features.6.conv.4.weight: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.6.conv.4.bias: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.6.conv.4.running_mean: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.6.conv.4.running_var: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.6.conv.5.fc.0.weight: copying a param with shape torch.Size([30, 120]) from checkpoint, the shape in current model is torch.Size([60, 240]).
size mismatch for features.6.conv.5.fc.0.bias: copying a param with shape torch.Size([30]) from checkpoint, the shape in current model is torch.Size([60]).
size mismatch for features.6.conv.5.fc.2.weight: copying a param with shape torch.Size([120, 30]) from checkpoint, the shape in current model is torch.Size([240, 60]).
size mismatch for features.6.conv.5.fc.2.bias: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]).
size mismatch for features.6.conv.7.weight: copying a param with shape torch.Size([40, 120, 1, 1]) from checkpoint, the shape in current model is torch.Size([40, 240, 1, 1]).
size mismatch for features.7.conv.0.weight: copying a param with shape torch.Size([240, 40, 1, 1]) from checkpoint, the shape in current model is torch.Size([120, 40, 1, 1]).
size mismatch for features.7.conv.1.weight: copying a param with shape torch.Size([240]) from checkpoint, the shape in current model is torch.Size([120]).
size mismatch for features.7.conv.1.bias: copying a param with shape torch.Size([240]) from checkpoint, the shape in current model is torch.Size([120]).
size mismatch for features.7.conv.1.running_mean: copying a param with shape torch.Size([240]) from checkpoint, the shape in current model is torch.Size([120]).
size mismatch for features.7.conv.1.running_var: copying a param with shape torch.Size([240]) from checkpoint, the shape in current model is torch.Size([120]).
size mismatch for features.7.conv.3.weight: copying a param with shape torch.Size([240, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([120, 1, 5, 5]).
size mismatch for features.7.conv.4.weight: copying a param with shape torch.Size([240]) from checkpoint, the shape in current model is torch.Size([120]).
size mismatch for features.7.conv.4.bias: copying a param with shape torch.Size([240]) from checkpoint, the shape in current model is torch.Size([120]).
size mismatch for features.7.conv.4.running_mean: copying a param with shape torch.Size([240]) from checkpoint, the shape in current model is torch.Size([120]).
size mismatch for features.7.conv.4.running_var: copying a param with shape torch.Size([240]) from checkpoint, the shape in current model is torch.Size([120]).
size mismatch for features.7.conv.7.weight: copying a param with shape torch.Size([80, 240, 1, 1]) from checkpoint, the shape in current model is torch.Size([48, 120, 1, 1]).
size mismatch for features.7.conv.8.weight: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([48]).
size mismatch for features.7.conv.8.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([48]).
size mismatch for features.7.conv.8.running_mean: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([48]).
size mismatch for features.7.conv.8.running_var: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([48]).
size mismatch for features.8.conv.0.weight: copying a param with shape torch.Size([200, 80, 1, 1]) from checkpoint, the shape in current model is torch.Size([144, 48, 1, 1]).
size mismatch for features.8.conv.1.weight: copying a param with shape torch.Size([200]) from checkpoint, the shape in current model is torch.Size([144]).
size mismatch for features.8.conv.1.bias: copying a param with shape torch.Size([200]) from checkpoint, the shape in current model is torch.Size([144]).
size mismatch for features.8.conv.1.running_mean: copying a param with shape torch.Size([200]) from checkpoint, the shape in current model is torch.Size([144]).
size mismatch for features.8.conv.1.running_var: copying a param with shape torch.Size([200]) from checkpoint, the shape in current model is torch.Size([144]).
size mismatch for features.8.conv.3.weight: copying a param with shape torch.Size([200, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([144, 1, 5, 5]).
size mismatch for features.8.conv.4.weight: copying a param with shape torch.Size([200]) from checkpoint, the shape in current model is torch.Size([144]).
size mismatch for features.8.conv.4.bias: copying a param with shape torch.Size([200]) from checkpoint, the shape in current model is torch.Size([144]).
size mismatch for features.8.conv.4.running_mean: copying a param with shape torch.Size([200]) from checkpoint, the shape in current model is torch.Size([144]).
size mismatch for features.8.conv.4.running_var: copying a param with shape torch.Size([200]) from checkpoint, the shape in current model is torch.Size([144]).
size mismatch for features.8.conv.7.weight: copying a param with shape torch.Size([80, 200, 1, 1]) from checkpoint, the shape in current model is torch.Size([48, 144, 1, 1]).
size mismatch for features.8.conv.8.weight: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([48]).
size mismatch for features.8.conv.8.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([48]).
size mismatch for features.8.conv.8.running_mean: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([48]).
size mismatch for features.8.conv.8.running_var: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([48]).
size mismatch for features.9.conv.0.weight: copying a param with shape torch.Size([184, 80, 1, 1]) from checkpoint, the shape in current model is torch.Size([288, 48, 1, 1]).
size mismatch for features.9.conv.1.weight: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([288]).
size mismatch for features.9.conv.1.bias: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([288]).
size mismatch for features.9.conv.1.running_mean: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([288]).
size mismatch for features.9.conv.1.running_var: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([288]).
size mismatch for features.9.conv.3.weight: copying a param with shape torch.Size([184, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([288, 1, 5, 5]).
size mismatch for features.9.conv.4.weight: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([288]).
size mismatch for features.9.conv.4.bias: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([288]).
size mismatch for features.9.conv.4.running_mean: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([288]).
size mismatch for features.9.conv.4.running_var: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([288]).
size mismatch for features.9.conv.7.weight: copying a param with shape torch.Size([80, 184, 1, 1]) from checkpoint, the shape in current model is torch.Size([96, 288, 1, 1]).
size mismatch for features.9.conv.8.weight: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.9.conv.8.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.9.conv.8.running_mean: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.9.conv.8.running_var: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.10.conv.0.weight: copying a param with shape torch.Size([184, 80, 1, 1]) from checkpoint, the shape in current model is torch.Size([576, 96, 1, 1]).
size mismatch for features.10.conv.1.weight: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.10.conv.1.bias: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.10.conv.1.running_mean: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.10.conv.1.running_var: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.10.conv.3.weight: copying a param with shape torch.Size([184, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([576, 1, 5, 5]).
size mismatch for features.10.conv.4.weight: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.10.conv.4.bias: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.10.conv.4.running_mean: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.10.conv.4.running_var: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.10.conv.7.weight: copying a param with shape torch.Size([80, 184, 1, 1]) from checkpoint, the shape in current model is torch.Size([96, 576, 1, 1]).
size mismatch for features.10.conv.8.weight: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.10.conv.8.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.10.conv.8.running_mean: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.10.conv.8.running_var: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.11.conv.0.weight: copying a param with shape torch.Size([480, 80, 1, 1]) from checkpoint, the shape in current model is torch.Size([576, 96, 1, 1]).
size mismatch for features.11.conv.1.weight: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.11.conv.1.bias: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.11.conv.1.running_mean: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.11.conv.1.running_var: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.11.conv.3.weight: copying a param with shape torch.Size([480, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([576, 1, 5, 5]).
size mismatch for features.11.conv.4.weight: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.11.conv.4.bias: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.11.conv.4.running_mean: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.11.conv.4.running_var: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.11.conv.5.fc.0.weight: copying a param with shape torch.Size([120, 480]) from checkpoint, the shape in current model is torch.Size([144, 576]).
size mismatch for features.11.conv.5.fc.0.bias: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([144]).
size mismatch for features.11.conv.5.fc.2.weight: copying a param with shape torch.Size([480, 120]) from checkpoint, the shape in current model is torch.Size([576, 144]).
size mismatch for features.11.conv.5.fc.2.bias: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for features.11.conv.7.weight: copying a param with shape torch.Size([112, 480, 1, 1]) from checkpoint, the shape in current model is torch.Size([96, 576, 1, 1]).
size mismatch for features.11.conv.8.weight: copying a param with shape torch.Size([112]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.11.conv.8.bias: copying a param with shape torch.Size([112]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.11.conv.8.running_mean: copying a param with shape torch.Size([112]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for features.11.conv.8.running_var: copying a param with shape torch.Size([112]) from checkpoint, the shape in current model is torch.Size([96]).
size mismatch for conv.0.0.weight: copying a param with shape torch.Size([960, 160, 1, 1]) from checkpoint, the shape in current model is torch.Size([576, 96, 1, 1]).
size mismatch for conv.0.1.weight: copying a param with shape torch.Size([960]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for conv.0.1.bias: copying a param with shape torch.Size([960]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for conv.0.1.running_mean: copying a param with shape torch.Size([960]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for conv.0.1.running_var: copying a param with shape torch.Size([960]) from checkpoint, the shape in current model is torch.Size([576]).
size mismatch for classifier.1.weight: copying a param with shape torch.Size([1280, 960]) from checkpoint, the shape in current model is torch.Size([1280]).
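The same comparison can be reproduced without triggering the exception by diffing the parameter names and shapes of the checkpoint against those of the model. The sketch below works on plain `name -> shape` mappings so it runs without PyTorch; `diff_state_dicts` and `shapes_of` are hypothetical helper names, and `shapes_of` assumes each value exposes a `.shape` attribute, as tensors do.

```python
def shapes_of(state_dict):
    """Reduce a state_dict-like mapping to {name: shape tuple}."""
    return {name: tuple(value.shape) for name, value in state_dict.items()}


def diff_state_dicts(checkpoint_shapes, model_shapes):
    """Compare two {name: shape} mappings.

    Returns (missing, unexpected, mismatched):
      missing    - names the model expects but the checkpoint lacks
      unexpected - names in the checkpoint the model does not have
      mismatched - {name: (checkpoint_shape, model_shape)} for shared names
                   whose shapes differ
    """
    missing = sorted(k for k in model_shapes if k not in checkpoint_shapes)
    unexpected = sorted(k for k in checkpoint_shapes if k not in model_shapes)
    mismatched = {
        k: (tuple(checkpoint_shapes[k]), tuple(model_shapes[k]))
        for k in checkpoint_shapes
        if k in model_shapes
        and tuple(checkpoint_shapes[k]) != tuple(model_shapes[k])
    }
    return missing, unexpected, mismatched


# Three entries taken from the small-model log above.
checkpoint = {
    "conv.0.0.weight": (960, 160, 1, 1),
    "features.12.conv.0.weight": (0,),  # placeholder shape, only the name matters
}
model = {
    "conv.0.0.weight": (576, 96, 1, 1),
    "conv.1.fc.0.weight": (0,),  # placeholder shape, only the name matters
}
missing, unexpected, mismatched = diff_state_dicts(checkpoint, model)
```

With PyTorch the inputs would be `shapes_of(torch.load(path))` and `shapes_of(model.state_dict())`, where `path` stands for the checkpoint file. In the log above the checkpoint contains blocks `features.12` to `features.15` and a final `conv.0.0.weight` of shape `[960, 160, 1, 1]`, while the small model expects `[576, 96, 1, 1]`, which is what suggests the file holds large-model weights.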
size mismatch for features.3.conv.4.bias: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([88]). size mismatch for features.3.conv.4.running_mean: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([88]). size mismatch for features.3.conv.4.running_var: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([88]). size mismatch for features.3.conv.7.weight: copying a param with shape torch.Size([24, 72, 1, 1]) from checkpoint, the shape in current model is torch.Size([24, 88, 1, 1]). size mismatch for features.4.conv.0.weight: copying a param with shape torch.Size([72, 24, 1, 1]) from checkpoint, the shape in current model is torch.Size([96, 24, 1, 1]). size mismatch for features.4.conv.1.weight: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.4.conv.1.bias: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.4.conv.1.running_mean: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.4.conv.1.running_var: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.4.conv.3.weight: copying a param with shape torch.Size([72, 1, 5, 5]) from checkpoint, the shape in current model is torch.Size([96, 1, 5, 5]). size mismatch for features.4.conv.4.weight: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.4.conv.4.bias: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]). 
size mismatch for features.4.conv.4.running_mean: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.4.conv.4.running_var: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.4.conv.5.fc.0.weight: copying a param with shape torch.Size([18, 72]) from checkpoint, the shape in current model is torch.Size([24, 96]). size mismatch for features.4.conv.5.fc.0.bias: copying a param with shape torch.Size([18]) from checkpoint, the shape in current model is torch.Size([24]). size mismatch for features.4.conv.5.fc.2.weight: copying a param with shape torch.Size([72, 18]) from checkpoint, the shape in current model is torch.Size([96, 24]). size mismatch for features.4.conv.5.fc.2.bias: copying a param with shape torch.Size([72]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.4.conv.7.weight: copying a param with shape torch.Size([40, 72, 1, 1]) from checkpoint, the shape in current model is torch.Size([40, 96, 1, 1]). size mismatch for features.5.conv.0.weight: copying a param with shape torch.Size([120, 40, 1, 1]) from checkpoint, the shape in current model is torch.Size([240, 40, 1, 1]). size mismatch for features.5.conv.1.weight: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). size mismatch for features.5.conv.1.bias: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). size mismatch for features.5.conv.1.running_mean: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). size mismatch for features.5.conv.1.running_var: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). 
size mismatch for features.5.conv.3.weight: copying a param with shape torch.Size([120, 1, 5, 5]) from checkpoint, the shape in current model is torch.Size([240, 1, 5, 5]). size mismatch for features.5.conv.4.weight: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). size mismatch for features.5.conv.4.bias: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). size mismatch for features.5.conv.4.running_mean: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). size mismatch for features.5.conv.4.running_var: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). size mismatch for features.5.conv.5.fc.0.weight: copying a param with shape torch.Size([30, 120]) from checkpoint, the shape in current model is torch.Size([60, 240]). size mismatch for features.5.conv.5.fc.0.bias: copying a param with shape torch.Size([30]) from checkpoint, the shape in current model is torch.Size([60]). size mismatch for features.5.conv.5.fc.2.weight: copying a param with shape torch.Size([120, 30]) from checkpoint, the shape in current model is torch.Size([240, 60]). size mismatch for features.5.conv.5.fc.2.bias: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). size mismatch for features.5.conv.7.weight: copying a param with shape torch.Size([40, 120, 1, 1]) from checkpoint, the shape in current model is torch.Size([40, 240, 1, 1]). size mismatch for features.6.conv.0.weight: copying a param with shape torch.Size([120, 40, 1, 1]) from checkpoint, the shape in current model is torch.Size([240, 40, 1, 1]). size mismatch for features.6.conv.1.weight: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). 
size mismatch for features.6.conv.1.bias: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). size mismatch for features.6.conv.1.running_mean: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). size mismatch for features.6.conv.1.running_var: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). size mismatch for features.6.conv.3.weight: copying a param with shape torch.Size([120, 1, 5, 5]) from checkpoint, the shape in current model is torch.Size([240, 1, 5, 5]). size mismatch for features.6.conv.4.weight: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). size mismatch for features.6.conv.4.bias: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). size mismatch for features.6.conv.4.running_mean: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). size mismatch for features.6.conv.4.running_var: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). size mismatch for features.6.conv.5.fc.0.weight: copying a param with shape torch.Size([30, 120]) from checkpoint, the shape in current model is torch.Size([60, 240]). size mismatch for features.6.conv.5.fc.0.bias: copying a param with shape torch.Size([30]) from checkpoint, the shape in current model is torch.Size([60]). size mismatch for features.6.conv.5.fc.2.weight: copying a param with shape torch.Size([120, 30]) from checkpoint, the shape in current model is torch.Size([240, 60]). size mismatch for features.6.conv.5.fc.2.bias: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([240]). 
size mismatch for features.6.conv.7.weight: copying a param with shape torch.Size([40, 120, 1, 1]) from checkpoint, the shape in current model is torch.Size([40, 240, 1, 1]). size mismatch for features.7.conv.0.weight: copying a param with shape torch.Size([240, 40, 1, 1]) from checkpoint, the shape in current model is torch.Size([120, 40, 1, 1]). size mismatch for features.7.conv.1.weight: copying a param with shape torch.Size([240]) from checkpoint, the shape in current model is torch.Size([120]). size mismatch for features.7.conv.1.bias: copying a param with shape torch.Size([240]) from checkpoint, the shape in current model is torch.Size([120]). size mismatch for features.7.conv.1.running_mean: copying a param with shape torch.Size([240]) from checkpoint, the shape in current model is torch.Size([120]). size mismatch for features.7.conv.1.running_var: copying a param with shape torch.Size([240]) from checkpoint, the shape in current model is torch.Size([120]). size mismatch for features.7.conv.3.weight: copying a param with shape torch.Size([240, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([120, 1, 5, 5]). size mismatch for features.7.conv.4.weight: copying a param with shape torch.Size([240]) from checkpoint, the shape in current model is torch.Size([120]). size mismatch for features.7.conv.4.bias: copying a param with shape torch.Size([240]) from checkpoint, the shape in current model is torch.Size([120]). size mismatch for features.7.conv.4.running_mean: copying a param with shape torch.Size([240]) from checkpoint, the shape in current model is torch.Size([120]). size mismatch for features.7.conv.4.running_var: copying a param with shape torch.Size([240]) from checkpoint, the shape in current model is torch.Size([120]). size mismatch for features.7.conv.7.weight: copying a param with shape torch.Size([80, 240, 1, 1]) from checkpoint, the shape in current model is torch.Size([48, 120, 1, 1]). 
size mismatch for features.7.conv.8.weight: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([48]). size mismatch for features.7.conv.8.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([48]). size mismatch for features.7.conv.8.running_mean: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([48]). size mismatch for features.7.conv.8.running_var: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([48]). size mismatch for features.8.conv.0.weight: copying a param with shape torch.Size([200, 80, 1, 1]) from checkpoint, the shape in current model is torch.Size([144, 48, 1, 1]). size mismatch for features.8.conv.1.weight: copying a param with shape torch.Size([200]) from checkpoint, the shape in current model is torch.Size([144]). size mismatch for features.8.conv.1.bias: copying a param with shape torch.Size([200]) from checkpoint, the shape in current model is torch.Size([144]). size mismatch for features.8.conv.1.running_mean: copying a param with shape torch.Size([200]) from checkpoint, the shape in current model is torch.Size([144]). size mismatch for features.8.conv.1.running_var: copying a param with shape torch.Size([200]) from checkpoint, the shape in current model is torch.Size([144]). size mismatch for features.8.conv.3.weight: copying a param with shape torch.Size([200, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([144, 1, 5, 5]). size mismatch for features.8.conv.4.weight: copying a param with shape torch.Size([200]) from checkpoint, the shape in current model is torch.Size([144]). size mismatch for features.8.conv.4.bias: copying a param with shape torch.Size([200]) from checkpoint, the shape in current model is torch.Size([144]). 
size mismatch for features.8.conv.4.running_mean: copying a param with shape torch.Size([200]) from checkpoint, the shape in current model is torch.Size([144]). size mismatch for features.8.conv.4.running_var: copying a param with shape torch.Size([200]) from checkpoint, the shape in current model is torch.Size([144]). size mismatch for features.8.conv.7.weight: copying a param with shape torch.Size([80, 200, 1, 1]) from checkpoint, the shape in current model is torch.Size([48, 144, 1, 1]). size mismatch for features.8.conv.8.weight: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([48]). size mismatch for features.8.conv.8.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([48]). size mismatch for features.8.conv.8.running_mean: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([48]). size mismatch for features.8.conv.8.running_var: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([48]). size mismatch for features.9.conv.0.weight: copying a param with shape torch.Size([184, 80, 1, 1]) from checkpoint, the shape in current model is torch.Size([288, 48, 1, 1]). size mismatch for features.9.conv.1.weight: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([288]). size mismatch for features.9.conv.1.bias: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([288]). size mismatch for features.9.conv.1.running_mean: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([288]). size mismatch for features.9.conv.1.running_var: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([288]). 
size mismatch for features.9.conv.3.weight: copying a param with shape torch.Size([184, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([288, 1, 5, 5]). size mismatch for features.9.conv.4.weight: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([288]). size mismatch for features.9.conv.4.bias: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([288]). size mismatch for features.9.conv.4.running_mean: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([288]). size mismatch for features.9.conv.4.running_var: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([288]). size mismatch for features.9.conv.7.weight: copying a param with shape torch.Size([80, 184, 1, 1]) from checkpoint, the shape in current model is torch.Size([96, 288, 1, 1]). size mismatch for features.9.conv.8.weight: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.9.conv.8.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.9.conv.8.running_mean: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.9.conv.8.running_var: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.10.conv.0.weight: copying a param with shape torch.Size([184, 80, 1, 1]) from checkpoint, the shape in current model is torch.Size([576, 96, 1, 1]). size mismatch for features.10.conv.1.weight: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([576]). 
size mismatch for features.10.conv.1.bias: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for features.10.conv.1.running_mean: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for features.10.conv.1.running_var: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for features.10.conv.3.weight: copying a param with shape torch.Size([184, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([576, 1, 5, 5]). size mismatch for features.10.conv.4.weight: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for features.10.conv.4.bias: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for features.10.conv.4.running_mean: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for features.10.conv.4.running_var: copying a param with shape torch.Size([184]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for features.10.conv.7.weight: copying a param with shape torch.Size([80, 184, 1, 1]) from checkpoint, the shape in current model is torch.Size([96, 576, 1, 1]). size mismatch for features.10.conv.8.weight: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.10.conv.8.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.10.conv.8.running_mean: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([96]). 
size mismatch for features.10.conv.8.running_var: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.11.conv.0.weight: copying a param with shape torch.Size([480, 80, 1, 1]) from checkpoint, the shape in current model is torch.Size([576, 96, 1, 1]). size mismatch for features.11.conv.1.weight: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for features.11.conv.1.bias: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for features.11.conv.1.running_mean: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for features.11.conv.1.running_var: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for features.11.conv.3.weight: copying a param with shape torch.Size([480, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([576, 1, 5, 5]). size mismatch for features.11.conv.4.weight: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for features.11.conv.4.bias: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for features.11.conv.4.running_mean: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for features.11.conv.4.running_var: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for features.11.conv.5.fc.0.weight: copying a param with shape torch.Size([120, 480]) from checkpoint, the shape in current model is torch.Size([144, 576]). 
size mismatch for features.11.conv.5.fc.0.bias: copying a param with shape torch.Size([120]) from checkpoint, the shape in current model is torch.Size([144]). size mismatch for features.11.conv.5.fc.2.weight: copying a param with shape torch.Size([480, 120]) from checkpoint, the shape in current model is torch.Size([576, 144]). size mismatch for features.11.conv.5.fc.2.bias: copying a param with shape torch.Size([480]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for features.11.conv.7.weight: copying a param with shape torch.Size([112, 480, 1, 1]) from checkpoint, the shape in current model is torch.Size([96, 576, 1, 1]). size mismatch for features.11.conv.8.weight: copying a param with shape torch.Size([112]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.11.conv.8.bias: copying a param with shape torch.Size([112]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.11.conv.8.running_mean: copying a param with shape torch.Size([112]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for features.11.conv.8.running_var: copying a param with shape torch.Size([112]) from checkpoint, the shape in current model is torch.Size([96]). size mismatch for conv.0.0.weight: copying a param with shape torch.Size([960, 160, 1, 1]) from checkpoint, the shape in current model is torch.Size([576, 96, 1, 1]). size mismatch for conv.0.1.weight: copying a param with shape torch.Size([960]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for conv.0.1.bias: copying a param with shape torch.Size([960]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for conv.0.1.running_mean: copying a param with shape torch.Size([960]) from checkpoint, the shape in current model is torch.Size([576]). 
size mismatch for conv.0.1.running_var: copying a param with shape torch.Size([960]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for classifier.1.weight: copying a param with shape torch.Size([1280, 960]) from checkpoint, the shape in current model is torch.Size([1280]).
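
To make a wall of errors like the one above easier to read, here is a minimal sketch that groups the differences between a checkpoint and a model into missing keys, unexpected keys, and shape mismatches. The helper name `diff_state_dicts` is hypothetical and not part of this repo or of PyTorch. It assumes you have already turned each state_dict into a plain name-to-shape mapping, for example with `{k: tuple(v.shape) for k, v in model.state_dict().items()}`. The sample entries are copied from the log where a shape is shown; the rest are marked as placeholders.

```python
from typing import Dict, List, Mapping, Tuple

Shape = Tuple[int, ...]


def diff_state_dicts(
    checkpoint: Mapping[str, Shape], model: Mapping[str, Shape]
) -> Dict[str, List]:
    """Hypothetical helper: compare two name -> shape mappings."""
    # Keys the model expects but the checkpoint does not provide.
    missing = sorted(k for k in model if k not in checkpoint)
    # Keys the checkpoint provides but the model does not have.
    unexpected = sorted(k for k in checkpoint if k not in model)
    # Keys present on both sides whose tensor shapes differ.
    mismatched = sorted(
        (k, checkpoint[k], model[k])
        for k in checkpoint
        if k in model and checkpoint[k] != model[k]
    )
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}


# Small sample built from the log above.
ckpt_shapes = {
    "features.2.conv.0.weight": (64, 16, 1, 1),
    "classifier.1.weight": (1280, 960),
    "classifier.5.weight": (),  # shape not shown in the log; placeholder
}
model_shapes = {
    "features.2.conv.0.weight": (72, 16, 1, 1),
    "classifier.1.weight": (1280,),
    "classifier.0.weight": (),  # shape not shown in the log; placeholder
}

report = diff_state_dicts(ckpt_shapes, model_shapes)
for name, ckpt_shape, model_shape in report["mismatched"]:
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
print("missing:", report["missing"])
print("unexpected:", report["unexpected"])
```

Run against the real small-model checkpoint, a summary like this should show quickly whether the layer widths line up with the large configuration, which is what the shapes in the log suggest.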