huggingface / pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
https://huggingface.co/docs/timm
Apache License 2.0

[BUG] Can't load pretrained mobilenetv3_small with catavgmax pooling #1561

Closed by alicanb 2 years ago

alicanb commented 2 years ago

Describe the bug: Trying to load mobilenetv3_small with catavgmax pooling on timm 0.6.11 gives the following error:

RuntimeError: Error(s) in loading state_dict for MobileNetV3:
        size mismatch for conv_head.weight: copying a param with shape torch.Size([1024, 576, 1, 1]) from checkpoint, the shape in current model is torch.Size([1024, 1152, 1, 1]).

To Reproduce

  1. Run:

    import timm
    timm.create_model(model_name='mobilenetv3_small_100', pretrained=True,
                      global_pool='catavgmax', num_classes=2)

Expected behavior: the model should load.


rwightman commented 2 years ago

@alicanb that one is not so easy to support, https://github.com/rwightman/pytorch-image-models/blob/main/timm/models/mobilenetv3.py#L187

For mobilenetv3, there is an extra layer after the pooling that doesn't exist for most other nets. Using catavgmax doubles the number of features, so it requires resetting and reconfiguring that layer as well, not just the final classifier. I do not currently have a clean mechanism to support this generically (i.e. without per-model customization), although I had some designs regarding more flexible head adaptation that might cover this (but that's not going to happen right away).
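The size mismatch follows directly from what catavgmax does. A minimal standalone illustration (plain PyTorch, not timm's own pooling module), using the 576-channel feature size from the error message above:

```python
import torch
import torch.nn.functional as F

def cat_avg_max_pool2d(x):
    # 'catavgmax'-style pooling: concatenate global average- and
    # max-pooled features along the channel dim, doubling the channels.
    return torch.cat([F.adaptive_avg_pool2d(x, 1),
                      F.adaptive_max_pool2d(x, 1)], dim=1)

# MobileNetV3-Small produces 576-channel feature maps before conv_head,
# matching the checkpoint shape in the error above.
feats = torch.randn(2, 576, 7, 7)
print(cat_avg_max_pool2d(feats).shape)  # torch.Size([2, 1152, 1, 1])
```

Since conv_head sits after the pooling in MobileNetV3, its expected input goes from 576 to 1152 channels, which is exactly the shape mismatch in the traceback.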

For now, there are two alternatives that can work (though neither is ideal)

alicanb commented 2 years ago

sounds reasonable 👍