OpenGVLab / InternImage

[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
https://arxiv.org/abs/2211.05778
MIT License

_IncompatibleKeys while using pretrained model #172

Open · Shanci-Li opened this issue 1 year ago

Shanci-Li commented 1 year ago

Hello! I am using the InternImage framework for transfer learning on two semantic segmentation datasets. I first trained the upernet_internimage_b_512x1024_160k.py model on one dataset and then tried to fine-tune it on the other. But when I load the checkpoint I trained, the following messages appear:

```
2023-05-30 10:40:19,457 - mmseg - INFO - load checkpoint from local path: /scratch/izar/shanli/Cadmap/internimage/upernet_internimage_b_512x1024_160k_geneva_line/best_mIoU_iter_42000.pth
2023-05-30 10:40:27,841 - mmseg - INFO - _IncompatibleKeys(missing_keys=[], unexpected_keys=[
  'decode_head.conv_seg.weight', 'decode_head.conv_seg.bias',
  'decode_head.psp_modules.0.1.conv.weight', 'decode_head.psp_modules.0.1.bn.weight', 'decode_head.psp_modules.0.1.bn.bias', 'decode_head.psp_modules.0.1.bn.running_mean', 'decode_head.psp_modules.0.1.bn.running_var', 'decode_head.psp_modules.0.1.bn.num_batches_tracked',
  'decode_head.psp_modules.1.1.conv.weight', 'decode_head.psp_modules.1.1.bn.weight', 'decode_head.psp_modules.1.1.bn.bias', 'decode_head.psp_modules.1.1.bn.running_mean', 'decode_head.psp_modules.1.1.bn.running_var', 'decode_head.psp_modules.1.1.bn.num_batches_tracked',
  'decode_head.psp_modules.2.1.conv.weight', 'decode_head.psp_modules.2.1.bn.weight', 'decode_head.psp_modules.2.1.bn.bias', 'decode_head.psp_modules.2.1.bn.running_mean', 'decode_head.psp_modules.2.1.bn.running_var', 'decode_head.psp_modules.2.1.bn.num_batches_tracked',
  'decode_head.psp_modules.3.1.conv.weight', 'decode_head.psp_modules.3.1.bn.weight', 'decode_head.psp_modules.3.1.bn.bias', 'decode_head.psp_modules.3.1.bn.running_mean', 'decode_head.psp_modules.3.1.bn.running_var', 'decode_head.psp_modules.3.1.bn.num_batches_tracked',
  'decode_head.bottleneck.conv.weight', 'decode_head.bottleneck.bn.weight', 'decode_head.bottleneck.bn.bias', 'decode_head.bottleneck.bn.running_mean', 'decode_head.bottleneck.bn.running_var', 'decode_head.bottleneck.bn.num_batches_tracked',
  'decode_head.lateral_convs.0.conv.weight', 'decode_head.lateral_convs.0.bn.weight', 'decode_head.lateral_convs.0.bn.bias', 'decode_head.lateral_convs.0.bn.running_mean', 'decode_head.lateral_convs.0.bn.running_var', 'decode_head.lateral_convs.0.bn.num_batches_tracked',
  'decode_head.lateral_convs.1.conv.weight', 'decode_head.lateral_convs.1.bn.weight', 'decode_head.lateral_convs.1.bn.bias', 'decode_head.lateral_convs.1.bn.running_mean', 'decode_head.lateral_convs.1.bn.running_var', 'decode_head.lateral_convs.1.bn.num_batches_tracked',
  'decode_head.lateral_convs.2.conv.weight', 'decode_head.lateral_convs.2.bn.weight', 'decode_head.lateral_convs.2.bn.bias', 'decode_head.lateral_convs.2.bn.running_mean', 'decode_head.lateral_convs.2.bn.running_var', 'decode_head.lateral_convs.2.bn.num_batches_tracked',
  'decode_head.fpn_convs.0.conv.weight', 'decode_head.fpn_convs.0.bn.weight', 'decode_head.fpn_convs.0.bn.bias', 'decode_head.fpn_convs.0.bn.running_mean', 'decode_head.fpn_convs.0.bn.running_var', 'decode_head.fpn_convs.0.bn.num_batches_tracked',
  'decode_head.fpn_convs.1.conv.weight', 'decode_head.fpn_convs.1.bn.weight', 'decode_head.fpn_convs.1.bn.bias', 'decode_head.fpn_convs.1.bn.running_mean', 'decode_head.fpn_convs.1.bn.running_var', 'decode_head.fpn_convs.1.bn.num_batches_tracked',
  'decode_head.fpn_convs.2.conv.weight', 'decode_head.fpn_convs.2.bn.weight', 'decode_head.fpn_convs.2.bn.bias', 'decode_head.fpn_convs.2.bn.running_mean', 'decode_head.fpn_convs.2.bn.running_var', 'decode_head.fpn_convs.2.bn.num_batches_tracked',
  'decode_head.fpn_bottleneck.conv.weight', 'decode_head.fpn_bottleneck.bn.weight', 'decode_head.fpn_bottleneck.bn.bias', 'decode_head.fpn_bottleneck.bn.running_mean', 'decode_head.fpn_bottleneck.bn.running_var', 'decode_head.fpn_bottleneck.bn.num_batches_tracked',
  'auxiliary_head.conv_seg.weight', 'auxiliary_head.conv_seg.bias',
  'auxiliary_head.convs.0.conv.weight', 'auxiliary_head.convs.0.bn.weight', 'auxiliary_head.convs.0.bn.bias', 'auxiliary_head.convs.0.bn.running_mean', 'auxiliary_head.convs.0.bn.running_var', 'auxiliary_head.convs.0.bn.num_batches_tracked'])
2023-05-30 10:40:27,877 - mmseg - INFO - initialize UPerHead with init_cfg {'type': 'Normal', 'std': 0.01, 'override': {'name': 'conv_seg'}}
2023-05-30 10:40:28,036 - mmseg - INFO - initialize FCNHead with init_cfg {'type': 'Normal', 'std': 0.01, 'override': {'name': 'conv_seg'}}
```
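For reference, the names in unexpected_keys are exactly the head weights stored in the checkpoint file. A minimal way to confirm what the file contains, assuming the usual mmcv layout where the weights sit under a 'state_dict' entry:

```python
import torch

# Load the fine-tuned checkpoint on CPU and group its parameter names by
# their top-level module, to see which parts of the model it covers.
ckpt = torch.load(
    '/scratch/izar/shanli/Cadmap/internimage/'
    'upernet_internimage_b_512x1024_160k_geneva_line/best_mIoU_iter_42000.pth',
    map_location='cpu')
state_dict = ckpt.get('state_dict', ckpt)  # mmcv wraps weights in 'state_dict'
print(sorted({k.split('.')[0] for k in state_dict}))
# A full-segmentor checkpoint would print:
# ['auxiliary_head', 'backbone', 'decode_head']
```

A checkpoint saved during training holds the whole segmentor (backbone + decode_head + auxiliary_head), while init_cfg loads it into the backbone module alone, which would explain why all the head weights are reported as unexpected_keys.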

Here is my configuration file:

```python
# --------------------------------------------------------
# InternImage
# Copyright (c) 2022 OpenGVLab
# Licensed under The MIT License [see LICENSE for details]
# --------------------------------------------------------
_base_ = [
    '../_base_/models/upernet_r50.py',
    '../_base_/datasets/geneva_line.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_160k.py'
]
pretrained = '/scratch/izar/shanli/Cadmap/internimage/upernet_internimage_b_512x1024_160k_geneva_line/best_mIoU_iter_42000.pth'
model = dict(
    backbone=dict(
        _delete_=True,
        type='InternImage',
        core_op='DCNv3',
        channels=112,
        depths=[4, 4, 21, 4],
        groups=[7, 14, 28, 56],
        mlp_ratio=4.,
        drop_path_rate=0.4,
        norm_layer='LN',
        layer_scale=1.0,
        offset_scale=1.0,
        post_norm=True,
        with_cp=False,
        out_indices=(0, 1, 2, 3),
        init_cfg=dict(type='Pretrained', checkpoint=pretrained)),
    decode_head=dict(
        num_classes=2,
        in_channels=[112, 224, 448, 896],
        loss_decode=dict(
            type='CrossEntropyLoss',
            use_sigmoid=False,
            loss_weight=1.0)),
        # loss_decode=dict(type='FocalLoss',
        #                  class_weight=[0.00001, 0.99999],
        #                  gamma=2.0,
        #                  use_sigmoid=True,
        #                  loss_weight=1.0)),
    auxiliary_head=dict(num_classes=2, in_channels=448),
    test_cfg=dict(mode='whole'))
optimizer = dict(
    _delete_=True,
    type='AdamW',
    lr=2e-7,
    betas=(0.9, 0.999),
    weight_decay=0.05,
    constructor='CustomLayerDecayOptimizerConstructor',
    paramwise_cfg=dict(num_layers=33,
                       layer_decay_rate=1.0,
                       depths=[4, 4, 21, 4]))
lr_config = dict(
    _delete_=True,
    policy='poly',
    warmup='linear',
    warmup_iters=1500,
    warmup_ratio=2e-7,
    power=1.0,
    min_lr=0.0,
    by_epoch=False)
# By default, models are trained on 8 GPUs with 2 images per GPU
data = dict(samples_per_gpu=2)
runner = dict(type='IterBasedRunner')
checkpoint_config = dict(by_epoch=False, interval=2000)
evaluation = dict(interval=2000, metric='mIoU', save_best='mIoU')
fp16 = dict(loss_scale=dict(init_scale=512))
```
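If the goal is to continue training the full segmentor (heads included) rather than just to initialize the backbone, one alternative is mmseg's top-level load_from mechanism. A sketch, assuming the backbone's init_cfg can simply be overridden to None in the child config:

```python
# Sketch (untested): restore the complete model via load_from instead of
# pointing the backbone's init_cfg at a whole-segmentor checkpoint.
pretrained = '/scratch/izar/shanli/Cadmap/internimage/upernet_internimage_b_512x1024_160k_geneva_line/best_mIoU_iter_42000.pth'

model = dict(backbone=dict(init_cfg=None))  # no Pretrained init on the backbone

# The runner loads backbone, decode_head and auxiliary_head together from here.
load_from = pretrained
```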

Could you help me figure out what is happening? The same setup works with segformer_internimage_l_512x1024_160k, but not with the UPerNet config :(
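In case it helps with reproducing, a possible workaround sketch: strip the checkpoint down to backbone-only weights so that the Pretrained init finds exactly the keys it expects and the heads are re-initialized from scratch. The output filename below is made up for illustration:

```python
import torch

# Convert a full-segmentor checkpoint into a backbone-only checkpoint.
src = '/scratch/izar/shanli/Cadmap/internimage/upernet_internimage_b_512x1024_160k_geneva_line/best_mIoU_iter_42000.pth'
ckpt = torch.load(src, map_location='cpu')
state_dict = ckpt.get('state_dict', ckpt)

# Keep only 'backbone.*' entries and drop the prefix, since the backbone
# module loads the state_dict relative to itself.
backbone_only = {k[len('backbone.'):]: v
                 for k, v in state_dict.items()
                 if k.startswith('backbone.')}
torch.save({'state_dict': backbone_only}, 'backbone_only.pth')
```

If I remember correctly, mmcv's Pretrained initializer also accepts a prefix argument (init_cfg=dict(type='Pretrained', checkpoint=..., prefix='backbone.')) that does this filtering at load time, but I have not verified that on this repo.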