open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0
29.21k stars 9.4k forks source link

How to change num_class when you use pretrained model #222

Closed pandigreat closed 5 years ago

pandigreat commented 5 years ago

i want to use the fasterRCNN_FPN to do detection. But the num_class of pretrained model on coco dataset is 81, mine dataset just has one class, And it throw an exception.

File "/home/ed/anaconda2/envs/python36/lib/python3.6/site-packages/mmcv/runner/checkpoint.py", line 54, in load_state_dict param.size())) RuntimeError: While copying the parameter named bbox_head.fc_cls.weight, whose dimensions in the model are torch.Size([1, 1024]) and whose dimensions in the checkpoint are torch.Size([81, 1024]).

my config.py is as follow:

model settings

model = dict( type='FasterRCNN', pretrained='modelzoo://resnet50',

pretrained='../work_dirs/faster_rcnn_r50_fpn_1x/epoch_8.pth',

backbone=dict(
    type='ResNet',
    depth=50,
    num_stages=4,
    out_indices=(0, 1, 2, 3),
    frozen_stages=1,
    style='pytorch'),
neck=dict(
    type='FPN',
    in_channels=[256, 512, 1024, 2048],
    out_channels=256,
    num_outs=5),
rpn_head=dict(
    type='RPNHead',
    in_channels=256,
    feat_channels=256,
    anchor_scales=[8],
    anchor_ratios=[0.5, 1.0, 2.0],
    anchor_strides=[4, 8, 16, 32, 64],
    target_means=[.0, .0, .0, .0],
    target_stds=[1.0, 1.0, 1.0, 1.0],
    use_sigmoid_cls=True),
bbox_roi_extractor=dict(
    type='SingleRoIExtractor',
    roi_layer=dict(type='RoIAlign', out_size=7, sample_num=2),
    out_channels=256,
    featmap_strides=[4, 8, 16, 32]),
bbox_head=dict(
    type='SharedFCBBoxHead',
    num_fcs=2,
    in_channels=256,
    fc_out_channels=1024,
    roi_feat_size=7,
    num_classes=1,
    target_means=[0., 0., 0., 0.],
    target_stds=[0.1, 0.1, 0.2, 0.2],
    reg_class_agnostic=False))

model training and testing settings

train_cfg = dict( rpn=dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.7, neg_iou_thr=0.3, min_pos_iou=0.3, ignore_iof_thr=-1), sampler=dict( type='RandomSampler', num=256, pos_fraction=0.5, neg_pos_ub=-1, add_gt_as_proposals=False), allowed_border=0, pos_weight=-1, smoothl1_beta=1 / 9.0, debug=False), rcnn=dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.5, neg_iou_thr=0.5, min_pos_iou=0.5, ignore_iof_thr=-1), sampler=dict( type='RandomSampler', num=512, pos_fraction=0.25, neg_pos_ub=-1, add_gt_as_proposals=True), pos_weight=-1, debug=False)) test_cfg = dict( rpn=dict( nms_across_levels=False, nms_pre=2000, nms_post=2000, max_num=2000, nms_thr=0.7, min_bbox_size=0), rcnn=dict( score_thr=0.05, nms=dict(type='nms', iou_thr=0.5), max_per_img=100)

soft-nms is also supported for rcnn testing

# e.g., nms=dict(type='soft_nms', iou_thr=0.5, min_score=0.05)

)

dataset settings

dataset_type = 'CocoDataset' data_root = './data/coco/' img_norm_cfg = dict( mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) data = dict( imgs_per_gpu=2, workers_per_gpu=2, train=dict( type=dataset_type, ann_file=data_root + 'annotations/train2017.json', img_prefix=data_root + 'train2017/', img_scale=(1333, 800), img_norm_cfg=img_norm_cfg, size_divisor=32, flip_ratio=0.5, with_mask=False, with_crowd=True, with_label=True), val=dict( type=dataset_type, ann_file=data_root + 'annotations/val2017.json', img_prefix=data_root + 'val2017/', img_scale=(1333, 800), img_norm_cfg=img_norm_cfg, size_divisor=32, flip_ratio=0, with_mask=False, with_crowd=True, with_label=True), test=dict( type=dataset_type, ann_file=data_root + 'annotations/val2017.json', img_prefix=data_root + 'val2017/', img_scale=(1333, 800), img_norm_cfg=img_norm_cfg, size_divisor=32, flip_ratio=0, with_mask=False, with_label=False, test_mode=True))

optimizer

optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001) optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))

learning policy

lr_config = dict( policy='step', warmup='linear', warmup_iters=500, warmup_ratio=1.0 / 3, step=[8, 11]) checkpoint_config = dict(interval=2)

yapf:disable

log_config = dict( interval=50, hooks=[ dict(type='TextLoggerHook'),

dict(type='TensorboardLoggerHook')

])

yapf:enable

runtime settings

total_epochs = 30 dist_params = dict(backend='nccl') log_level = 'INFO' work_dir = './new_dirs' load_from = None resume_from = None workflow = [('train', 1)]

wangg12 commented 5 years ago

I suggest you to load a pretrained model and remove the last layers related to num_class and save as a new pretrained model. Then you can use the new saved model for different number of classes.

pandigreat commented 5 years ago

Ok. Thanks~

Curry1201 commented 5 years ago

Can you tell me how to modify it in order to successfully change num_class? Thank you!

Luodian commented 5 years ago

I suggest you to load a pretrained model and remove the last layers related to num_class and save as a new pretrained model. Then you can use the new saved model for different number of classes.

Can u tell us how to correctly remove last layers? really thankful~