open-mmlab / mmrazor

OpenMMLab Model Compression Toolbox and Benchmark.
https://mmrazor.readthedocs.io/en/latest/
Apache License 2.0
KeyError: "GeneralDistill: 'bbox_head.gfl_cls'" #174

Open 1234532314342 opened 2 years ago

1234532314342 commented 2 years ago

2022-06-07 10:58:14,781 - mmdet - INFO - Set random seed to 979972066, deterministic: False
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\mmcv\utils\registry.py", line 52, in build_from_cfg
    return obj_cls(**args)
  File "D:\mmrazor-master\mmrazor-master\mmrazor\models\algorithms\general_distill.py", line 23, in __init__
    super(GeneralDistill, self).__init__(**kwargs)
  File "D:\mmrazor-master\mmrazor-master\mmrazor\models\algorithms\base.py", line 57, in __init__
    self._init_distiller(distiller)
  File "D:\mmrazor-master\mmrazor-master\mmrazor\models\algorithms\base.py", line 135, in _init_distiller
    self.distiller.prepare_from_student(self.architecture)
  File "D:\mmrazor-master\mmrazor-master\mmrazor\models\distillers\single_teacher.py", line 123, in prepare_from_student
    student_module = self.student_name2module[student_module_name]
KeyError: 'bbox_head.gfl_cls'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/mmrazor-master/mmrazor-master/tools/mmdet/train_mmdet.py", line 214, in <module>
    main()
  File "D:/mmrazor-master/mmrazor-master/tools/mmdet/train_mmdet.py", line 187, in main
    algorithm = build_algorithm(cfg.algorithm)
  File "D:\mmrazor-master\mmrazor-master\mmrazor\models\builder.py", line 20, in build_algorithm
    return ALGORITHMS.build(cfg)
  File "C:\ProgramData\Anaconda3\lib\site-packages\mmcv\utils\registry.py", line 212, in build
    return self.build_func(*args, **kwargs, registry=self)
  File "C:\ProgramData\Anaconda3\lib\site-packages\mmcv\cnn\builder.py", line 27, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\mmcv\utils\registry.py", line 55, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
KeyError: "GeneralDistill: 'bbox_head.gfl_cls'"

I don't know why this happens and hope to get some help.

pppppM commented 2 years ago

The name in the error ('bbox_head.gfl_cls') identifies the module at which the distillation loss is calculated. If you modify the teacher or student, you also need to update the module name accordingly.

1234532314342 commented 2 years ago

Hello, how should I modify it? I see that the official example does not change this name.

wutongshenqiu commented 2 years ago

Hi, 1234532314342. Let's take this config as an example.

The value of student_module https://github.com/open-mmlab/mmrazor/blob/71a196490bec6864669fdd06d72903e84bb6a382/configs/distill/cwd/cwd_cls_head_gfl_r101_fpn_gfl_r50_fpn_1x_coco.py#L128

must correspond to a module name in the student model, https://github.com/open-mmlab/mmrazor/blob/71a196490bec6864669fdd06d72903e84bb6a382/configs/distill/cwd/cwd_cls_head_gfl_r101_fpn_gfl_r50_fpn_1x_coco.py#L27

and the same applies to teacher_module. https://github.com/open-mmlab/mmrazor/blob/71a196490bec6864669fdd06d72903e84bb6a382/configs/distill/cwd/cwd_cls_head_gfl_r101_fpn_gfl_r50_fpn_1x_coco.py#L129

So, if you change the student config or the teacher config, the student_module and teacher_module in the distiller config may need to change accordingly.

Besides, if it is difficult to get the module name only from the model config, you can try

# print student model
print(algorithm.architecture.model)
# print teacher model
print(algorithm.distiller.teacher)

after building the algorithm
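
If the printed model is long, a configured name can also be checked against the available names programmatically. The following is a minimal sketch, not part of mmrazor: `suggest_module_names` is a hypothetical helper, and the name list in the demo is made up for illustration. With a built algorithm, the real names would come from `named_modules()` (a standard `nn.Module` method) on `algorithm.architecture.model`.

```python
import difflib


def suggest_module_names(wanted, available_names, n=3):
    """Return [] if `wanted` is a valid module name, else up to `n` close matches.

    Hypothetical helper for illustration; not an mmrazor API.
    """
    available = list(available_names)
    if wanted in available:
        return []
    return difflib.get_close_matches(wanted, available, n=n)


# Made-up names for the demo. With a built algorithm, one could use:
#   names = [name for name, _ in algorithm.architecture.model.named_modules()]
names = [
    "backbone.layer4",
    "bbox_head.cls_branches.0",
    "bbox_head.reg_branches.0.4",
]

# A valid name yields an empty list.
print(suggest_module_names("bbox_head.reg_branches.0.4", names))
# An invalid name yields whichever available names are similar enough, if any.
print(suggest_module_names("bbox_head.gfl_cls", names))
```

The same check would be run once against the student's names for student_module and once against the teacher's names for teacher_module.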

1234532314342 commented 2 years ago

Printing now raises a new error: AttributeError: 'dict' object has no attribute 'architecture'

wutongshenqiu commented 2 years ago

Did you print after building the algorithm (that is, after the following line)? https://github.com/open-mmlab/mmrazor/blob/71a196490bec6864669fdd06d72903e84bb6a382/tools/mmdet/train_mmdet.py?rgh-link-date=2022-06-07T14%3A25%3A16Z#L183

You can try commenting out the components attribute to avoid the KeyError above when printing. https://github.com/open-mmlab/mmrazor/blob/71a196490bec6864669fdd06d72903e84bb6a382/configs/distill/cwd/cwd_cls_head_gfl_r101_fpn_gfl_r50_fpn_1x_coco.py?rgh-link-date=2022-06-07T14%3A53%3A55Z#L126-L135
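
The AttributeError message suggests that `algorithm` was still a plain config dict at the point of the print, which would happen if the print ran before the algorithm was built (for example, inside the config file itself). Below is a minimal sketch of both suggestions, assuming the config is a plain nested dict. `is_built` and `without_components` are hypothetical helper names, not mmrazor APIs.

```python
import copy


def is_built(algorithm):
    """True once `algorithm` is a built object rather than a config dict."""
    return not isinstance(algorithm, dict)


def without_components(algorithm_cfg):
    """Return a deep copy of the config with the distiller `components` removed.

    Mirrors commenting the attribute out by hand. The assumption is that
    module names are then not looked up while the model is built for printing.
    """
    cfg = copy.deepcopy(algorithm_cfg)
    cfg.get("distiller", {}).pop("components", None)
    return cfg


# Trimmed-down stand-in for the real config.
algorithm_cfg = dict(
    type="GeneralDistill",
    distiller=dict(
        type="SingleTeacherDistiller",
        components=[dict(student_module="bbox_head.gfl_cls")],
    ),
)

print(is_built(algorithm_cfg))  # False: a dict has no .architecture attribute
print("components" in without_components(algorithm_cfg)["distiller"])  # False
```

The original config is left untouched, so the components can be restored once the correct module names are known.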

1234532314342 commented 2 years ago

print(algorithm.architecture.model)

The same error is reported. How can I determine which output layer to print? The model prints far too much.

wutongshenqiu commented 2 years ago

Just specify the layer and print it as you would in PyTorch (this reference might be helpful), since both the teacher model and the student model inherit from nn.Module.

Also, I am not sure what "the same error" means here. Does it refer to the AttributeError or the KeyError? If it is the KeyError, did you comment out (or delete) the components attribute in the distill config as mentioned above?
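
Printing a single layer by its dotted name can be sketched as follows. This is an illustration under assumptions: `resolve_submodule` is a hypothetical helper, and the demo uses a stand-in object tree rather than a real model. Attribute lookup works the same way on an `nn.Module`, and recent PyTorch versions also provide `nn.Module.get_submodule` for this purpose.

```python
from functools import reduce
from types import SimpleNamespace


def resolve_submodule(root, dotted_name):
    """Follow a dotted name such as 'bbox_head.reg_branches' starting at `root`."""
    return reduce(getattr, dotted_name.split("."), root)


# Stand-in object tree for the demo (not a real detector).
model = SimpleNamespace(
    bbox_head=SimpleNamespace(reg_branches="<reg branches>"),
)

print(resolve_submodule(model, "bbox_head.reg_branches"))  # prints <reg branches>
```

With a built algorithm, `root` would be `algorithm.architecture.model` or `algorithm.distiller.teacher`. An AttributeError from the helper means the dotted name does not exist in that model.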

1234532314342 commented 2 years ago

It refers to the AttributeError. I'm having a lot of trouble with this problem.

wutongshenqiu commented 2 years ago

Would you mind providing the full config and environment information so that we can do a better analysis?

1234532314342 commented 2 years ago

Yes. Could you even look at it remotely? Which remote-access software do you use?

wutongshenqiu commented 2 years ago

🤔 Hi, please share the config file (as text, or as an image like this) instead of just the error, if possible.

1234532314342 commented 2 years ago

data = dict(
    samples_per_gpu=1,
    workers_per_gpu=1,
    train=dict(
        type='CocoDataset',
        ann_file='D:\mmdetection-master\mmdet\data\labelme-data\coco_newDET1\annotations\train.json',
        img_prefix='D:\mmdetection-master\mmdet\data\labelme-data\coco_newDET1\train',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(type='RandomFlip', flip_ratio=0.5),
            dict(
                type='AutoAugment',
                policies=[[{
                    'type': 'Resize',
                    'img_scale': [(480, 1333), (512, 1333), (544, 1333),
                                  (576, 1333), (608, 1333), (640, 1333),
                                  (672, 1333), (704, 1333), (736, 1333),
                                  (768, 1333), (800, 1333)],
                    'multiscale_mode': 'value',
                    'keep_ratio': True
                }],
                [{
                    'type': 'Resize',
                    'img_scale': [(400, 4200), (500, 4200), (600, 4200)],
                    'multiscale_mode': 'value',
                    'keep_ratio': True
                }, {
                    'type': 'RandomCrop',
                    'crop_type': 'absolute_range',
                    'crop_size': (384, 600),
                    'allow_negative_crop': True
                }, {
                    'type': 'Resize',
                    'img_scale': [(480, 1333), (512, 1333), (544, 1333),
                                  (576, 1333), (608, 1333), (640, 1333),
                                  (672, 1333), (704, 1333), (736, 1333),
                                  (768, 1333), (800, 1333)],
                    'multiscale_mode': 'value',
                    'override': True,
                    'keep_ratio': True
                }]]),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=1),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
        ],
        filter_empty_gt=False),
    val=dict(
        type='CocoDataset',
        ann_file='D:\mmdetection-master\mmdet\data\labelme-data\coco_newDET1\annotations\val.json',
        img_prefix='D:\mmdetection-master\mmdet\data\labelme-data\coco_newDET1\val',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=1),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]),
    test=dict(
        type='CocoDataset',
        ann_file='D:\mmdetection-master\mmdet\data\labelme-data\coco_newDET1\annotations\test.json',
        img_prefix='D:\mmdetection-master\mmdet\data\labelme-data\coco_newDET1\test',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=1),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]))

evaluation = dict(interval=1, metric='bbox')
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=12)
checkpoint_config = dict(interval=1)
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
custom_hooks = [dict(type='NumClassCheckHook')]
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]

student = dict(
type='mmdet.DeformableDETR',
backbone=dict(
    type='ResNet',
    depth=18,
    num_stages=4,
    out_indices=(1, 2, 3),
    frozen_stages=1,
    norm_cfg=dict(type='BN', requires_grad=False),
    norm_eval=True,
    style='pytorch',
    dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False)),

init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),

neck=dict(
    type='ChannelMapper',
    in_channels=[512, 1024, 2048],
    kernel_size=1,
    out_channels=64,
    act_cfg=None,
    norm_cfg=dict(type='GN', num_groups=32),
    num_outs=4),
bbox_head=dict(
    type='DeformableDETRHead',
    num_query=300,
    num_classes=6,
    in_channels=512,
    sync_cls_avg_factor=True,
    as_two_stage=False,
    transformer=dict(
        type='DeformableDetrTransformer',
        encoder=dict(
            type='DetrTransformerEncoder',
            num_layers=6,
            transformerlayers=dict(
                type='BaseTransformerLayer',
                attn_cfgs=dict(
                    type='MultiScaleDeformableAttention', embed_dims=256),
                feedforward_channels=256,
                ffn_dropout=0.1,
                operation_order=('self_attn', 'norm', 'ffn', 'norm'))),
        decoder=dict(
            type='DeformableDetrTransformerDecoder',
            num_layers=6,
            return_intermediate=True,
            transformerlayers=dict(
                type='DetrTransformerDecoderLayer',
                attn_cfgs=[
                    dict(
                        type='MultiheadAttention',
                        embed_dims=256,
                        num_heads=8,
                        dropout=0.1),
                    dict(
                        type='MultiScaleDeformableAttention',
                        embed_dims=256)
                ],
                feedforward_channels=256,
                ffn_dropout=0.1,
                operation_order=('self_attn', 'norm', 'cross_attn', 'norm',
                                 'ffn', 'norm')))),
    positional_encoding=dict(
        type='SinePositionalEncoding',
        num_feats=128,
        normalize=True,
        offset=-0.5),
    loss_cls=dict(
        type='FocalLoss',
        use_sigmoid=True,
        gamma=2.0,
        alpha=0.25,
        loss_weight=2.0),
    loss_bbox=dict(type='L1Loss', loss_weight=5.0),
    loss_iou=dict(type='GIoULoss', loss_weight=2.0)),
train_cfg=dict(
    augments=dict(type='BatchMixup', alpha=0.2, num_classes=102,
                  prob=1.),

    assigner=dict(
        type='HungarianAssigner',
        cls_cost=dict(type='FocalLossCost', weight=2.0),
        reg_cost=dict(type='BBoxL1Cost', weight=5.0, box_format='xywh'),
        iou_cost=dict(type='IoUCost', iou_mode='giou', weight=2.0))),
test_cfg=dict(max_per_img=100))

teacher = dict(
type='mmdet.DeformableDETR',
backbone=dict(
    type='ResNet',
    depth=50,
    num_stages=4,
    out_indices=(1, 2, 3),
    frozen_stages=1,
    norm_cfg=dict(type='BN', requires_grad=False),
    norm_eval=True,
    style='pytorch',
    dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False)),

init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),

neck=dict(
    type='ChannelMapper',
    in_channels=[512, 1024, 2048],
    kernel_size=1,
    out_channels=256,
    act_cfg=None,
    norm_cfg=dict(type='GN', num_groups=32),
    num_outs=4),
bbox_head=dict(
    type='DeformableDETRHead',
    num_query=300,
    num_classes=6,
    in_channels=2048,
    sync_cls_avg_factor=True,
    as_two_stage=False,
    transformer=dict(
        type='DeformableDetrTransformer',
        encoder=dict(
            type='DetrTransformerEncoder',
            num_layers=6,
            transformerlayers=dict(
                type='BaseTransformerLayer',
                attn_cfgs=dict(
                    type='MultiScaleDeformableAttention', embed_dims=256),
                feedforward_channels=1024,
                ffn_dropout=0.1,
                operation_order=('self_attn', 'norm', 'ffn', 'norm'))),
        decoder=dict(
            type='DeformableDetrTransformerDecoder',
            num_layers=6,
            return_intermediate=True,
            transformerlayers=dict(
                type='DetrTransformerDecoderLayer',
                attn_cfgs=[
                    dict(
                        type='MultiheadAttention',
                        embed_dims=256,
                        num_heads=8,
                        dropout=0.1),
                    dict(
                        type='MultiScaleDeformableAttention',
                        embed_dims=256)
                ],
                feedforward_channels=1024,
                ffn_dropout=0.1,
                operation_order=('self_attn', 'norm', 'cross_attn', 'norm',
                                 'ffn', 'norm')))),
    positional_encoding=dict(
        type='SinePositionalEncoding',
        num_feats=128,
        normalize=True,
        offset=-0.5),
    loss_cls=dict(
        type='FocalLoss',
        use_sigmoid=True,
        gamma=2.0,
        alpha=0.25,
        loss_weight=2.0),
    loss_bbox=dict(type='L1Loss', loss_weight=5.0),
    loss_iou=dict(type='GIoULoss', loss_weight=2.0)))

algorithm = dict(
type='GeneralDistill',
architecture=dict(
    type='MMDetArchitecture',
    model=student),
distiller=dict(
    type='SingleTeacherDistiller',
    teacher=teacher,
    teacher_trainable=False,

    components=[
        dict(
            student_module='bbox_head.reg_branches.0.4',
            teacher_module='bbox_head.reg_branches.0.4',
            losses=[
                dict(
                    type='ChannelWiseDivergence',
                    name='loss_cwd_cls_head',
                    tau=1,
                    loss_weight=5)
            ])
    ]))

bbox_head.reg_branches.0.4

find_unused_parameters = True
work_dir = './work_dirs/cwd_cls_head_gfl_r101_fpn_gfl_r50_fpn_1x_coco'
auto_resume = False
gpu_ids = [0]

I'm sorry, I'm new to this. Is that right?

wutongshenqiu commented 2 years ago

It seems that train_cfg is missing from your student config. I am not sure whether this causes the error.

1234532314342 commented 2 years ago

The train_cfg is in my student config:

train_cfg=dict(
    augments=dict(type='BatchMixup', alpha=0.2, num_classes=102, prob=1.),

    assigner=dict(
        type='HungarianAssigner',
        cls_cost=dict(type='FocalLossCost', weight=2.0),
        reg_cost=dict(type='BBoxL1Cost', weight=5.0, box_format='xywh'),
        iou_cost=dict(type='IoUCost', iou_mode='giou', weight=2.0)))

wutongshenqiu commented 2 years ago

Sorry, I got it wrong. It is the teacher config that seems to be missing train_cfg.

Besides, please make sure your teacher model can run a forward pass correctly before distillation.

1234532314342 commented 2 years ago

Does the teacher model need train_cfg?

pppppM commented 2 years ago

@wutongshenqiu This issue needs a reply.