Open 1234532314342 opened 2 years ago
This represents where you want to calculate the distillation loss in the model. If you modify the teacher or student, you also need to modify the module name accordingly.
Hello, how should I modify it? I see that the official example has not changed the name of this piece.
Hi, 1234532314342. Let's take this config as an example.
The value of student_module (https://github.com/open-mmlab/mmrazor/blob/71a196490bec6864669fdd06d72903e84bb6a382/configs/distill/cwd/cwd_cls_head_gfl_r101_fpn_gfl_r50_fpn_1x_coco.py#L128) must correspond to a module name in the student model (https://github.com/open-mmlab/mmrazor/blob/71a196490bec6864669fdd06d72903e84bb6a382/configs/distill/cwd/cwd_cls_head_gfl_r101_fpn_gfl_r50_fpn_1x_coco.py#L27), and the same holds for teacher_module (https://github.com/open-mmlab/mmrazor/blob/71a196490bec6864669fdd06d72903e84bb6a382/configs/distill/cwd/cwd_cls_head_gfl_r101_fpn_gfl_r50_fpn_1x_coco.py#L129).
So, if you change the student config or the teacher config, student_module and teacher_module in the distiller config may need to change accordingly.
Besides, if it is difficult to work out the module names from the model config alone, you can print the models after building the algorithm:
# print student model
print(algorithm.architecture.model)
# print teacher model
print(algorithm.distiller.teacher)
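As a minimal sketch of where such module names come from: the toy classes below are hypothetical stand-ins, not the real detector, and only the standard PyTorch `named_modules()` call is assumed. The dotted name of a layer is simply the chain of attribute names and container indices that leads to it.

```python
import torch.nn as nn


class ToyHead(nn.Module):
    """Hypothetical stand-in for a detection head with per-layer branches."""

    def __init__(self):
        super().__init__()
        # Two branches; inside each Sequential, index 4 is the last Linear.
        self.reg_branches = nn.ModuleList([
            nn.Sequential(
                nn.Linear(8, 8), nn.ReLU(),
                nn.Linear(8, 8), nn.ReLU(),
                nn.Linear(8, 4),
            )
            for _ in range(2)
        ])


class ToyDetector(nn.Module):
    """Hypothetical stand-in for a built student or teacher model."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 8, kernel_size=3)
        self.bbox_head = ToyHead()


model = ToyDetector()

# named_modules() yields (dotted_name, module) pairs; the root has an empty name.
module_names = [name for name, _ in model.named_modules() if name]

# Print only the head to keep the output short.
head_names = [n for n in module_names if n.startswith("bbox_head")]
for n in head_names:
    print(n)
```

On a real model, the same loop over `algorithm.architecture.model.named_modules()` should list the names that are valid candidates for student_module.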
A new problem appears when printing: AttributeError: 'dict' object has no attribute 'architecture'
Did you print after building the algorithm (that is, after the following line)? https://github.com/open-mmlab/mmrazor/blob/71a196490bec6864669fdd06d72903e84bb6a382/tools/mmdet/train_mmdet.py?rgh-link-date=2022-06-07T14%3A25%3A16Z#L183
You can try commenting out the components attribute to avoid the above KeyError when printing.
https://github.com/open-mmlab/mmrazor/blob/71a196490bec6864669fdd06d72903e84bb6a382/configs/distill/cwd/cwd_cls_head_gfl_r101_fpn_gfl_r50_fpn_1x_coco.py?rgh-link-date=2022-06-07T14%3A53%3A55Z#L126-L135
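One plausible reading of the AttributeError, assuming the print was run against the `algorithm` dict from the config rather than against the object returned by the build call, can be reproduced in plain Python. `build_stub` below is a hypothetical stand-in for the real builder and is not part of any library.

```python
from types import SimpleNamespace

# What a config file defines: a plain dict that only describes the algorithm.
algorithm_cfg = dict(
    type="GeneralDistill",
    architecture=dict(type="MMDetArchitecture", model=dict(type="ToyStudent")),
)


def build_stub(cfg):
    """Hypothetical builder: turns the config dict into an object with attributes."""
    model = f"<built {cfg['architecture']['model']['type']}>"
    return SimpleNamespace(architecture=SimpleNamespace(model=model))


error_message = None
try:
    algorithm_cfg.architecture  # dicts expose keys, not attributes
except AttributeError as err:
    error_message = str(err)
    print(error_message)

# After building, attribute access works as in the suggested print statement.
algorithm = build_stub(algorithm_cfg)
print(algorithm.architecture.model)
```

If this is the cause, the print statements belong in the training script after the build call, not in the config file.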
print(algorithm.architecture.model)
The same error is reported. How can I determine which output layer to print? The model prints out too much.
Just specify the layer and print it as you would in PyTorch (this reference might be helpful), since the teacher and student models inherit from nn.Module.
Also, I am not sure what "the same error" means here. Does it refer to the AttributeError or the KeyError? If it is the KeyError, did you comment out (or delete) the components attribute in the distill config as referred to above?
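Printing a single layer instead of the whole model can be sketched as follows. The toy model is a hypothetical stand-in; `named_modules()` is standard PyTorch, and `get_submodule()` is assumed to be available, which holds for recent PyTorch releases.

```python
import torch.nn as nn

# Hypothetical toy model; in practice this would be the built student or teacher.
model = nn.Module()
model.bbox_head = nn.Module()
model.bbox_head.reg_branches = nn.ModuleList(
    [nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 4))]
)

# Option 1: build a name -> module mapping and index it by dotted name.
name2module = dict(model.named_modules())
layer = name2module["bbox_head.reg_branches.0.2"]
print(layer)

# Option 2: ask the model directly for the submodule with that dotted name.
same_layer = model.get_submodule("bbox_head.reg_branches.0.2")
print(same_layer is layer)
```

A name that is missing from `name2module` raises a KeyError, which is a quick way to check a candidate student_module or teacher_module value.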
It refers to the AttributeError. I'm having too much trouble with this problem.
Would you mind providing the full config and environment information so that we can do a better analysis?
Yes. Could you even take a look remotely? Which remote desktop software do you use?
🤔 Hi, please share the config file, like this, instead of just the error, if possible.
data = dict(
    samples_per_gpu=1,
    workers_per_gpu=1,
    train=dict(
        type='CocoDataset',
        ann_file='D:\mmdetection-master\mmdet\data\labelme-data\coco_newDET1\annotations\train.json',
        img_prefix='D:\mmdetection-master\mmdet\data\labelme-data\coco_newDET1\train',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(type='RandomFlip', flip_ratio=0.5),
            dict(
                type='AutoAugment',
                policies=[[{
                    'type': 'Resize',
                    'img_scale': [(480, 1333), (512, 1333), (544, 1333),
                                  (576, 1333), (608, 1333), (640, 1333),
                                  (672, 1333), (704, 1333), (736, 1333),
                                  (768, 1333), (800, 1333)],
                    'multiscale_mode': 'value',
                    'keep_ratio': True
                }], [{
                    'type': 'Resize',
                    'img_scale': [(400, 4200), (500, 4200), (600, 4200)],
                    'multiscale_mode': 'value',
                    'keep_ratio': True
                }, {
                    'type': 'RandomCrop',
                    'crop_type': 'absolute_range',
                    'crop_size': (384, 600),
                    'allow_negative_crop': True
                }, {
                    'type': 'Resize',
                    'img_scale': [(480, 1333), (512, 1333), (544, 1333),
                                  (576, 1333), (608, 1333), (640, 1333),
                                  (672, 1333), (704, 1333), (736, 1333),
                                  (768, 1333), (800, 1333)],
                    'multiscale_mode': 'value',
                    'override': True,
                    'keep_ratio': True
                }]]),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=1),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
        ],
        filter_empty_gt=False),
    val=dict(
        type='CocoDataset',
        ann_file='D:\mmdetection-master\mmdet\data\labelme-data\coco_newDET1\annotations\val.json',
        img_prefix='D:\mmdetection-master\mmdet\data\labelme-data\coco_newDET1\val',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=1),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]),
    test=dict(
        type='CocoDataset',
        ann_file='D:\mmdetection-master\mmdet\data\labelme-data\coco_newDET1\annotations\test.json',
        img_prefix='D:\mmdetection-master\mmdet\data\labelme-data\coco_newDET1\test',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1333, 800),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=1),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]))
evaluation = dict(interval=1, metric='bbox')
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=12)
checkpoint_config = dict(interval=1)
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
custom_hooks = [dict(type='NumClassCheckHook')]
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
student = dict(
    type='mmdet.DeformableDETR',
    backbone=dict(
        type='ResNet',
        depth=18,
        num_stages=4,
        out_indices=(1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=False),
        norm_eval=True,
        style='pytorch',
        dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False)),
neck=dict(
type='ChannelMapper',
in_channels=[512, 1024, 2048],
kernel_size=1,
out_channels=64,
act_cfg=None,
norm_cfg=dict(type='GN', num_groups=32),
num_outs=4),
bbox_head=dict(
type='DeformableDETRHead',
num_query=300,
num_classes=6,
in_channels=512,
sync_cls_avg_factor=True,
as_two_stage=False,
transformer=dict(
type='DeformableDetrTransformer',
encoder=dict(
type='DetrTransformerEncoder',
num_layers=6,
transformerlayers=dict(
type='BaseTransformerLayer',
attn_cfgs=dict(
type='MultiScaleDeformableAttention', embed_dims=256),
feedforward_channels=256,
ffn_dropout=0.1,
operation_order=('self_attn', 'norm', 'ffn', 'norm'))),
decoder=dict(
type='DeformableDetrTransformerDecoder',
num_layers=6,
return_intermediate=True,
transformerlayers=dict(
type='DetrTransformerDecoderLayer',
attn_cfgs=[
dict(
type='MultiheadAttention',
embed_dims=256,
num_heads=8,
dropout=0.1),
dict(
type='MultiScaleDeformableAttention',
embed_dims=256)
],
feedforward_channels=256,
ffn_dropout=0.1,
operation_order=('self_attn', 'norm', 'cross_attn', 'norm',
'ffn', 'norm')))),
positional_encoding=dict(
type='SinePositionalEncoding',
num_feats=128,
normalize=True,
offset=-0.5),
loss_cls=dict(
type='FocalLoss',
use_sigmoid=True,
gamma=2.0,
alpha=0.25,
loss_weight=2.0),
loss_bbox=dict(type='L1Loss', loss_weight=5.0),
loss_iou=dict(type='GIoULoss', loss_weight=2.0)),
train_cfg=dict(
augments=dict(type='BatchMixup', alpha=0.2, num_classes=102,
prob=1.),
assigner=dict(
type='HungarianAssigner',
cls_cost=dict(type='FocalLossCost', weight=2.0),
reg_cost=dict(type='BBoxL1Cost', weight=5.0, box_format='xywh'),
iou_cost=dict(type='IoUCost', iou_mode='giou', weight=2.0))),
test_cfg=dict(max_per_img=100))
teacher = dict(
    type='mmdet.DeformableDETR',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=False),
        norm_eval=True,
        style='pytorch',
        dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False)),
neck=dict(
type='ChannelMapper',
in_channels=[512, 1024, 2048],
kernel_size=1,
out_channels=256,
act_cfg=None,
norm_cfg=dict(type='GN', num_groups=32),
num_outs=4),
bbox_head=dict(
type='DeformableDETRHead',
num_query=300,
num_classes=6,
in_channels=2048,
sync_cls_avg_factor=True,
as_two_stage=False,
transformer=dict(
type='DeformableDetrTransformer',
encoder=dict(
type='DetrTransformerEncoder',
num_layers=6,
transformerlayers=dict(
type='BaseTransformerLayer',
attn_cfgs=dict(
type='MultiScaleDeformableAttention', embed_dims=256),
feedforward_channels=1024,
ffn_dropout=0.1,
operation_order=('self_attn', 'norm', 'ffn', 'norm'))),
decoder=dict(
type='DeformableDetrTransformerDecoder',
num_layers=6,
return_intermediate=True,
transformerlayers=dict(
type='DetrTransformerDecoderLayer',
attn_cfgs=[
dict(
type='MultiheadAttention',
embed_dims=256,
num_heads=8,
dropout=0.1),
dict(
type='MultiScaleDeformableAttention',
embed_dims=256)
],
feedforward_channels=1024,
ffn_dropout=0.1,
operation_order=('self_attn', 'norm', 'cross_attn', 'norm',
'ffn', 'norm')))),
positional_encoding=dict(
type='SinePositionalEncoding',
num_feats=128,
normalize=True,
offset=-0.5),
loss_cls=dict(
type='FocalLoss',
use_sigmoid=True,
gamma=2.0,
alpha=0.25,
loss_weight=2.0),
loss_bbox=dict(type='L1Loss', loss_weight=5.0),
loss_iou=dict(type='GIoULoss', loss_weight=2.0)))
algorithm = dict(
    type='GeneralDistill',
    architecture=dict(
        type='MMDetArchitecture',
        model=student),
    distiller=dict(
        type='SingleTeacherDistiller',
        teacher=teacher,
        teacher_trainable=False,
components=[
dict(
student_module='bbox_head.reg_branches.0.4',
teacher_module='bbox_head.reg_branches.0.4',
losses=[
dict(
type='ChannelWiseDivergence',
name='loss_cwd_cls_head',
tau=1,
loss_weight=5)
])
]))
find_unused_parameters = True
work_dir = './work_dirs/cwd_cls_head_gfl_r101_fpn_gfl_r50_fpn_1x_coco'
auto_resume = False
gpu_ids = [0]
I'm sorry. I'm new to this. Is that right?
It seems that train_cfg is missing in your student config. I am not sure whether this causes the error.
The train_cfg is in my student config:
train_cfg=dict(
augments=dict(type='BatchMixup', alpha=0.2, num_classes=102, prob=1.),
assigner=dict(
type='HungarianAssigner',
cls_cost=dict(type='FocalLossCost', weight=2.0),
reg_cost=dict(type='BBoxL1Cost', weight=5.0, box_format='xywh'),
iou_cost=dict(type='IoUCost', iou_mode='giou', weight=2.0)))
Sorry, I got it wrong: it seems that train_cfg is missing in the teacher config.
Besides, please make sure your teacher model can run a forward pass properly before distillation.
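A forward-pass check of this kind might look like the sketch below. The small `teacher` network, the input shape, and the module name "2" are all hypothetical stand-ins; only standard PyTorch calls (`register_forward_hook`, `torch.no_grad`) are used. A hook on the chosen module confirms that it actually runs during forward and shows the shape of its output.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the built teacher model.
teacher = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
teacher.eval()

captured = {}


def save_output(module, inputs, output):
    """Forward hook: record the output shape of the hooked module."""
    captured["shape"] = tuple(output.shape)


# Look up the module by its dotted name, here the last Linear layer.
target = dict(teacher.named_modules())["2"]
handle = target.register_forward_hook(save_output)

# Run one dummy batch without tracking gradients.
with torch.no_grad():
    teacher(torch.randn(2, 16))

handle.remove()
print(captured)
```

If the forward pass raises, or `captured` stays empty, the problem lies in the teacher model or the chosen module name rather than in the distillation loss.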
Does the teacher model need train_cfg?
@wutongshenqiu This issue needs a reply.
2022-06-07 10:58:14,781 - mmdet - INFO - Set random seed to 979972066, deterministic: False
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\mmcv\utils\registry.py", line 52, in build_from_cfg
    return obj_cls(**args)
  File "D:\mmrazor-master\mmrazor-master\mmrazor\models\algorithms\general_distill.py", line 23, in __init__
    super(GeneralDistill, self).__init__(**kwargs)
  File "D:\mmrazor-master\mmrazor-master\mmrazor\models\algorithms\base.py", line 57, in __init__
    self._init_distiller(distiller)
  File "D:\mmrazor-master\mmrazor-master\mmrazor\models\algorithms\base.py", line 135, in _init_distiller
    self.distiller.prepare_from_student(self.architecture)
  File "D:\mmrazor-master\mmrazor-master\mmrazor\models\distillers\single_teacher.py", line 123, in prepare_from_student
    student_module = self.student_name2module[student_module_name]
KeyError: 'bbox_head.gfl_cls'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/mmrazor-master/mmrazor-master/tools/mmdet/train_mmdet.py", line 214, in <module>
    main()
  File "D:/mmrazor-master/mmrazor-master/tools/mmdet/train_mmdet.py", line 187, in main
    algorithm = build_algorithm(cfg.algorithm)
  File "D:\mmrazor-master\mmrazor-master\mmrazor\models\builder.py", line 20, in build_algorithm
    return ALGORITHMS.build(cfg)
  File "C:\ProgramData\Anaconda3\lib\site-packages\mmcv\utils\registry.py", line 212, in build
    return self.build_func(*args, **kwargs, registry=self)
  File "C:\ProgramData\Anaconda3\lib\site-packages\mmcv\cnn\builder.py", line 27, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\mmcv\utils\registry.py", line 55, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
KeyError: "GeneralDistill: 'bbox_head.gfl_cls'"
I don't know why this happens. I hope to get some help.
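The traceback shows the lookup `self.student_name2module[student_module_name]` failing for 'bbox_head.gfl_cls', which suggests that this name does not exist in the student model. A small pre-check along the following lines could surface that before training starts. `check_module_names` and the list of available names are hypothetical; in practice the names would come from `named_modules()` on the built student or teacher.

```python
import difflib


def check_module_names(available, wanted):
    """Return {missing_name: [close matches]} for names not found in `available`."""
    available = list(available)
    problems = {}
    for name in wanted:
        if name not in available:
            # Suggest up to three similar names to make typos easy to spot.
            problems[name] = difflib.get_close_matches(
                name, available, n=3, cutoff=0.5)
    return problems


# Hypothetical module names, as they might be listed for a student model.
available_names = [
    "bbox_head",
    "bbox_head.cls_branches.0",
    "bbox_head.reg_branches.0",
    "bbox_head.reg_branches.0.4",
]

problems = check_module_names(
    available_names,
    ["bbox_head.gfl_cls", "bbox_head.reg_branches.0.4"],
)
for name, suggestions in problems.items():
    print(f"{name!r} not found; close matches: {suggestions}")
```

Any name reported here would need to be replaced in student_module or teacher_module with one that the model actually contains.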