HikariTJU / LD

Localization Distillation for Object Detection (CVPR 2022, TPAMI 2023)
Apache License 2.0

Problem after changing LD's backbone to ShuffleNetV2 #53

Open cape-zck opened 1 year ago

cape-zck commented 1 year ago

Hello, I tried replacing the student's backbone with ShuffleNetV2 and distilling with ResNet-50 as the teacher, using GFL as the detection head. The result seems to be even worse than training GFL with a ShuffleNetV2 backbone from scratch. Have you run similar experiments with heterogeneous-backbone distillation?

HikariTJU commented 1 year ago

Could you post the log so I can take a look?

cape-zck commented 1 year ago

> Could you post the log so I can take a look?

Sure. I've been wondering whether the issue might be the number of FPN fusion levels I set for the ShuffleNetV2 student network. I'm still new to this area, so there may be things I haven't understood properly, and I'd appreciate your guidance.

```
2022-11-25 11:53:00,267 - mmdet - INFO - Environment info:

sys.platform: linux
Python: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:18) [GCC 10.3.0]
CUDA available: True
GPU 0: Tesla T4
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.2, V11.2.152
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
PyTorch: 1.12.0
PyTorch compiling details: PyTorch built with:

TorchVision: 0.2.2
OpenCV: 4.6.0
MMCV: 1.6.2
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.2
MMDetection: 2.25.3+e71b499

2022-11-25 11:53:01,841 - mmdet - INFO - Distributed training: False
2022-11-25 11:53:03,406 - mmdet - INFO - Config:
dataset_type = 'CocoDataset'
data_root = '/home/studio-lab-user/mmdetection/data/minicoco/'
img_norm_cfg = dict( mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [ dict(type='LoadImageFromFile'), dict(type='LoadAnnotations', with_bbox=True), dict(type='Resize', img_scale=(1333, 800), keep_ratio=True), dict(type='RandomFlip', flip_ratio=0.5), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='DefaultFormatBundle'), dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']) ]
test_pipeline = [ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1333, 800), flip=False, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='ImageToTensor', keys=['img']), dict(type='Collect', keys=['img']) ]) ]
data = dict(
    samples_per_gpu=16,
    workers_per_gpu=1,
    train=dict( type='CocoDataset', ann_file= '/home/studio-lab-user/mmdetection/data/minicoco/annotations/train.json', img_prefix='/home/studio-lab-user/mmdetection/data/minicoco/train2017/', pipeline=[ dict(type='LoadImageFromFile'), dict(type='LoadAnnotations', with_bbox=True), dict(type='Resize', img_scale=(1333, 800), keep_ratio=True), dict(type='RandomFlip', flip_ratio=0.5), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='DefaultFormatBundle'), dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']) ]),
    val=dict( type='CocoDataset', ann_file= '/home/studio-lab-user/mmdetection/data/minicoco/annotations/instances_val2017.json', img_prefix='/home/studio-lab-user/mmdetection/data/minicoco/val2017/', pipeline=[ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1333, 800), flip=False, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='ImageToTensor', keys=['img']), dict(type='Collect', keys=['img']) ]) ]),
    test=dict( type='CocoDataset', ann_file= '/home/studio-lab-user/mmdetection/data/minicoco/annotations/instances_val2017.json', img_prefix='/home/studio-lab-user/mmdetection/data/minicoco/val2017/', pipeline=[ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1333, 800), flip=False, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='ImageToTensor', keys=['img']), dict(type='Collect', keys=['img']) ]) ]))
evaluation = dict(interval=1, metric='bbox')
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
lr_config = dict( policy='step', warmup='linear', warmup_iters=500, warmup_ratio=0.001, step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=12)
checkpoint_config = dict(interval=1)
log_config = dict( interval=50, hooks=[dict(type='TextLoggerHook'), dict(type='TensorboardLoggerHook')])
custom_hooks = [dict(type='NumClassCheckHook')]
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
opencv_num_threads = 0
mp_start_method = 'fork'
auto_scale_lr = dict(enable=False, base_batch_size=16)
teacher_ckpt = '~/mmdetection/tools/work_dirs/gfl_r50_fpn_1x_coco/minicocor50.pth'
shufflenetv2_pretrain = 'https://download.openmmlab.com/mmclassification/v0/shufflenet_v2/shufflenet_v2_batch1024_imagenet_20200812-5bf4721e.pth'
model = dict(
    type='KnowledgeDistillationSingleStageDetector',
    teacher_config='../configs/gfl/gfl_r50_fpn_mstrain_2x_coco.py',
    teacher_ckpt= '~/mmdetection/tools/work_dirs/gfl_r50_fpn_1x_coco/minicocor50.pth',
    backbone=dict( widen_factor=1.0, type='ShuffleNetV2', out_indices=(0, 1, 2, 3), norm_eval=True, init_cfg=dict( type='Pretrained', checkpoint= 'https://download.openmmlab.com/mmclassification/v0/shufflenet_v2/shufflenet_v2_batch1024_imagenet_20200812-5bf4721e.pth', prefix='backbone.')),
    neck=dict( type='FPN', in_channels=[116, 232, 464, 1024], out_channels=256, start_level=1, add_extra_convs='on_output', num_outs=5),
    bbox_head=dict( type='LDHead', num_classes=80, in_channels=256, stacked_convs=4, feat_channels=256, anchor_generator=dict( type='AnchorGenerator', ratios=[1.0], octave_base_scale=8, scales_per_octave=1, strides=[8, 16, 32, 64, 128]), loss_cls=dict( type='QualityFocalLoss', use_sigmoid=True, beta=2.0, loss_weight=1.0), loss_dfl=dict(type='DistributionFocalLoss', loss_weight=0.25), loss_ld=dict( type='KnowledgeDistillationKLDivLoss', loss_weight=0.25, T=10), reg_max=16, loss_bbox=dict(type='GIoULoss', loss_weight=2.0)),
    train_cfg=dict( assigner=dict(type='ATSSAssigner', topk=9), allowed_border=-1, pos_weight=-1, debug=False),
    test_cfg=dict( nms_pre=1000, min_bbox_size=0, score_thr=0.05, nms=dict(type='nms', iou_threshold=0.6), max_per_img=100))
train_dataset = dict( ann_file= '/home/studio-lab-user/mmdetection/data/minicoco/annotations/train.json', img_prefix='/home/studio-lab-user/mmdetection/data/minicoco/train2017/')
work_dir = './work_dirs/ld_shufflenetv2_gflv1_r50_fpn_minicoco_1x'
auto_resume = False
gpu_ids = [0]
```
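On the FPN question raised above, one quick way to check whether the neck's `in_channels` line up with what the ShuffleNetV2 student actually emits for `out_indices=(0, 1, 2, 3)` is to run a dummy forward pass through the backbone alone and inspect the feature shapes. This is only an illustrative sketch, not something from the thread; it assumes the ShuffleNetV2 backbone used in the config is registered in mmdet's backbone registry.

```python
import torch
from mmdet.models import build_backbone

# Illustrative sanity check (assumption: the ShuffleNetV2 backbone from the
# config above is available in mmdet's BACKBONES registry).
backbone = build_backbone(
    dict(type='ShuffleNetV2', widen_factor=1.0, out_indices=(0, 1, 2, 3)))
backbone.eval()

with torch.no_grad():
    feats = backbone(torch.randn(1, 3, 224, 224))

# FPN's in_channels must list exactly the channel dimensions (dim 1) seen here,
# in this order; the config's start_level=1 means the first map is unused by the neck.
print([tuple(f.shape) for f in feats])
```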

HikariTJU commented 1 year ago

samples_per_gpu is too large. Change it to 2, adjust the LR accordingly, and train again.
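For context, a minimal sketch of that change, assuming the common linear LR scaling rule in which the base lr=0.01 was tuned for a total batch size of 16; the scaled value below is an assumption, not part of the reply.

```python
# Hypothetical adjustment following the advice above (linear LR scaling assumed).
data = dict(
    samples_per_gpu=2,   # was 16 on the single T4
    workers_per_gpu=1)

# Total batch size drops from 16 to 2, so scale the LR by 2/16: 0.01 * 2 / 16 = 0.00125.
optimizer = dict(type='SGD', lr=0.00125, momentum=0.9, weight_decay=0.0001)

# Alternatively, let mmdet rescale the LR relative to base_batch_size:
# auto_scale_lr = dict(enable=True, base_batch_size=16)
```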

cape-zck commented 1 year ago

> samples_per_gpu is too large. Change it to 2, adjust the LR accordingly, and train again.

Okay, thanks.