jbwang1997 / OBBDetection

OBBDetection is an oriented object detection library based on MMDetection.
Apache License 2.0
522 stars 112 forks

Question about DoubleHead + Oriented R-CNN #55

Closed sherwincn closed 2 years ago

sherwincn commented 2 years ago

We ran into exploding gradients while implementing DoubleHead + Oriented R-CNN. The config file is as follows:

_base_ = './faster_rcnn_orpn_r50_fpn_1x_dota10.py'

# dataset
dataset_type = 'DOTADataset'
data_root = '/private/data/ori_DOTA10_ms/split_ms_dota1_0/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadOBBAnnotations', with_bbox=True,
         with_label=True, with_poly_as_mask=True),
    dict(type='LoadDOTASpecialInfo'),
    dict(type='Resize', img_scale=(1024, 1024), keep_ratio=True),
    dict(type='OBBRandomFlip', h_flip_ratio=0.5, v_flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='RandomOBBRotate', rotate_after_flip=True,
         angles=(0, 90), vert_rate=0.5,
         vert_cls=['roundabout', 'storage-tank']),
    dict(type='Pad', size_divisor=32),
    dict(type='DOTASpecialIgnore', ignore_size=2),
    dict(type='FliterEmpty'),
    dict(type='Mask2OBB', obb_type='obb'),
    dict(type='OBBDefaultFormatBundle'),
    dict(type='OBBCollect',
         keys=['img', 'gt_bboxes', 'gt_obboxes', 'gt_labels'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipRotateAug',
        img_scale=[(1024, 1024)],
        h_flip=False,
        v_flip=False,
        rotate=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='OBBRandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='RandomOBBRotate', rotate_after_flip=True),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='OBBCollect', keys=['img']),
        ])
]

model = dict(
    roi_head=dict(
        type='OBBDoubleHeadRoIHead',
        reg_roi_scale_factor=1.2,
        bbox_head=dict(
            _delete_=True,
            type='OBBDoubleConvFCBBoxHead',
            start_bbox_type='obb',
            end_bbox_type='obb',
            num_convs=4,
            num_fcs=2,
            in_channels=256,
            conv_out_channels=1024,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=15,
            bbox_coder=dict(
                type='OBB2OBBDeltaXYWHTCoder',
                target_means=[0., 0., 0., 0., 0.],
                target_stds=[0.1, 0.1, 0.2, 0.2, 1]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=2.0),
            loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=2.0))))

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=1,
    train=dict(
        type=dataset_type,
        task='Task1',
        ann_file=data_root + 'trainval/annfiles/',
        img_prefix=data_root + 'trainval/images/',
        pipeline=train_pipeline),
    test=dict(
        type=dataset_type,
        task='Task1',
        ann_file=data_root + 'test/annfiles/',
        img_prefix=data_root + 'test/images/',
        pipeline=test_pipeline))
evaluation = None

jbwang1997 commented 2 years ago

Many factors can cause this. You could try reducing the loss_weight values in the double head.

sherwincn commented 2 years ago

When I reproduce your Faster R-CNN + DoubleHead with your default configuration, exploding gradients still occur on DOTA 1.0.

sherwincn commented 2 years ago

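With your config file dh_faster_rcnn_dota, training on a single GPU (lr=0.005, samples_per_gpu=2) does not produce exploding gradients, but training on 4 GPUs (lr=0.02, samples_per_gpu=2) does. Following your suggestion, I reduced the loss weight of the rcnn_head in the 4-GPU setting. Training then runs normally, but accuracy drops.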
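The two learning rates quoted here are consistent with scaling the rate linearly with the total batch size (2 images at 0.005, 8 images at 0.02). The sketch below shows only that arithmetic; the helper name `scaled_lr` is hypothetical and is not part of OBBDetection.

```python
def scaled_lr(base_lr, base_batch_size, num_gpus, samples_per_gpu):
    """Scale a reference learning rate linearly with total batch size.

    Hypothetical helper for illustration only.
    """
    total_batch_size = num_gpus * samples_per_gpu
    return base_lr * total_batch_size / base_batch_size


# Reference point from the thread: 1 GPU x 2 images with lr=0.005.
single_gpu_lr = scaled_lr(0.005, base_batch_size=2, num_gpus=1, samples_per_gpu=2)
four_gpu_lr = scaled_lr(0.005, base_batch_size=2, num_gpus=4, samples_per_gpu=2)
print(single_gpu_lr, four_gpu_lr)
```

A rate that is four times larger also makes each early update four times larger, which is one plausible reason the 4-GPU run is less stable than the single-GPU run.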

jbwang1997 commented 2 years ago

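I have not run Double Head on multiple GPUs yet. If accuracy drops, you could keep the original learning rate and try increasing the number of warmup iterations to see whether the gradients still explode.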
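To make the warmup suggestion concrete, here is a minimal sketch that assumes a linear warmup of the kind used by MMDetection-style schedules. The function name and the values warmup_iters=2000 and warmup_ratio=0.001 are illustrative assumptions, not settings confirmed in this thread.

```python
def linear_warmup_lr(base_lr, cur_iter, warmup_iters, warmup_ratio=0.001):
    """Learning rate at iteration cur_iter under linear warmup.

    The rate starts at base_lr * warmup_ratio and rises linearly to
    base_lr over warmup_iters iterations. Illustrative helper only.
    """
    if cur_iter >= warmup_iters:
        return base_lr
    k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
    return base_lr * (1 - k)


# Hypothetical config override: a longer warmup at the unchanged 4-GPU lr.
# The numbers are assumptions chosen for illustration.
lr_config = dict(warmup='linear', warmup_iters=2000, warmup_ratio=0.001)

for it in (0, 500, 1000, 2000):
    lr = linear_warmup_lr(0.02, it,
                          lr_config['warmup_iters'],
                          lr_config['warmup_ratio'])
    print(it, lr)
```

With a longer warmup, the learning rate at any given early iteration is lower, so the randomly initialised double head sees smaller updates before the full rate of 0.02 is reached.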