Open charles3349 opened 3 years ago
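I converted an mmdetection v1.0 Cascade R-CNN model to v2 and fine-tuned it, then compared the result with fine-tuning directly in mmdet v1.0. With v2, the loss starts at 8.1 and the mAP after the first epoch is 74.8.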
    2021-07-24 16:36:45,713 - mmdet - INFO - Epoch [1][100/14009] lr: 4.653e-05, eta: 13 days, 22:43:14, time: 2.868, data_time: 0.802, memory: 25562, loss_rpn_cls: 0.0145, loss_rpn_bbox: 0.0251, s0.loss_cls: 4.3144, s0.acc: 45.0299, s0.loss_bbox: 0.1248, s1.loss_cls: 2.4995, s1.acc: 35.3806, s1.loss_bbox: 0.2038, s2.loss_cls: 1.3757, s2.acc: 26.1961, s2.loss_bbox: 0.1660, loss: 8.7236, grad_norm: 40.1798
    INFO:mmdet:Epoch [1][100/14009] lr: 4.653e-05, eta: 13 days, 22:43:14, time: 2.868, data_time: 0.802, memory: 25562, loss_rpn_cls: 0.0145, loss_rpn_bbox: 0.0251, s0.loss_cls: 4.3144, s0.acc: 45.0299, s0.loss_bbox: 0.1248, s1.loss_cls: 2.4995, s1.acc: 35.3806, s1.loss_bbox: 0.2038, s2.loss_cls: 1.3757, s2.acc: 26.1961, s2.loss_bbox: 0.1660, loss: 8.7236, grad_norm: 40.1798

When fine-tuning directly in v1.0, the loss starts at 0.6 and the mAP after the first epoch is 78, so training has already converged.

    2021-05-31 12:09:32,203 - INFO - Epoch [1][100/15757] lr: 0.00005, eta: 2 days, 18:25:47, time: 1.899, data_time: 0.274, memory: 16455, loss_rpn_cls: 0.0119, loss_rpn_bbox: 0.0203, s0.loss_cls: 0.1438, s0.acc: 94.4299, s0.loss_bbox: 0.0904, s1.loss_cls: 0.0782, s1.acc: 93.8354, s1.loss_bbox: 0.1670, s2.loss_cls: 0.0401, s2.acc: 93.6331, s2.loss_bbox: 0.1367, loss: 0.6883

The training data is the same for both runs. The v1 config is as follows:
    model = dict(
        type='CascadeRCNN',
        num_stages=3,
        #pretrained='/media/senter/0a0d2450-80d0-4ab4-87c4-45613e982e6b/Models/pretrain_models/open-mmlab/resnext101_64x4d-ee2c6f71.pth',
        backbone=dict(
            type='ResNeXt',
            depth=101,
            groups=64,
            base_width=4,
            num_stages=4,
            out_indices=(0, 1, 2, 3),
            frozen_stages=1,
            style='pytorch'),
        neck=dict(
            type='FPN',
            in_channels=[256, 512, 1024, 2048],
            out_channels=256,
            num_outs=5),
        rpn_head=dict(
            type='RPNHead',
            in_channels=256,
            feat_channels=256,
            anchor_scales=[8, 16, 32, 64],
            anchor_ratios=[0.5, 1.0, 2.0],
            anchor_strides=[4, 8, 16, 32, 64],
            target_means=[.0, .0, .0, .0],
            target_stds=[1.0, 1.0, 1.0, 1.0],
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
            loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', out_size=7, sample_num=2),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=[
            dict(
                type='SharedFCBBoxHead',
                num_fcs=2,
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=11,
                target_means=[0., 0., 0., 0.],
                target_stds=[0.1, 0.1, 0.2, 0.2],
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
            dict(
                type='SharedFCBBoxHead',
                num_fcs=2,
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=11,
                target_means=[0., 0., 0., 0.],
                target_stds=[0.05, 0.05, 0.1, 0.1],
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
            dict(
                type='SharedFCBBoxHead',
                num_fcs=2,
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=11,
                target_means=[0., 0., 0., 0.],
                target_stds=[0.033, 0.033, 0.067, 0.067],
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))
        ])
    # model training and testing settings
    train_cfg = dict(
        rpn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.3,
                min_pos_iou=0.3,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=256,
                pos_fraction=0.5,
                neg_pos_ub=-1,
                add_gt_as_proposals=False),
            allowed_border=0,
            pos_weight=-1,
            debug=False),
        rpn_proposal=dict(
            nms_across_levels=False,
            nms_pre=2000,
            nms_post=2000,
            max_num=2000,
            nms_thr=0.7,
            min_bbox_size=0),
        rcnn=[
            dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.4,
                    neg_iou_thr=0.4,
                    min_pos_iou=0.4,
                    ignore_iof_thr=-1),
                sampler=dict(
                    type='OHEMSampler',
                    num=512,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False),
            dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.5,
                    neg_iou_thr=0.5,
                    min_pos_iou=0.5,
                    ignore_iof_thr=-1),
                sampler=dict(
                    type='OHEMSampler',
                    num=512,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False),
            dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.6,
                    neg_iou_thr=0.6,
                    min_pos_iou=0.6,
                    ignore_iof_thr=-1),
                sampler=dict(
                    type='OHEMSampler',
                    num=512,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False)
        ],
        stage_loss_weights=[1, 0.5, 0.25])
    test_cfg = dict(
        rpn=dict(
            nms_across_levels=False,
            nms_pre=1000,
            nms_post=1000,
            max_num=1000,
            nms_thr=0.7,
            min_bbox_size=0),
        rcnn=dict(
            score_thr=0.05,
            nms=dict(type='soft_nms', iou_thr=0.5, min_score=0.05),
            max_per_img=100),
        keep_all_stages=False)
    # dataset settings
    dataset_type = 'VOCDataset'
    img_norm_cfg = dict(
        mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
    train_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
        dict(type='Resize', img_scale=[(1000, 800), (1600, 1200)], keep_ratio=True),
        dict(type='RandomFlip', flip_ratio=0.5),
        dict(type='Normalize', **img_norm_cfg),
        dict(type='Pad', size_divisor=32),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
    ]
    test_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(
            type='MultiScaleFlipAug',
            img_scale=(1600, 1200),
            flip=False,
            transforms=[
                dict(type='Resize', keep_ratio=True),
                dict(type='RandomFlip'),
                dict(type='Normalize', **img_norm_cfg),
                dict(type='Pad', size_divisor=32),
                dict(type='ImageToTensor', keys=['img']),
                dict(type='Collect', keys=['img']),
            ])
    ]
    data = dict(
        imgs_per_gpu=3,
        workers_per_gpu=16,
        train=dict(
            type=dataset_type,
            ann_file='/media/senter/disk2/ShuDianTongDao/00_TrainData/imagesets/20210531_train.txt',
            img_prefix='/media/senter',
            pipeline=train_pipeline),
        val=dict(
            type=dataset_type,
            ann_file='/media/senter/0a0d2450-80d0-4ab4-87c4-45613e982e6b/ShuDianYinHuanDataset/00_DatasetComplete/01TrainTestDataset/EvaluateDataset20200912-useless/FillNiaoHai16781/images.txt',
            img_prefix='/media/senter',
            pipeline=test_pipeline),
        test=dict(
            type=dataset_type,
            ann_file='/media/senter/0a0d2450-80d0-4ab4-87c4-45613e982e6b/ShuDianYinHuanDataset/00_DatasetComplete/01TrainTestDataset/EvaluateDataset20200912-useless/FillNiaoHai16781/images.txt',
            img_prefix='/media/senter',
            pipeline=test_pipeline))
    # optimizer
    optimizer = dict(type='SGD', lr=0.0001, momentum=0.9, weight_decay=0.0001)
    optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
    # learning policy
    lr_config = dict(
        policy='step',
        warmup='linear',
        warmup_iters=500,
        warmup_ratio=1.0 / 3,
        step=[3, 5])
    checkpoint_config = dict(interval=1)
    # yapf:disable
    log_config = dict(
        interval=100,
        hooks=[
            dict(type='TextLoggerHook'),
            dict(type='TensorboardLoggerHook')
        ])
    # yapf:enable
    # runtime settings
    total_epochs = 8
    dist_params = dict(backend='nccl')
    log_level = 'INFO'
    work_dir = '/media/senter/0a0d2450-80d0-4ab4-87c4-45613e982e6b/Models/trained_models/ShuDianTongDao/20210531_cascade_rcnn_x101_64x4d_fpn_old_10cls'
    load_from = None  #'/media/senter/0a0d2450-80d0-4ab4-87c4-45613e982e6b/Models/trained_models/ShuDianTongDao/20210227_cascade_rcnn_x101_64x4d_fpn_old_10cls/epoch_13.pth'
    resume_from = '/media/senter/0a0d2450-80d0-4ab4-87c4-45613e982e6b/Models/trained_models/ShuDianTongDao/20210531_cascade_rcnn_x101_64x4d_fpn_old_10cls/epoch_1.pth'
    workflow = [('train', 1)]
The v2 config is:
    model = dict(
        type='CascadeRCNN',
        pretrained='/media/senter/0a0d2450-80d0-4ab4-87c4-45613e982e6b/Models/pretrain_models/open-mmlab/resnext101_64x4d-ee2c6f71.pth',
        backbone=dict(
            type='ResNeXt',
            depth=101,
            num_stages=4,
            out_indices=(0, 1, 2, 3),
            frozen_stages=1,
            style='pytorch',
            groups=64,
            base_width=4),
        neck=dict(
            type='FPN',
            in_channels=[256, 512, 1024, 2048],
            out_channels=256,
            num_outs=5),
        rpn_head=dict(
            type='RPNHead',
            in_channels=256,
            feat_channels=256,
            anchor_generator=dict(
                type='AnchorGenerator',
                scales=[8, 16, 32, 64],
                ratios=[0.5, 1.0, 2.0],
                strides=[4, 8, 16, 32, 64]),
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0.0, 0.0, 0.0, 0.0],
                target_stds=[1.0, 1.0, 1.0, 1.0]),
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
            loss_bbox=dict(
                type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)),
        roi_head=dict(
            type='CascadeRoIHead',
            num_stages=3,
            stage_loss_weights=[1, 0.5, 0.25],
            bbox_roi_extractor=dict(
                type='SingleRoIExtractor',
                roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
                out_channels=256,
                featmap_strides=[4, 8, 16, 32]),
            bbox_head=[
                dict(
                    type='Shared2FCBBoxHead',
                    in_channels=256,
                    fc_out_channels=1024,
                    roi_feat_size=7,
                    num_classes=10,
                    bbox_coder=dict(
                        type='DeltaXYWHBBoxCoder',
                        target_means=[0.0, 0.0, 0.0, 0.0],
                        target_stds=[0.1, 0.1, 0.2, 0.2]),
                    reg_class_agnostic=True,
                    loss_cls=dict(
                        type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
                    loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
                dict(
                    type='Shared2FCBBoxHead',
                    in_channels=256,
                    fc_out_channels=1024,
                    roi_feat_size=7,
                    num_classes=10,
                    bbox_coder=dict(
                        type='DeltaXYWHBBoxCoder',
                        target_means=[0.0, 0.0, 0.0, 0.0],
                        target_stds=[0.05, 0.05, 0.1, 0.1]),
                    reg_class_agnostic=True,
                    loss_cls=dict(
                        type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
                    loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
                dict(
                    type='Shared2FCBBoxHead',
                    in_channels=256,
                    fc_out_channels=1024,
                    roi_feat_size=7,
                    num_classes=10,
                    bbox_coder=dict(
                        type='DeltaXYWHBBoxCoder',
                        target_means=[0.0, 0.0, 0.0, 0.0],
                        target_stds=[0.033, 0.033, 0.067, 0.067]),
                    reg_class_agnostic=True,
                    loss_cls=dict(
                        type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
                    loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))
            ]))
    train_cfg = dict(
        rpn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.3,
                min_pos_iou=0.3,
                match_low_quality=True,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=256,
                pos_fraction=0.5,
                neg_pos_ub=-1,
                add_gt_as_proposals=False),
            allowed_border=0,
            pos_weight=-1,
            debug=False),
        rpn_proposal=dict(
            nms_across_levels=False,
            nms_pre=2000,
            nms_post=2000,
            max_per_img=2000,
            nms=dict(type='nms', iou_threshold=0.7),
            min_bbox_size=0),
        rcnn=[
            dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.4,
                    neg_iou_thr=0.4,
                    min_pos_iou=0.4,
                    match_low_quality=False,
                    ignore_iof_thr=-1),
                sampler=dict(
                    type='OHEMSampler',
                    num=512,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False),
            dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.5,
                    neg_iou_thr=0.5,
                    min_pos_iou=0.5,
                    match_low_quality=False,
                    ignore_iof_thr=-1),
                sampler=dict(
                    type='OHEMSampler',
                    num=512,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False),
            dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.6,
                    neg_iou_thr=0.6,
                    min_pos_iou=0.6,
                    match_low_quality=False,
                    ignore_iof_thr=-1),
                sampler=dict(
                    type='OHEMSampler',
                    num=512,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False)
        ],
        stage_loss_weights=[1, 0.5, 0.25])
    test_cfg = dict(
        rpn=dict(
            nms_across_levels=False,
            nms_pre=1000,
            nms_post=1000,
            max_per_img=1000,
            nms=dict(type='nms', iou_threshold=0.7),
            min_bbox_size=0),
        rcnn=dict(
            score_thr=0.05,
            nms=dict(type='soft_nms', iou_threshold=0.5, min_score=0.05),
            max_per_img=100),
        keep_all_stages=False)
    dataset_type = 'VOCDataset'
    data_root = '/media/senter'
    img_norm_cfg = dict(
        mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
    train_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
        dict(
            type='Resize',
            img_scale=[(1000, 800), (1600, 1200)],
            #multiscale_mode='range',
            keep_ratio=True),
        dict(type='RandomFlip', flip_ratio=0.5),
        dict(
            type='Normalize',
            mean=[123.675, 116.28, 103.53],
            std=[58.395, 57.12, 57.375],
            to_rgb=True),
        dict(type='Pad', size_divisor=32),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
    ]
    test_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(
            type='MultiScaleFlipAug',
            img_scale=(1600, 1200),
            flip=False,
            transforms=[
                dict(type='Resize', keep_ratio=True),
                dict(type='RandomFlip'),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='Pad', size_divisor=32),
                dict(type='ImageToTensor', keys=['img']),
                dict(type='Collect', keys=['img'])
            ])
    ]
    data = dict(
        samples_per_gpu=3,
        workers_per_gpu=16,
        train=dict(
            type='VOCDataset',
            ann_file='/media/senter/disk2/ShuDianTongDao/00_TrainData/imagesets/20210608_train_.txt',
            img_prefix='/media/senter',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(type='LoadAnnotations', with_bbox=True),
                dict(
                    type='Resize',
                    img_scale=[(1000, 800), (1600, 1200)],
                    #multiscale_mode='range',
                    keep_ratio=True),
                dict(type='RandomFlip', flip_ratio=0.5),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='Pad', size_divisor=32),
                dict(type='DefaultFormatBundle'),
                dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
            ]),
        val=dict(
            type='VOCDataset',
            ann_file='/media/senter/0a0d2450-80d0-4ab4-87c4-45613e982e6b/ShuDianYinHuanDataset/00_DatasetComplete/01TrainTestDataset/EvaluateDataset20200912-useless/FillNiaoHai16781/images.txt',
            img_prefix='/media/senter',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(
                    type='MultiScaleFlipAug',
                    img_scale=(1600, 1200),
                    flip=False,
                    transforms=[
                        dict(type='Resize', keep_ratio=True),
                        dict(type='RandomFlip'),
                        dict(
                            type='Normalize',
                            mean=[123.675, 116.28, 103.53],
                            std=[58.395, 57.12, 57.375],
                            to_rgb=True),
                        dict(type='Pad', size_divisor=32),
                        dict(type='ImageToTensor', keys=['img']),
                        dict(type='Collect', keys=['img'])
                    ])
            ]),
        test=dict(
            type='VOCDataset',
            ann_file='home/bj/mmdetection_202105/images2.txt',
            img_prefix='/media/senter',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(
                    type='MultiScaleFlipAug',
                    img_scale=(1600, 1200),
                    flip=False,
                    transforms=[
                        dict(type='Resize', keep_ratio=True),
                        dict(type='RandomFlip'),
                        dict(
                            type='Normalize',
                            mean=[123.675, 116.28, 103.53],
                            std=[58.395, 57.12, 57.375],
                            to_rgb=True),
                        dict(type='Pad', size_divisor=32),
                        dict(type='ImageToTensor', keys=['img']),
                        dict(type='Collect', keys=['img'])
                    ])
            ]))
    evaluation = dict(interval=1, metric='mAP')
    optimizer = dict(type='SGD', lr=0.0001, momentum=0.9, weight_decay=0.0001)
    optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
    lr_config = dict(
        policy='step',
        warmup='linear',
        warmup_iters=500,
        warmup_ratio=0.3333333333333333,
        step=[10, 15])
    runner = dict(type='EpochBasedRunner', max_epochs=30)
    checkpoint_config = dict(interval=1)
    log_config = dict(interval=100, hooks=[dict(type='TextLoggerHook')])
    custom_hooks = [dict(type='NumClassCheckHook')]
    dist_params = dict(backend='nccl')
    log_level = 'INFO'
    load_from = "2020_mm_v44.pth"
    resume_from = None  #"2020_mm_v44.pth"
    workflow = [('train', 1)]
    work_dir = './work_dirs/jxgs_from_v44'
    gpu_ids = range(0, 4)
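As a rough aid for reading the two log lines quoted in this report, the loss fields can be compared term by term. The sketch below uses only the Python standard library; the helper names `parse_metrics` and `loss_ratios` are made up for illustration, and the two strings are the loss portions of the v2 and v1 log lines copied from the report.

```python
import re

# Loss portions of the first logged iteration, copied from the report.
V2_LOG = ("loss_rpn_cls: 0.0145, loss_rpn_bbox: 0.0251, "
          "s0.loss_cls: 4.3144, s0.acc: 45.0299, s0.loss_bbox: 0.1248, "
          "s1.loss_cls: 2.4995, s1.acc: 35.3806, s1.loss_bbox: 0.2038, "
          "s2.loss_cls: 1.3757, s2.acc: 26.1961, s2.loss_bbox: 0.1660, "
          "loss: 8.7236, grad_norm: 40.1798")
V1_LOG = ("loss_rpn_cls: 0.0119, loss_rpn_bbox: 0.0203, "
          "s0.loss_cls: 0.1438, s0.acc: 94.4299, s0.loss_bbox: 0.0904, "
          "s1.loss_cls: 0.0782, s1.acc: 93.8354, s1.loss_bbox: 0.1670, "
          "s2.loss_cls: 0.0401, s2.acc: 93.6331, s2.loss_bbox: 0.1367, "
          "loss: 0.6883")


def parse_metrics(fragment):
    """Turn 'name: value' pairs from a log fragment into a dict of floats."""
    pairs = re.findall(r"([\w.]+): (\d+\.\d+)", fragment)
    return {name: float(value) for name, value in pairs}


def loss_ratios(new, old):
    """Ratio new/old for every loss field that appears in both dicts."""
    return {k: new[k] / old[k] for k in old if "loss" in k and k in new}


v2 = parse_metrics(V2_LOG)
v1 = parse_metrics(V1_LOG)
ratios = loss_ratios(v2, v1)

# Print the fields that changed the most first.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:15s} v1={v1[name]:.4f}  v2={v2[name]:.4f}  ratio={ratio:.1f}x")
```

On these numbers the three `sN.loss_cls` terms are roughly 30 times larger in v2, while the RPN and box-regression losses stay within about 1.4x. That suggests looking first at the classification branch of the converted checkpoint (for example, the `num_classes=11` vs `num_classes=10` setting in the two configs). This is a hint from the logs, not a confirmed diagnosis.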
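Hello, I have a question. I would like to use TTFNet's TTFHead in my own model. How should I go about it? mmdet is already installed in my virtual environment. Do I need to copy the mmdet folder into my own project, and if so, where exactly should it go and how should I call it? Any advice would be appreciated. Thank you very much.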