open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0

when i use type='PhotoMetricDistortion' for Data_augmentation something unusual happened! #3817

Closed Seiano closed 4 years ago

Seiano commented 4 years ago

Hi author, thank you for your project! When I use type='PhotoMetricDistortion' for data augmentation while fine-tuning on my data, something unusual happens:

2020-09-22 23:29:57,819 - mmdet - INFO - Saving checkpoint at 3 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 8/8, 4.9 task/s, elapsed: 2s, ETA: 0s1 num_scales 1 num_scales 1 num_scales 1 num_scales
2020-09-22 23:30:03,467 - mmdet - INFO -
+----------+-----+------+--------+-------+
| class    | gts | dets | recall | ap    |
+----------+-----+------+--------+-------+
| red_bull | 30  | 529  | 0.967  | 0.848 |
| milk_tea | 32  | 136  | 1.000  | 0.899 |
| coca     | 32  | 74   | 0.938  | 0.816 |
| sprite   | 21  | 61   | 0.810  | 0.583 |
+----------+-----+------+--------+-------+
| mAP      |     |      |        | 0.786 |
+----------+-----+------+--------+-------+

2020-09-22 23:30:41,105 - mmdet - INFO - Saving checkpoint at 4 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 8/8, 5.5 task/s, elapsed: 1s, ETA: 0s1 num_scales 1 num_scales 1 num_scales 1 num_scales
2020-09-22 23:30:46,442 - mmdet - INFO -
+----------+-----+------+--------+-------+
| class    | gts | dets | recall | ap    |
+----------+-----+------+--------+-------+
| red_bull | 30  | 0    | 0.000  | 0.000 |
| milk_tea | 32  | 0    | 0.000  | 0.000 |
| coca     | 32  | 0    | 0.000  | 0.000 |
| sprite   | 21  | 0    | 0.000  | 0.000 |
+----------+-----+------+--------+-------+
| mAP      |     |      |        | 0.000 |
+----------+-----+------+--------+-------+
...........
2020-09-22 23:33:27,275 - mmdet - INFO - Saving checkpoint at 8 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 8/8, 5.2 task/s, elapsed: 2s, ETA: 0s1 num_scales 1 num_scales 1 num_scales 1 num_scales
2020-09-22 23:33:33,603 - mmdet - INFO -
+----------+-----+------+--------+-------+
| class    | gts | dets | recall | ap    |
+----------+-----+------+--------+-------+
| red_bull | 30  | 0    | 0.000  | 0.000 |
| milk_tea | 32  | 0    | 0.000  | 0.000 |
| coca     | 32  | 0    | 0.000  | 0.000 |
| sprite   | 21  | 0    | 0.000  | 0.000 |
+----------+-----+------+--------+-------+
| mAP      |     |      |        | 0.000 |
+----------+-----+------+--------+-------+

I don't know why! From epoch 4 onward, every value is zero!

My config is this:

python tools/train.py configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py

2020-09-22 23:27:40,526 - mmdet - INFO - Environment info:

sys.platform: linux
Python: 3.8.5 | packaged by conda-forge | (default, Aug 29 2020, 01:22:49) [GCC 7.5.0]
CUDA available: True
CUDA_HOME: :/usr/local/cuda
GPU 0: GeForce GTX 1060
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.6.0
PyTorch compiling details: PyTorch built with:

TorchVision: 0.7.0
OpenCV: 4.4.0
MMCV: 1.1.1
MMDetection: 2.3.0+f93c00f
MMDetection Compiler: GCC 7.3
MMDetection CUDA Compiler: 10.1

2020-09-22 23:27:40,791 - mmdet - INFO - Distributed training: False
2020-09-22 23:27:41,053 - mmdet - INFO - Config:
model = dict(
    type='FasterRCNN',
    pretrained='torchvision://resnet50',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
    roi_head=dict(
        type='StandardRoIHead',
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=dict(
            type='Shared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=4,
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0.0, 0.0, 0.0, 0.0],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='L1Loss', loss_weight=1.0))))
train_cfg = dict(
    rpn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.7,
            neg_iou_thr=0.3,
            min_pos_iou=0.3,
            match_low_quality=True,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=256,
            pos_fraction=0.5,
            neg_pos_ub=-1,
            add_gt_as_proposals=False),
        allowed_border=-1,
        pos_weight=-1,
        debug=False),
    rpn_proposal=dict(
        nms_across_levels=False,
        nms_pre=2000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.5,
            min_pos_iou=0.5,
            match_low_quality=False,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=512,
            pos_fraction=0.25,
            neg_pos_ub=-1,
            add_gt_as_proposals=True),
        pos_weight=-1,
        debug=False))
test_cfg = dict(
    rpn=dict(
        nms_across_levels=False,
        nms_pre=1000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        score_thr=0.05,
        nms=dict(type='nms', iou_threshold=0.5),
        max_per_img=100))
dataset_type = 'VOCDataset'
data_root = 'data/VOCdevkit/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile', to_float32=True),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1280, 720), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(
        type='PhotoMetricDistortion',
        brightness_delta=32,
        contrast_range=(0.5, 1.5),
        saturation_range=(0.5, 1.5),
        hue_delta=18),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(type='LoadImageFromFile', to_float32=True),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1280, 720),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=0,
    train=dict(
        type='RepeatDataset',
        times=3,
        dataset=dict(
            type='VOCDataset',
            ann_file=['data/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt'],
            img_prefix=['data/VOCdevkit/VOC2007/'],
            pipeline=[
                dict(type='LoadImageFromFile', to_float32=True),
                dict(type='LoadAnnotations', with_bbox=True),
                dict(type='Resize', img_scale=(1280, 720), keep_ratio=True),
                dict(type='RandomFlip', flip_ratio=0.5),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='Pad', size_divisor=32),
                dict(
                    type='PhotoMetricDistortion',
                    brightness_delta=32,
                    contrast_range=(0.5, 1.5),
                    saturation_range=(0.5, 1.5),
                    hue_delta=18),
                dict(type='DefaultFormatBundle'),
                dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
            ])),
    val=dict(
        type='VOCDataset',
        ann_file='data/VOCdevkit/VOC2007/ImageSets/Main/test.txt',
        img_prefix='data/VOCdevkit/VOC2007/',
        pipeline=[
            dict(type='LoadImageFromFile', to_float32=True),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1280, 720),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]),
    test=dict(
        type='VOCDataset',
        ann_file='data/VOCdevkit/VOC2007/ImageSets/Main/test.txt',
        img_prefix='data/VOCdevkit/VOC2007/',
        pipeline=[
            dict(type='LoadImageFromFile', to_float32=True),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1280, 720),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='Collect', keys=['img'])
                ])
        ]))
evaluation = dict(interval=1, metric='mAP')
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[8, 11])
total_epochs = 12
checkpoint_config = dict(interval=1)
log_config = dict(
    interval=5,
    hooks=[dict(type='TextLoggerHook'),
           dict(type='TensorboardLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
work_dir = './work_dirs3/faster_rcnn_r50_fpn_1x_coco'
gpu_ids = range(0, 1)

2020-09-22 23:27:41,406 - mmdet - INFO - load model from: torchvision://resnet50
2020-09-22 23:27:42,428 - mmdet - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: fc.weight, fc.bias

Looking forward to your reply!

yhcao6 commented 4 years ago

If I understand correctly, you are using the PhotoMetricDistortion augmentation to train on your dataset. The first few epochs seem fine and yield nonzero mAP, but later the mAP becomes zero. Here are some questions:

  1. Have you tried more than once? That is, does this happen 100% of the time?
  2. If you remove PhotoMetricDistortion, does it still happen?

Seiano commented 4 years ago

1. With the PhotoMetricDistortion augmentation, the abnormal result occurs 100% of the time. 2. With PhotoMetricDistortion removed, training is normal.

Seiano commented 4 years ago

Yes, as you said. When I use this method to augment my dataset, it always produces this abnormal result, and always at epoch 4, no matter which model I fine-tune. I tried Faster R-CNN and reppoints_minmax_r50_fpn_gn-neck+head_1x_coco. My data is in XML (VOC2007) format. I don't know whether this has anything to do with the data.

yhcao6 commented 4 years ago

According to https://github.com/open-mmlab/mmdetection/blob/master/configs/ssd/ssd300_coco.py, PhotoMetricDistortion should be placed at the beginning of the training pipeline. Please put PhotoMetricDistortion between LoadAnnotations and Resize and try again.
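One plausible reason the order matters (an assumption, not something confirmed in this thread): parameters such as brightness_delta=32 are sized for raw pixel values in roughly [0, 255], whereas after Normalize the values span only a few units. The NumPy sketch below uses a simplified, fixed brightness shift as a stand-in for the real transform (it is not the library's implementation) and compares both orderings, using the mean and std from the config above.

```python
import numpy as np

# Normalization constants copied from the config in this issue.
MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
STD = np.array([58.395, 57.12, 57.375], dtype=np.float32)
BRIGHTNESS_DELTA = np.float32(32.0)


def normalize(img):
    """Per-channel (img - mean) / std, as a Normalize step would do."""
    return (img - MEAN) / STD


def brightness_shift(img, delta):
    """Simplified stand-in for one distortion step: add a constant offset."""
    return img + delta


rng = np.random.default_rng(0)
raw = rng.uniform(0, 255, size=(4, 4, 3)).astype(np.float32)  # fake image
baseline = normalize(raw)

# Order A: distort raw pixels, then normalize (distortion placed early).
early = normalize(brightness_shift(raw, BRIGHTNESS_DELTA))
# Order B: normalize, then distort (distortion placed after Normalize).
late = brightness_shift(normalize(raw), BRIGHTNESS_DELTA)

shift_early = float(np.abs(early - baseline).max())  # about 0.56
shift_late = float(np.abs(late - baseline).max())    # about 32
print(shift_early, shift_late)
```

With the distortion placed early, the network input moves by well under one standard deviation. Placed after Normalize, the same delta moves the input by about 32 normalized units, far outside the range the pretrained backbone was trained on.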

Seiano commented 4 years ago

2020-09-23 11:03:07,493 - mmdet - INFO - Saving checkpoint at 1 epochs
[ ] 0/8, elapsed: 0s, ETA:
/home/zhangzhenbo/mmdetection/mmdet/core/post_processing/bbox_nms.py:52: UserWarning: This overload of nonzero is deprecated:
    nonzero()
Consider using one of the following signatures instead:
    nonzero(*, bool as_tuple) (Triggered internally at /opt/conda/conda-bld/pytorch_1595629411241/work/torch/csrc/utils/python_arg_parser.cpp:766.)
  labels = valid_mask.nonzero()[:, 1]
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 8/8, 5.2 task/s, elapsed: 2s, ETA: 0s1 num_scales 1 num_scales 1 num_scales 1 num_scales
2020-09-23 11:03:13,051 - mmdet - INFO -
+----------+-----+------+--------+-------+
| class    | gts | dets | recall | ap    |
+----------+-----+------+--------+-------+
| red_bull | 30  | 0    | 0.000  | 0.000 |
| milk_tea | 32  | 0    | 0.000  | 0.000 |
| coca     | 32  | 0    | 0.000  | 0.000 |
| sprite   | 21  | 0    | 0.000  | 0.000 |
+----------+-----+------+--------+-------+
| mAP      |     |      |        | 0.000 |
+----------+-----+------+--------+-------+
...............

2020-09-23 11:05:12,209 - mmdet - INFO - Saving checkpoint at 4 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 8/8, 5.4 task/s, elapsed: 1s, ETA: 0s1 num_scales 1 num_scales 1 num_scales 1 num_scales
2020-09-23 11:05:17,868 - mmdet - INFO -
+----------+-----+------+--------+-------+
| class    | gts | dets | recall | ap    |
+----------+-----+------+--------+-------+
| red_bull | 30  | 0    | 0.000  | 0.000 |
| milk_tea | 32  | 0    | 0.000  | 0.000 |
| coca     | 32  | 0    | 0.000  | 0.000 |
| sprite   | 21  | 0    | 0.000  | 0.000 |
+----------+-----+------+--------+-------+
| mAP      |     |      |        | 0.000 |
+----------+-----+------+--------+-------+

From epoch 1 to epoch 4, every value is 0.

Seiano commented 4 years ago

When I don't use PhotoMetricDistortion:

2020-09-23 11:11:04,795 - mmdet - INFO - Saving checkpoint at 1 epochs
[ ] 0/8, elapsed: 0s, ETA:
/home/zhangzhenbo/mmdetection/mmdet/core/post_processing/bbox_nms.py:52: UserWarning: This overload of nonzero is deprecated:
    nonzero()
Consider using one of the following signatures instead:
    nonzero(*, bool as_tuple) (Triggered internally at /opt/conda/conda-bld/pytorch_1595629411241/work/torch/csrc/utils/python_arg_parser.cpp:766.)
  labels = valid_mask.nonzero()[:, 1]
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 8/8, 5.1 task/s, elapsed: 2s, ETA: 0s1 num_scales 1 num_scales 1 num_scales 1 num_scales
2020-09-23 11:11:10,436 - mmdet - INFO -
+----------+-----+------+--------+-------+
| class    | gts | dets | recall | ap    |
+----------+-----+------+--------+-------+
| red_bull | 30  | 281  | 0.867  | 0.328 |
| milk_tea | 32  | 313  | 1.000  | 0.741 |
| coca     | 32  | 72   | 0.031  | 0.001 |
| sprite   | 21  | 134  | 0.190  | 0.006 |
+----------+-----+------+--------+-------+
| mAP      |     |      |        | 0.269 |
+----------+-----+------+--------+-------+

2020-09-23 11:11:50,759 - mmdet - INFO - Saving checkpoint at 2 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 8/8, 4.7 task/s, elapsed: 2s, ETA: 0s1 num_scales 1 num_scales 1 num_scales 1 num_scales
2020-09-23 11:11:56,575 - mmdet - INFO -
+----------+-----+------+--------+-------+
| class    | gts | dets | recall | ap    |
+----------+-----+------+--------+-------+
| red_bull | 30  | 159  | 0.967  | 0.897 |
| milk_tea | 32  | 137  | 1.000  | 0.986 |
| coca     | 32  | 153  | 0.812  | 0.659 |
| sprite   | 21  | 182  | 1.000  | 0.853 |
+----------+-----+------+--------+-------+
| mAP      |     |      |        | 0.849 |
+----------+-----+------+--------+-------+

2020-09-23 11:12:37,251 - mmdet - INFO - Saving checkpoint at 3 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 8/8, 4.9 task/s, elapsed: 2s, ETA: 0s1 num_scales 1 num_scales 1 num_scales 1 num_scales
2020-09-23 11:12:43,904 - mmdet - INFO -
+----------+-----+------+--------+-------+
| class    | gts | dets | recall | ap    |
+----------+-----+------+--------+-------+
| red_bull | 30  | 41   | 0.967  | 0.909 |
| milk_tea | 32  | 54   | 1.000  | 1.000 |
| coca     | 32  | 93   | 0.938  | 0.903 |
| sprite   | 21  | 104  | 1.000  | 0.938 |
+----------+-----+------+--------+-------+
| mAP      |     |      |        | 0.938 |
+----------+-----+------+--------+-------+

2020-09-23 11:13:24,358 - mmdet - INFO - Saving checkpoint at 4 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 8/8, 4.9 task/s, elapsed: 2s, ETA: 0s1 num_scales 1 num_scales 1 num_scales 1 num_scales
2020-09-23 11:13:30,014 - mmdet - INFO -
+----------+-----+------+--------+-------+
| class    | gts | dets | recall | ap    |
+----------+-----+------+--------+-------+
| red_bull | 30  | 37   | 0.967  | 0.909 |
| milk_tea | 32  | 40   | 1.000  | 1.000 |
| coca     | 32  | 126  | 0.969  | 0.909 |
| sprite   | 21  | 120  | 0.952  | 0.901 |
+----------+-----+------+--------+-------+
| mAP      |     |      |        | 0.930 |
+----------+-----+------+--------+-------+

Seiano commented 4 years ago

My config is this:

train_pipeline = [
    dict(type='LoadImageFromFile', to_float32=True),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='PhotoMetricDistortion'),
    dict(type='Resize', img_scale=(1280, 720), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),

yhcao6 commented 4 years ago

Can you try training on the same images with SSD?

Seiano commented 4 years ago

No! What's wrong?

Seiano commented 4 years ago

I remember that I used SSD a few days ago, but the data was not exactly the same.

Seiano commented 4 years ago

When I modify these parameters, I get the same image: same brightness, same contrast, same saturation. I don't know why.

Parameter dict:

dict(
    type='PhotoMetricDistortion',
    brightness_delta=32,  # 1,8, 10
    contrast_range=(0.5, 1.5),  # 1
    saturation_range=(0.5, 1.5),  # 1
    hue_delta=18),  # 2

Getting the image in transforms.py:

def __init__(self,
             brightness_delta=5,
             contrast_range=(0.5,1),
             saturation_range=(0.5, 1),
             hue_delta=18):
    self.brightness_delta = brightness_delta
    self.contrast_lower, self.contrast_upper = contrast_range
    self.saturation_lower, self.saturation_upper = saturation_range
    self.hue_delta = hue_delta

def __call__(self, results):
    """Call function to perform photometric distortion on images.

    Args:
        results (dict): Result dict from loading pipeline.

    Returns:
        dict: Result dict with images distorted.
    """

    if 'img_fields' in results:
        assert results['img_fields'] == ['img'], \
            'Only single img_fields is allowed'
    img = results['img']
    cv2.imshow('img', img)

I
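A possible explanation for the images looking identical, offered as an assumption rather than a confirmed diagnosis: in the excerpt above, cv2.imshow is called right after img = results['img'], before any distortion step shown, so the displayed image would not reflect the parameters. In addition, OpenCV's imshow documentation says 32-bit float images are multiplied by 255 for display, so a float32 image holding 0-255 values tends to look washed out. A minimal sketch of converting such an image to uint8 before viewing it (to_displayable is a hypothetical helper name, and the sample array is made-up data):

```python
import numpy as np


def to_displayable(img):
    """Clip a float image in the 0-255 range and cast it to uint8.

    Hypothetical helper: values outside [0, 255] are clipped, and the
    fractional part is dropped by the integer cast.
    """
    return np.clip(img, 0, 255).astype(np.uint8)


# Made-up stand-in for results['img'] loaded with to_float32=True.
img = np.array([[[0.0, 127.5, 255.0], [300.0, -20.0, 64.2]]],
               dtype=np.float32)
view = to_displayable(img)
print(view.dtype, view.tolist())
```

The resulting array can then be passed to cv2.imshow followed by cv2.waitKey(0), or written to disk with cv2.imwrite, ideally after the distortion steps have run so that the effect of each parameter is visible.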

Seiano commented 4 years ago

@yhcao6

yhcao6 commented 4 years ago

We have only tested this transform with SSD; applying it to other models may cause training to diverge. However, I applied it to Faster R-CNN and trained on COCO, and training was normal at least at the beginning. Here is my config:

_base_ = [
    '../_base_/models/faster_rcnn_r50_fpn.py',
    '../_base_/datasets/coco_detection.py',
    '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
]

img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile', to_float32=True),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='PhotoMetricDistortion',
        brightness_delta=32,
        contrast_range=(0.5, 1.5),
        saturation_range=(0.5, 1.5),
        hue_delta=18),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]

data = dict(
    samples_per_gpu=2, workers_per_gpu=2, train=dict(pipeline=train_pipeline))
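The ordering difference between this config and the one originally posted can be checked mechanically. The sketch below is plain Python over lists of dicts and does not import mmdet. distortion_before_normalize is a hypothetical helper, and it only inspects top-level pipeline steps (it does not descend into wrappers such as MultiScaleFlipAug).

```python
def distortion_before_normalize(pipeline):
    """Return True if PhotoMetricDistortion precedes Normalize.

    Also returns True when either step is absent, since there is then
    no ordering to violate. Only top-level steps are inspected.
    """
    types = [step['type'] for step in pipeline]
    if 'PhotoMetricDistortion' not in types or 'Normalize' not in types:
        return True
    return types.index('PhotoMetricDistortion') < types.index('Normalize')


# Step order from the pipeline originally posted in this issue.
original = [
    dict(type='LoadImageFromFile', to_float32=True),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1280, 720), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize'),
    dict(type='Pad', size_divisor=32),
    dict(type='PhotoMetricDistortion'),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]

# Step order from the config above.
suggested = [
    dict(type='LoadImageFromFile', to_float32=True),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='PhotoMetricDistortion'),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize'),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]

original_ok = distortion_before_normalize(original)
suggested_ok = distortion_before_normalize(suggested)
print(original_ok, suggested_ok)
```

A check like this could be run on a loaded config's train pipeline before starting a long training job. It only validates ordering and says nothing about whether the distortion strengths suit a given model.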