open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0

The model seems to be learning nothing during training #4072

Closed JonathanAndradeSilva closed 3 years ago

JonathanAndradeSilva commented 3 years ago

Hi every one,

I'm trying to figure out what is going on with training, because the model is not learning anything: within the first epoch the classification accuracy jumps from 0 to 100. I have also seen some recent updates regarding the COCO dataset tools, but I don't know whether they affect using MMDetection.

I appreciate your attention,

My environment info (running in Colab) and a data-loader example follow:

2020-11-06 00:13:46,700 - mmdet - INFO - Environment info:

sys.platform: linux Python: 3.6.9 (default, Oct 8 2020, 12:12:24) [GCC 8.4.0] CUDA available: True GPU 0: Tesla T4 CUDA_HOME: /usr/local/cuda NVCC: Cuda compilation tools, release 10.1, V10.1.243 GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 PyTorch: 1.7.0+cu101 PyTorch compiling details: PyTorch built with:

TorchVision: 0.8.1+cu101 OpenCV: 4.4.0 MMCV: 1.1.6 MMCV Compiler: GCC 7.5 MMCV CUDA Compiler: 10.1 MMDetection: 2.6.0+

2020-11-06 00:13:47,129 - mmdet - INFO - Distributed training: False 2020-11-06 00:13:47,547 - mmdet - INFO - Config: model = dict( type='CascadeRCNN', pretrained='torchvision://resnet50', backbone=dict( type='ResNet', depth=50, num_stages=4, out_indices=(0, 1, 2, 3), frozen_stages=1, norm_cfg=dict(type='BN', requires_grad=True), norm_eval=True, style='pytorch'), neck=dict( type='FPN', in_channels=[256, 512, 1024, 2048], out_channels=256, num_outs=5), rpn_head=dict( type='RPNHead', in_channels=256, feat_channels=256, anchor_generator=dict( type='AnchorGenerator', scales=[8], ratios=[0.5, 1.0, 2.0], strides=[4, 8, 16, 32, 64]), bbox_coder=dict( type='DeltaXYWHBBoxCoder', target_means=[0.0, 0.0, 0.0, 0.0], target_stds=[1.0, 1.0, 1.0, 1.0]), loss_cls=dict( type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0), loss_bbox=dict( type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)), roi_head=dict( type='CascadeRoIHead', num_stages=3, stage_loss_weights=[1, 0.5, 0.25], bbox_roi_extractor=dict( type='SingleRoIExtractor', roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0), out_channels=256, featmap_strides=[4, 8, 16, 32]), bbox_head=[ dict( type='Shared2FCBBoxHead', in_channels=256, fc_out_channels=1024, roi_feat_size=7, num_classes=1, bbox_coder=dict( type='DeltaXYWHBBoxCoder', target_means=[0.0, 0.0, 0.0, 0.0], target_stds=[0.1, 0.1, 0.2, 0.2]), reg_class_agnostic=True, loss_cls=dict( type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0), loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)), dict( type='Shared2FCBBoxHead', in_channels=256, fc_out_channels=1024, roi_feat_size=7, num_classes=1, bbox_coder=dict( type='DeltaXYWHBBoxCoder', target_means=[0.0, 0.0, 0.0, 0.0], target_stds=[0.05, 0.05, 0.1, 0.1]), reg_class_agnostic=True, loss_cls=dict( type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0), loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)), dict( type='Shared2FCBBoxHead', in_channels=256, 
fc_out_channels=1024, roi_feat_size=7, num_classes=1, bbox_coder=dict( type='DeltaXYWHBBoxCoder', target_means=[0.0, 0.0, 0.0, 0.0], target_stds=[0.033, 0.033, 0.067, 0.067]), reg_class_agnostic=True, loss_cls=dict( type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0), loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)) ])) train_cfg = dict( rpn=dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.7, neg_iou_thr=0.3, min_pos_iou=0.3, match_low_quality=True, ignore_iof_thr=-1), sampler=dict( type='RandomSampler', num=256, pos_fraction=0.5, neg_pos_ub=-1, add_gt_as_proposals=False), allowed_border=0, pos_weight=-1, debug=False), rpn_proposal=dict( nms_across_levels=False, nms_pre=2000, nms_post=2000, max_num=2000, nms_thr=0.7, min_bbox_size=0), rcnn=[ dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.5, neg_iou_thr=0.5, min_pos_iou=0.5, match_low_quality=False, ignore_iof_thr=-1), sampler=dict( type='RandomSampler', num=512, pos_fraction=0.25, neg_pos_ub=-1, add_gt_as_proposals=True), pos_weight=-1, debug=False), dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.6, neg_iou_thr=0.6, min_pos_iou=0.6, match_low_quality=False, ignore_iof_thr=-1), sampler=dict( type='RandomSampler', num=512, pos_fraction=0.25, neg_pos_ub=-1, add_gt_as_proposals=True), pos_weight=-1, debug=False), dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.7, neg_iou_thr=0.7, min_pos_iou=0.7, match_low_quality=False, ignore_iof_thr=-1), sampler=dict( type='RandomSampler', num=512, pos_fraction=0.25, neg_pos_ub=-1, add_gt_as_proposals=True), pos_weight=-1, debug=False) ]) test_cfg = dict( rpn=dict( nms_across_levels=False, nms_pre=1000, nms_post=1000, max_num=1000, nms_thr=0.7, min_bbox_size=0), rcnn=dict( score_thr=0.05, nms=dict(type='nms', iou_threshold=0.5), max_per_img=100)) dataset_type = 'PVDataset' data_root = '/content/drive/My Drive/PV-Dataset/fold_1' img_norm_cfg = dict( mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], 
to_rgb=True) train_pipeline = [ dict(type='LoadImageFromFile'), dict(type='LoadAnnotations', with_bbox=True), dict(type='Resize', img_scale=(1333, 800), keep_ratio=True), dict(type='RandomFlip', flip_ratio=0.5), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='DefaultFormatBundle'), dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']) ] test_pipeline = [ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1333, 800), flip=False, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='ImageToTensor', keys=['img']), dict(type='Collect', keys=['img']) ]) ] data = dict( samples_per_gpu=2, workers_per_gpu=2, train=dict( type='PVDataset', ann_file='', img_prefix='train/rgb', pipeline=[ dict(type='LoadImageFromFile'), dict(type='LoadAnnotations', with_bbox=True), dict(type='Resize', img_scale=(1333, 800), keep_ratio=True), dict(type='RandomFlip', flip_ratio=0.5), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='DefaultFormatBundle'), dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']) ], data_root='/content/drive/My Drive/PV-Dataset/fold_1', classes=('PV', )), val=dict( type='PVDataset', ann_file='', img_prefix='validation/rgb', pipeline=[ dict(type='LoadImageFromFile'), dict(type='LoadAnnotations', with_bbox=True), dict(type='Resize', img_scale=(1333, 800), keep_ratio=True), dict(type='RandomFlip', flip_ratio=0.5), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='DefaultFormatBundle'), dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']) # The validate parameter of train.py is 
False ], data_root='/content/drive/My Drive/PV-Dataset/fold_1', classes=('PV', )), test=dict( type='PVDataset', ann_file='', img_prefix='test/rgb', pipeline=[ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1333, 800), flip=False, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=32), dict(type='ImageToTensor', keys=['img']), dict(type='Collect', keys=['img']) ]) ], data_root='/content/drive/My Drive/PV-Dataset/fold_1', classes=('PV', ))) evaluation = dict(interval=1, metric='mAP') optimizer = dict(type='SGD', lr=0.00125, momentum=0.9, weight_decay=0.0001) optimizer_config = dict(grad_clip=None) lr_config = dict( policy='step', warmup='linear', warmup_iters=500, warmup_ratio=0.001, step=[8, 11]) total_epochs = 12 checkpoint_config = dict(interval=6.0, create_symlink=False) log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')]) dist_params = dict(backend='nccl') log_level = 'INFO' load_from = '/content/drive/My Drive/AirplaneData/chekpoints/cascade_rcnn_r50_fpn_1x_coco_20200316-3dc56deb.pth' resume_from = None workflow = [('train', 2), ('val', 1)] classes = ('PV', ) work_dir = '/content/drive/My Drive/PV-Dataset/MModels/cascade_rcnn_r50_fpn_1x' seed = 0 gpu_ids = range(0, 1)

/content/drive/My Drive/PV-Dataset/fold_1/ IMGPREFIX /content/drive/My Drive/PV-Dataset/fold_1/train/rgb RGB: /content/drive/My Drive/PV-Dataset/fold_1/train/rgb LABELS: /content/drive/My Drive/PV-Dataset/fold_1/train/labels creating index... index created! /content/drive/My Drive/PV-Dataset/fold_1/ IMGPREFIX /content/drive/My Drive/PV-Dataset/fold_1/validation/rgb RGB: /content/drive/My Drive/PV-Dataset/fold_1/validation/rgb LABELS: /content/drive/My Drive/PV-Dataset/fold_1/validation/labels creating index... index created! 2020-11-06 00:13:48,988 - mmdet - INFO - load model from: torchvision://resnet50 2020-11-06 00:13:49,201 - mmdet - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: fc.weight, fc.bias

2020-11-06 00:13:53,788 - mmdet - INFO - load checkpoint from /content/drive/My Drive/AirplaneData/chekpoints/cascade_rcnn_r50_fpn_1x_coco_20200316-3dc56deb.pth 2020-11-06 00:13:54,186 - mmdet - WARNING - The model and loaded state dict do not match exactly

size mismatch for roi_head.bbox_head.0.fc_cls.weight: copying a param with shape torch.Size([81, 1024]) from checkpoint, the shape in current model is torch.Size([2, 1024]). size mismatch for roi_head.bbox_head.0.fc_cls.bias: copying a param with shape torch.Size([81]) from checkpoint, the shape in current model is torch.Size([2]). size mismatch for roi_head.bbox_head.1.fc_cls.weight: copying a param with shape torch.Size([81, 1024]) from checkpoint, the shape in current model is torch.Size([2, 1024]). size mismatch for roi_head.bbox_head.1.fc_cls.bias: copying a param with shape torch.Size([81]) from checkpoint, the shape in current model is torch.Size([2]). size mismatch for roi_head.bbox_head.2.fc_cls.weight: copying a param with shape torch.Size([81, 1024]) from checkpoint, the shape in current model is torch.Size([2, 1024]). size mismatch for roi_head.bbox_head.2.fc_cls.bias: copying a param with shape torch.Size([81]) from checkpoint, the shape in current model is torch.Size([2]). 
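These size-mismatch warnings are expected when fine-tuning a COCO checkpoint with a different number of classes: the COCO head has 80 classes plus one background logit (81), while this config's `num_classes=1` head has 2. The mismatched layers are simply skipped and left randomly initialized. A minimal sketch of why the shapes differ (layer sizes taken from the warning above):

```python
import torch

# COCO-pretrained classifier head: 80 classes + 1 background = 81 logits
coco_fc_cls = torch.nn.Linear(1024, 81)

# Fine-tuning head for the 1-class PV dataset: 1 class + background = 2 logits
pv_fc_cls = torch.nn.Linear(1024, 2)

# The weight shapes differ, so the checkpoint parameters cannot be copied;
# the checkpoint loader logs a size-mismatch warning and leaves these
# layers at their fresh initialization.
assert coco_fc_cls.weight.shape != pv_fc_cls.weight.shape
```

So the warnings themselves are harmless; only the classification heads start from scratch.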
2020-11-06 00:13:54,195 - mmdet - INFO - Start running, host: root@4f58aada6d87, work_dir: /content/drive/My Drive/PV-Dataset/MModels/cascade_rcnn_r50_fpn_1x 2020-11-06 00:13:54,197 - mmdet - INFO - workflow: [('train', 2), ('val', 1)], max: 12 epochs 2020-11-06 00:14:28,692 - mmdet - INFO - Epoch [1][50/128] lr: 1.236e-04, eta: 0:17:02, time: 0.688, data_time: 0.049, memory: 10406, loss_rpn_cls: 0.0562, loss_rpn_bbox: 0.0021, s0.loss_cls: 0.3336, s0.acc: 96.3496, s0.loss_bbox: 0.0000, s1.loss_cls: 0.2018, s1.acc: 87.0723, s1.loss_bbox: 0.0000, s2.loss_cls: 0.1115, s2.acc: 77.9980, s2.loss_bbox: 0.0000, loss: 0.7051 2020-11-06 00:14:54,822 - mmdet - INFO - Epoch [1][100/128] lr: 2.485e-04, eta: 0:14:29, time: 0.523, data_time: 0.007, memory: 10406, loss_rpn_cls: 0.0430, loss_rpn_bbox: 0.0030, s0.loss_cls: 0.0149, s0.acc: 100.0000, s0.loss_bbox: 0.0000, s1.loss_cls: 0.0124, s1.acc: 100.0000, s1.loss_bbox: 0.0000, s2.loss_cls: 0.0076, s2.acc: 100.0000, s2.loss_bbox: 0.0000, loss: 0.0810 2020-11-06 00:15:38,316 - mmdet - INFO - Epoch [2][50/128] lr: 4.433e-04, eta: 0:11:19, time: 0.572, data_time: 0.049, memory: 10406, loss_rpn_cls: 0.0197, loss_rpn_bbox: 0.0020, s0.loss_cls: 0.0021, s0.acc: 100.0000, s0.loss_bbox: 0.0000, s1.loss_cls: 0.0016, s1.acc: 100.0000, s1.loss_bbox: 0.0000, s2.loss_cls: 0.0012, s2.acc: 100.0000, s2.loss_bbox: 0.0000, loss: 0.0265 2020-11-06 00:16:04,480 - mmdet - INFO - Epoch [2][100/128] lr: 5.682e-04, eta: 0:11:01, time: 0.523, data_time: 0.007, memory: 10406, loss_rpn_cls: 0.0117, loss_rpn_bbox: 0.0017, s0.loss_cls: 0.0010, s0.acc: 100.0000, s0.loss_bbox: 0.0000, s1.loss_cls: 0.0008, s1.acc: 100.0000, s1.loss_bbox: 0.0000, s2.loss_cls: 0.0006, s2.acc: 100.0000, s2.loss_bbox: 0.0000, loss: 0.0157 2020-11-06 00:16:31,164 - mmdet - INFO - Exp name: myconfig.py 2020-11-06 00:16:31,165 - mmdet - INFO - Epoch(val) [2][44] loss_rpn_cls: 0.0214, loss_rpn_bbox: 0.0019, s0.loss_cls: 0.0005, s0.acc: 100.0000, s0.loss_bbox: 0.0000, s1.loss_cls: 
0.0004, s1.acc: 100.0000, s1.loss_bbox: 0.0000, s2.loss_cls: 0.0003, s2.acc: 100.0000, s2.loss_bbox: 0.0000, loss: 0.0244 2020-11-06 00:16:59,694 - mmdet - INFO - Epoch [3][50/128] lr: 7.630e-04, eta: 0:09:37, time: 0.568, data_time: 0.049, memory: 10406, loss_rpn_cls: 0.0103, loss_rpn_bbox: 0.0012, s0.loss_cls: 0.0003, s0.acc: 100.0000, s0.loss_bbox: 0.0000, s1.loss_cls: 0.0003, s1.acc: 100.0000, s1.loss_bbox: 0.0000, s2.loss_cls: 0.0002, s2.acc: 100.0000, s2.loss_bbox: 0.0000, loss: 0.0124 2020-11-06 00:17:25,909 - mmdet - INFO - Epoch [3][100/128] lr: 8.879e-04, eta: 0:09:23, time: 0.524, data_time: 0.007, memory: 10406, loss_rpn_cls: 0.0079, loss_rpn_bbox: 0.0014, s0.loss_cls: 0.0003, s0.acc: 100.0000, s0.loss_bbox: 0.0000, s1.loss_cls: 0.0002, s1.acc: 100.0000, s1.loss_bbox: 0.0000, s2.loss_cls: 0.0002, s2.acc: 100.0000, s2.loss_bbox: 0.0000, loss: 0.0099 2020-11-06 00:18:08,681 - mmdet - INFO - Epoch [4][50/128] lr: 1.083e-03, eta: 0:08:22, time: 0.563, data_time: 0.049, memory: 10406, loss_rpn_cls: 0.0075, loss_rpn_bbox: 0.0011, s0.loss_cls: 0.0002, s0.acc: 100.0000, s0.loss_bbox: 0.0000, s1.loss_cls: 0.0001, s1.acc: 100.0000, s1.loss_bbox: 0.0000, s2.loss_cls: 0.0001, s2.acc: 100.0000, s2.loss_bbox: 0.0000, loss: 0.0090 2020-11-06 00:18:34,728 - mmdet - INFO - Epoch [4][100/128] lr: 1.208e-03, eta: 0:08:07, time: 0.521, data_time: 0.007, memory: 10406, loss_rpn_cls: 0.0042, loss_rpn_bbox: 0.0008, s0.loss_cls: 0.0002, s0.acc: 100.0000, s0.loss_bbox: 0.0000, s1.loss_cls: 0.0001, s1.acc: 100.0000, s1.loss_bbox: 0.0000, s2.loss_cls: 0.0001, s2.acc: 100.0000, s2.loss_bbox: 0.0000, loss: 0.0054 2020-11-06 00:19:01,344 - mmdet - INFO - Exp name: myconfig.py 2020-11-06 00:19:01,347 - mmdet - INFO - Epoch(val) [4][44] loss_rpn_cls: 0.0143, loss_rpn_bbox: 0.0013, s0.loss_cls: 0.0001, s0.acc: 100.0000, s0.loss_bbox: 0.0000, s1.loss_cls: 0.0001, s1.acc: 100.0000, s1.loss_bbox: 0.0000, s2.loss_cls: 0.0001, s2.acc: 100.0000, s2.loss_bbox: 0.0000, loss: 0.0159 2020-11-06 
00:19:29,590 - mmdet - INFO - Epoch [5][50/128] lr: 1.250e-03, eta: 0:07:17, time: 0.563, data_time: 0.049, memory: 10406, loss_rpn_cls: 0.0049, loss_rpn_bbox: 0.0008, s0.loss_cls: 0.0001, s0.acc: 100.0000, s0.loss_bbox: 0.0000, s1.loss_cls: 0.0001, s1.acc: 100.0000, s1.loss_bbox: 0.0000, s2.loss_cls: 0.0001, s2.acc: 100.0000, s2.loss_bbox: 0.0000, loss: 0.0060 2020-11-06 00:19:55,615 - mmdet - INFO - Epoch [5][100/128] lr: 1.250e-03, eta: 0:07:00, time: 0.520, data_time: 0.007, memory: 10406, loss_rpn_cls: 0.0038, loss_rpn_bbox: 0.0008, s0.loss_cls: 0.0002, s0.acc: 100.0000, s0.loss_bbox: 0.0000, s1.loss_cls: 0.0001, s1.acc: 100.0000, s1.loss_bbox: 0.0000, s2.loss_cls: 0.0001, s2.acc: 100.0000, s2.loss_bbox: 0.0000, loss: 0.0049

The dataloader and annotation information:

{'img_metas': DataContainer([[{'filename': '/content/drive/My Drive/PV-Dataset/fold_1/train/rgb/PV_66048_46336_1.png', 'ori_filename': 'PV_66048_46336_1.png', 'ori_shape': (256, 256, 3), 'img_shape': (800, 800, 3), 'pad_shape': (800, 800, 3), 'scale_factor': array([3.125, 3.125, 3.125, 3.125], dtype=float32), 'flip': True, 'flip_direction': 'horizontal', 'img_norm_cfg': {'mean': array([123.675, 116.28 , 103.53 ], dtype=float32), 'std': array([58.395, 57.12 , 57.375], dtype=float32), 'to_rgb': True}}], [{'filename': '/content/drive/My Drive/PV-Dataset/fold_1/train/rgb/PV_35840_52736_1.png', 'ori_filename': 'PV_35840_52736_1.png', 'ori_shape': (256, 256, 3), 'img_shape': (800, 800, 3), 'pad_shape': (800, 800, 3), 'scale_factor': array([3.125, 3.125, 3.125, 3.125], dtype=float32), 'flip': False, 'flip_direction': None, 'img_norm_cfg': {'mean': array([123.675, 116.28 , 103.53 ], dtype=float32), 'std': array([58.395, 57.12 , 57.375], dtype=float32), 'to_rgb': True}}]]), 'img': DataContainer([tensor([[[[ 0.....]]]), tensor([[[[-0.2513,.....]]])]), 'gt_bboxes': DataContainer([[tensor([[603.1250, 275.0000, 653.1250, 325.0000]])], [tensor([[ 0.0000, 81.2500, 31.2500, 131.2500]])]]), 'gt_labels': DataContainer([[tensor([1])], [tensor([1])]])}

yhcao6 commented 3 years ago

What do you mean by "learn nothing"? I can't find the performance of the model on your val set.

JonathanAndradeSilva commented 3 years ago

The results are empty. I found the problem: I was using an old version of the COCO library. I removed the old library and installed mmpycocotools. Also, my bounding boxes are smaller than 32 px (a threshold hard-coded in CustomDataset), so I set filter_empty_gt=False.
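For reference, `filter_empty_gt` is passed in the dataset section of the config. A minimal sketch, reusing the `PVDataset` settings from the config dump above (paths and pipeline elided):

```python
# Sketch: disable filtering of images whose GT list ends up empty, so they
# are still fed to training. Dataset type and keys match the config above;
# ann_file / img_prefix / pipeline are elided here.
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='PVDataset',
        filter_empty_gt=False,  # keep images even when all GT boxes were dropped
        classes=('PV', ),
    ))
```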

Is there a way to pass parameters to CustomDataset, for example the minimum bounding-box size?

yhcao6 commented 3 years ago

filter_empty_gt filters out images without any GT, not images whose GT boxes are too small. If you are looking for a pipeline that filters out images with too-small GTs, you can refer to https://github.com/open-mmlab/mmdetection/blob/d3cf38d91c454b1a6881e8c36c1e4a66dc5521b8/mmdet/datasets/pipelines/loading.py#L433-L458 and modify the config as in https://github.com/open-mmlab/mmdetection/blob/d3cf38d91c454b1a6881e8c36c1e4a66dc5521b8/configs/yolact/yolact_r50_1x8_coco.py#L92-L95
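Based on the linked lines, the filtering transform (`FilterAnnotations` in later mmdetection versions; the exact name and signature at that commit should be checked against the links above) is inserted into `train_pipeline` right after `LoadAnnotations`. A sketch using the pipeline from the config above:

```python
# Sketch: drop GT boxes below a minimum width/height during training.
# FilterAnnotations and min_gt_bbox_wh are assumptions taken from the
# linked mmdetection source; verify against your installed version.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    # filter out GT boxes narrower or shorter than 4 px (width, height)
    dict(type='FilterAnnotations', min_gt_bbox_wh=(4.0, 4.0)),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    # Normalize / Pad / DefaultFormatBundle / Collect as in the config above
]
```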