SamsungLabs / iterdet

[S+SSPR2020] IterDet: Iterative Scheme for Object Detection in Crowded Environments
https://arxiv.org/abs/2005.05708
Mozilla Public License 2.0

Tried Training with a custom dataset. Returns ValueError: low >= high #23

Closed · BLOO69 closed this issue 4 years ago

BLOO69 commented 4 years ago

```python
# model settings
model = dict(
    type='IterDetFasterRCNN',
    pretrained='torchvision://resnet50',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=-1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5,
        norm_cfg=dict(type='BN')),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[1.0, 1.5, 2.0, 2.5, 3.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[.0, .0, .0, .0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(type='L1Loss', loss_weight=1.0),
        final_crop=False),
    roi_head=dict(
        type='StandardRoIHead',
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', out_size=7, sample_num=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=dict(
            type='Shared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=2,
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0., 0., 0., 0.],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=True,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='L1Loss', loss_weight=1.0),
            final_crop=False)))
# model training and testing settings
train_cfg = dict(
    rpn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.7,
            neg_iou_thr=0.3,
            min_pos_iou=0.3,
            match_low_quality=True,
            ignore_iof_thr=0.5),
        sampler=dict(
            type='RandomSampler',
            num=256,
            pos_fraction=0.5,
            neg_pos_ub=-1,
            add_gt_as_proposals=False),
        allowed_border=-1,
        pos_weight=-1,
        debug=False),
    rpn_proposal=dict(
        nms_across_levels=False,
        nms_pre=2000,
        nms_post=2000,
        max_num=2000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.5,
            neg_iou_thr=0.5,
            min_pos_iou=0.5,
            match_low_quality=False,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=512,
            pos_fraction=0.25,
            neg_pos_ub=-1,
            add_gt_as_proposals=True),
        pos_weight=-1,
        debug=False))
test_cfg = dict(
    rpn=dict(
        nms_across_levels=False,
        nms_pre=1000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        score_thr=0.01,
        nms=dict(type='nms', iou_thr=0.5),
        max_per_img=1000),
    n_iterations=2)
# dataset settings
dataset_type = 'CustomDataset'
classes = ('helmet', 'head')
data_root = '/home/student/Documents/iterdet/data/VOCdevkit/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize',
         img_scale=[(1000, 600), (1666, 1000)],
         keep_ratio=True,
         final_crop=False),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='AddHistory'),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect',
         keys=['img', 'history', 'gt_bboxes', 'gt_labels',
               'gt_bboxes_ignore']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        classes=classes,
        ann_file=data_root + 'voc28_trainval.pkl',
        img_prefix=data_root,
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        classes=classes,
        ann_file=data_root + 'voc28_test.pkl',
        img_prefix=data_root,
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        classes=classes,
        ann_file=data_root + 'voc28_test.pkl',
        img_prefix=data_root,
        pipeline=test_pipeline))
# optimizer
optimizer = dict(type='Adam', lr=.0001)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='step', step=[16, 22])
checkpoint_config = dict(interval=1)
# yapf:disable
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
# runtime settings
total_epochs = 100
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = ''
load_from = None
resume_from = None
workflow = [('train', 1)]
```

Procedure

1) Converted the VOC annotations to a .pkl file with pascal_voc.py (see the sanity check below).
2) Changed num_classes to 2, starting from iterdet/crowd_human_full_faster_rcnn_r50_fpn_2x.py.
3) Changed the dataset type to 'CustomDataset' and set the classes.
4) Changed the train, val and test data (train on the trainval set; val and test use the test set).
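To confirm the conversion worked, the resulting .pkl can be inspected with a quick check. This is a hypothetical snippet (the path is mine); CustomDataset expects a list of dicts, each with 'filename', 'width', 'height' and an 'ann' dict of numpy arrays:

```python
import mmcv

# Hypothetical sanity check of the converted annotations.
# CustomDataset expects a list of dicts, each holding 'filename',
# 'width', 'height' and an 'ann' dict of numpy arrays.
annos = mmcv.load('data/VOCdevkit/voc28_trainval.pkl')
print(len(annos), 'images')
print(annos[0]['filename'], annos[0]['ann']['bboxes'].shape)
```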

Problem

During training, the model runs for a certain number of steps but never completes an epoch; at some point training stops with the ValueError below. Any idea what causes this? Thank you!

Error Traceback

```
Traceback (most recent call last):
  File "train.py", line 159, in <module>
    main()
  File "train.py", line 155, in main
    meta=meta)
  File "/home/student/Documents/iterdet/mmdet/apis/train.py", line 165, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/student/anaconda3/envs/TRT/lib/python3.7/site-packages/mmcv/runner/runner.py", line 383, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/student/anaconda3/envs/TRT/lib/python3.7/site-packages/mmcv/runner/runner.py", line 278, in train
    for i, data_batch in enumerate(data_loader):
  File "/home/student/anaconda3/envs/TRT/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/student/anaconda3/envs/TRT/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/home/student/anaconda3/envs/TRT/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/home/student/anaconda3/envs/TRT/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 1.
Original Traceback (most recent call last):
  File "/home/student/anaconda3/envs/TRT/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/student/anaconda3/envs/TRT/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/student/anaconda3/envs/TRT/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/student/Documents/iterdet/mmdet/datasets/custom.py", line 140, in __getitem__
    data = self.prepare_train_img(idx)
  File "/home/student/Documents/iterdet/mmdet/datasets/custom.py", line 153, in prepare_train_img
    return self.pipeline(results)
  File "/home/student/Documents/iterdet/mmdet/datasets/pipelines/compose.py", line 25, in __call__
    data = t(data)
  File "/home/student/Documents/iterdet/mmdet/datasets/pipelines/formating.py", line 207, in __call__
    n_old_objects = np.random.randint(0, n_objects)
  File "mtrand.pyx", line 747, in numpy.random.mtrand.RandomState.randint
  File "_bounded_integers.pyx", line 1270, in numpy.random._bounded_integers._rand_int64
ValueError: low >= high
```
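For reference, the final error is reproducible in isolation: `np.random.randint(low, high)` requires `low < high`, so the call in formating.py fails for any image with zero objects. A minimal sketch:

```python
import numpy as np

n_objects = 0  # an image with no ground truth boxes
try:
    # Same call as in mmdet/datasets/pipelines/formating.py:
    # randint requires low < high, so it raises for empty images.
    n_old_objects = np.random.randint(0, n_objects)
except ValueError as e:
    print(e)  # low >= high
```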

filaPro commented 4 years ago

Hi @BLOO69 ,

The problem here is that some of your images contain 0 ground truth objects. The simplest solution is to remove such images from the training part of your dataset, or to skip them in the dataloader.
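For example, the annotation file could be filtered once before training (a rough sketch; the file names are placeholders for your own .pkl files):

```python
import mmcv

# Rough sketch: drop training images that have no ground truth boxes.
annos = mmcv.load('data/VOCdevkit/voc28_trainval.pkl')
kept = [a for a in annos if a['ann']['bboxes'].shape[0] > 0]
print(f'kept {len(kept)} of {len(annos)} images')
mmcv.dump(kept, 'data/VOCdevkit/voc28_trainval_nonempty.pkl')
```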

BLOO69 commented 4 years ago

Troubleshooting

```
{'img_info': {'filename': 'VOC2028/JPEGImages/000257.jpg',
              'width': 600, 'height': 441,
              'ann': {'bboxes': array([], shape=(0, 4), dtype=float32),
                      'labels': array([], dtype=int64),
                      'bboxes_ignore': array([[362., 42., 396., 76.]], dtype=float32),
                      'labels_ignore': array([0])}},
 'ann_info': {'bboxes': array([], shape=(0, 4), dtype=float32),
              'labels': array([], dtype=int64),
              'bboxes_ignore': array([[362., 42., 396., 76.]], dtype=float32),
              'labels_ignore': array([0])},
 'img_prefix': '/home/student/Documents/iterdet/data/VOCdevkit/',
 'seg_prefix': None,
 'proposal_file': None,
 'bbox_fields': ['gt_bboxes_ignore', 'gt_bboxes'],
 'mask_fields': [],
 'seg_fields': [],
 'filename': '/home/student/Documents/iterdet/data/VOCdevkit/VOC2028/JPEGImages/000257.jpg',
 'img': array([...], dtype=float32),  # (large normalized image array elided)
 'img_shape': (669, 910, 3),
 'ori_shape': (441, 600, 3),
 'pad_shape': (672, 928, 3),
 'scale_factor': array([1.5166667, 1.5170068, 1.5166667, 1.5170068], dtype=float32),
 'img_norm_cfg': {'mean': array([123.675, 116.28, 103.53], dtype=float32),
                  'std': array([58.395, 57.12, 57.375], dtype=float32),
                  'to_rgb': True},
 'gt_bboxes': array([], shape=(0, 4), dtype=float32),
 'gt_bboxes_ignore': array([[309.40002, 63.714283, 360.96667, 115.29251]], dtype=float32),
 'gt_labels': array([], dtype=int64),
 'scale': (1417, 669),
 'scale_idx': None,
 'keep_ratio': True,
 'flip': True,
 'flip_direction': 'horizontal',
 'pad_fixed_size': None,
 'pad_size_divisor': 32}
```

Analysis

The annotation has only 1 ground truth box, and it is labelled difficult. I found the error by printing `results` in formating.py whenever `n_objects == 0`. It seems the box was moved to `bboxes_ignore`, so the total number of ground truth boxes becomes 0. May I know why difficult boxes are ignored? Thank you! @filaPro
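For context, the converter appears to route difficult objects to the ignore lists, roughly like this (a paraphrase of tools/convert_datasets/pascal_voc.py; the exact code may differ):

```python
# Paraphrased from the VOC converter (exact code may differ):
# difficult objects go to the ignore lists rather than the ground
# truth, so an image whose only box is difficult has 0 gt boxes.
if difficult:
    bboxes_ignore.append(bbox)
    labels_ignore.append(label)
else:
    bboxes.append(bbox)
    labels.append(label)
```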

Solution

Made changes in pascal_voc.py so that the difficult flag is ignored (both branches now keep the box):

```python
if difficult:
    bboxes.append(bbox)
    labels.append(label)
else:
    bboxes.append(bbox)
    labels.append(label)
```

Would this fix be appropriate?

filaPro commented 4 years ago

VOCDataset is implemented by the authors of mmdetection, and I think they follow the official PASCAL VOC evaluation protocol. If you want to train with these difficult boxes, your solution is OK.

BLOO69 commented 4 years ago

Thank you for your help! Really appreciate it!