open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0

A question on using Albumentations #4005

Closed zkloveai closed 3 years ago

zkloveai commented 3 years ago

In configs/albu_example/mask_rcnn_r50_fpn_albu_1x_coco.py, the train_pipeline is defined as:

    train_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
        dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
        dict(type='Pad', size_divisor=32),
        dict(
            type='Albu',
            transforms=albu_train_transforms,
            bbox_params=dict(
                type='BboxParams',
                format='pascal_voc',
                label_fields=['gt_labels'],
                min_visibility=0.0,
                filter_lost_elements=True),
            keymap={
                'img': 'image',
                'gt_masks': 'masks',
                'gt_bboxes': 'bboxes'
            },
            update_pad_shape=False,
            skip_img_without_anno=True),
        dict(type='Normalize', **img_norm_cfg),
        dict(type='DefaultFormatBundle'),
        dict(
            type='Collect',
            keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'],
            meta_keys=('filename', 'ori_shape', 'img_shape', 'img_norm_cfg',
                       'pad_shape', 'scale_factor'))
    ]

Why is format='pascal_voc' here? I would have expected 'coco'.

RyanXLi commented 3 years ago

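Note that this key is part of bbox_params, so it describes how the boxes passed to Albumentations are encoded, not which dataset they came from. format='pascal_voc' means [x1, y1, x2, y2] box encoding (top-left and bottom-right corners), while format='coco' means [x, y, w, h] (top-left corner plus width and height).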
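To make the two encodings concrete, here is a minimal sketch of converting between them. The helper names `coco_to_pascal_voc` and `pascal_voc_to_coco` are made up for illustration and are not part of MMDetection or Albumentations.

```python
def coco_to_pascal_voc(box):
    """Convert [x, y, w, h] to [x1, y1, x2, y2]. Hypothetical helper."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def pascal_voc_to_coco(box):
    """Convert [x1, y1, x2, y2] to [x, y, w, h]. Hypothetical helper."""
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]


coco_box = [100, 50, 40, 30]            # top-left (100, 50), 40 wide, 30 tall
voc_box = coco_to_pascal_voc(coco_box)
print(voc_box)                          # [100, 50, 140, 80]
print(pascal_voc_to_coco(voc_box))      # [100, 50, 40, 30]
```

The same four numbers describe very different boxes depending on which encoding is assumed, which is why the format has to match the data actually handed to the transform.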

zkloveai commented 3 years ago

Thanks, I understand what you said.
But isn't mask_rcnn_r50_fpn_albu_1x_coco.py a config for the COCO dataset? If it is for COCO, why isn't the format set to 'coco'? In other words, if my dataset stores boxes COCO-style ([x, y, w, h]), should I set format='coco'?

RyanXLi commented 3 years ago

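However, the internal representation of bbox in MMDetection here is [x1, y1, x2, y2]. The COCO dataset loader converts each [x, y, w, h] annotation to corner format when it parses the annotation file, so by the time the Albu transform receives gt_bboxes they are already in 'pascal_voc' layout. That is why format='pascal_voc' is correct even when your annotation file is COCO-style. You can refer to coco.py for details.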
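A rough sketch of why the declared format has to follow the internal representation rather than the annotation file. Both functions below are hypothetical stand-ins: `parse_coco_annotation` only imitates the [x, y, w, h] to [x1, y1, x2, y2] conversion described above, and `corners_if_read_as` imitates how a consumer would interpret a box under a given declared format. Neither is the actual MMDetection or Albumentations code.

```python
def parse_coco_annotation(ann):
    """Imitate the loader step: COCO [x, y, w, h] -> internal [x1, y1, x2, y2]."""
    x, y, w, h = ann['bbox']
    return [x, y, x + w, y + h]


def corners_if_read_as(box, fmt):
    """Return the corners a consumer would assume for `box` under `fmt`."""
    if fmt == 'pascal_voc':
        return list(box)                 # already [x1, y1, x2, y2]
    if fmt == 'coco':
        x, y, w, h = box                 # last two values treated as a size
        return [x, y, x + w, y + h]
    raise ValueError(f'unsupported format: {fmt}')


ann = {'bbox': [100, 50, 40, 30]}        # as stored in a COCO annotation file
gt_bbox = parse_coco_annotation(ann)     # what the pipeline actually carries

print(corners_if_read_as(gt_bbox, 'pascal_voc'))  # [100, 50, 140, 80]
print(corners_if_read_as(gt_bbox, 'coco'))        # [100, 50, 240, 130]
```

Declaring 'coco' for boxes that are already in corner format would treat x2 and y2 as width and height, so the box would come out far larger than the object it is supposed to cover.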

zkloveai commented 3 years ago

Thanks a lot!