open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0

RandomResize and HorizontalBoxes compatibility #11405

Closed: Data-drone closed this issue 9 months ago

Data-drone commented 9 months ago

I followed this tutorial and it works fine: https://github.com/open-mmlab/mmdetection/blob/main/demo/MMDet_Tutorial.ipynb

My installs are: "mmengine==0.10.2" "mmcv==2.1.0" "mmdet==3.3.0"

When I try to step through the train pipeline manually, I get:

site-packages/mmcv/transforms/processing.py:199, in Resize._resize_bboxes(self, results)
    197 """Resize bounding boxes with ``results['scale_factor']``."""
    198 if results.get('gt_bboxes', None) is not None:
--> 199     bboxes = results['gt_bboxes'] * np.tile(
    200         np.array(results['scale_factor']), 2)
    201     if self.clip_object_border:
    202         bboxes[:, 0::2] = np.clip(bboxes[:, 0::2], 0,
    203                                   results['img_shape'][1])

TypeError: unsupported operand type(s) for *: 'HorizontalBoxes' and 'float'

In my testing, I am instantiating the pipeline in a notebook with the following Python code:

from mmdet.datasets import CocoDataset

from mmdet.datasets.transforms import (
    CachedMixUp, CachedMosaic, LoadAnnotations, PackDetInputs, Pad,
    RandomCrop, RandomFlip, YOLOXHSVRandomAug
)

from mmcv.transforms import RandomResize, LoadImageFromFile

train_transforms = [
  LoadImageFromFile(),
  LoadAnnotations(),
  CachedMosaic(
    img_scale=(640, 640),
    pad_val=114.0,
    max_cached_images=20,
    random_pop=False
  ),
  RandomResize(
    scale=(1280, 1280),
    ratio_range=(0.5, 2.0),
    keep_ratio=True
  ),
  RandomCrop(
    crop_size=(640, 640)
  ),
  YOLOXHSVRandomAug(),
  RandomFlip(
    prob=0.5
  ),
  Pad(
    size=(640, 640),
    pad_val=dict(img=(114, 114, 114))
  ),
  CachedMixUp(
    img_scale=(640, 640),
    ratio_range=(1.0, 1.0),
    max_cached_images=10,
    random_pop=False,
    pad_val=(114, 114, 114),
    prob=0.5
  ),
  PackDetInputs()
]

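# data_path, ann_file and data_prefix are defined earlier in the notebook
train_dataset = CocoDataset(
  data_root=data_path,
  ann_file=ann_file,
  data_prefix=data_prefix,
  pipeline=train_transforms
)

# Pull a single sample through the pipeline to inspect it
for itr, batch in enumerate(train_dataset):
  print(batch)
  break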

Why does it work when I run the whole thing from the dictionary configs with the Runner, but not when I build the pipeline directly in Python? Is there something missing from HorizontalBoxes?
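To make the traceback easier to read, here is a minimal sketch of the failure mode in plain Python. `ToyBoxes` is a hypothetical stand-in, not mmdet's `HorizontalBoxes`; the only assumption carried over from the traceback is that the box container does not support `*`, so a resize step written for plain arrays fails, while a step that calls a method on the container does not.

```python
class ToyBoxes:
    """Hypothetical stand-in for a box container (not mmdet's HorizontalBoxes)."""

    def __init__(self, coords):
        # Each row is (x1, y1, x2, y2).
        self.coords = [list(map(float, row)) for row in coords]

    def rescale_(self, scale_factor):
        """Scale the boxes in place by (scale_w, scale_h)."""
        scale_w, scale_h = scale_factor
        for row in self.coords:
            row[0] *= scale_w
            row[1] *= scale_h
            row[2] *= scale_w
            row[3] *= scale_h


boxes = ToyBoxes([[10, 20, 30, 40]])

# A resize step that assumes array-like boxes multiplies them directly.
# ToyBoxes defines no __mul__, so this raises TypeError.
try:
    boxes * 2.0
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# A box-aware resize step calls the container's own method instead.
boxes.rescale_((2.0, 0.5))
print(error_message)
print(boxes.coords)
```

Read this way, nothing needs to be missing from the box class itself: the resize step that ran was simply one that expects plain arrays.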

Data-drone commented 9 months ago

Okay, I worked it out. mmdet needs its own Resize transform, because the default one that RandomResize picks up from mmcv cannot handle HorizontalBoxes. The fix is to configure RandomResize as:

  RandomResize(
    scale=(1280, 1280),
    ratio_range=(0.5, 2.0),
    keep_ratio=True,
    resize_type='mmdet.datasets.transforms.Resize'
  ),

When the pipeline is set up from a config rather than directly in Python, I guess the Registry module resolves the resize type to mmdet's Resize automatically.
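The guess above can be illustrated with a toy scoped registry. This is a sketch under stated assumptions, not mmengine's implementation: `ToyRegistry`, its `get` method, and the `default_scope` argument are all hypothetical, and it assumes that running from a config sets a default scope of `mmdet` so that the name `Resize` is looked up in the mmdet scope first.

```python
class ToyRegistry:
    """Hypothetical scoped registry, only loosely modelled on the idea of scopes."""

    def __init__(self, scope, parent=None):
        self.scope = scope
        self.parent = parent
        self.children = {}
        self.modules = {}
        if parent is not None:
            parent.children[scope] = self

    def register(self, name, obj):
        self.modules[name] = obj

    def get(self, name, default_scope=None):
        # Find the root so that sibling scopes can be reached.
        root = self
        while root.parent is not None:
            root = root.parent
        # Prefer the default scope's registry when one is set.
        if default_scope in root.children:
            found = root.children[default_scope].modules.get(name)
            if found is not None:
                return found
        # Otherwise search this registry, then its parents.
        registry = self
        while registry is not None:
            if name in registry.modules:
                return registry.modules[name]
            registry = registry.parent
        raise KeyError(name)


root = ToyRegistry("root")
det = ToyRegistry("mmdet", parent=root)
root.register("Resize", "array-only Resize")
det.register("Resize", "box-aware Resize")

# Built directly in Python: no default scope, so the root entry wins.
without_scope = root.get("Resize")
# Built from a config (assumed to set the scope): the mmdet entry wins.
with_scope = root.get("Resize", default_scope="mmdet")
print(without_scope)
print(with_scope)
```

Under that assumption, passing an explicit `resize_type` as in the fix above has the same effect as the default scope: it removes the ambiguity about which `Resize` gets built.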