albumentations-team / albumentations

Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
https://albumentations.ai
MIT License
14.17k stars 1.64k forks

Non-Axis Aligned Bounding Boxes #842

Open edmuthiah opened 3 years ago

edmuthiah commented 3 years ago

Enhancement

Hi, just wondering what it would take to incorporate rotated or quadrilateral bounding box annotations. Adding an angle attribute to the box might be a start. Happy to contribute.

Dipet commented 3 years ago

Hello! I think it is a good idea! As a first step, we need to provide a flag indicating that a bbox is rotated. I see only two ways to do this:

  1. Provide these bboxes as a separate argument with a different name, for example rotated_bboxes. In this scenario we would need to add another function to work with this type of bbox, apply_to_rotated_bbox.
  2. Extend the current bboxes and add an option to BboxParams(with_angle=True). When with_angle=True, we would treat a bbox as a bounding box in the format [x1, y1, x2, y2, angle]. For an example, look at how keypoint formats work. But in this case we would need to implement some tricks to fit our current interface.

Try looking at how it currently works.

ricvo commented 3 years ago

Just a thought: would it be possible to store the 4 points of the box internally? Something like a bbox8 format, [x1, y1, x2, y2, x3, y3, x4, y4]. This would lead to a more verbose parametrization, but maybe it would work in both cases, rotated or not?

BloodAxe commented 3 years ago

Parametrization via [x1, y1, x2, y2, x3, y3, x4, y4] allows handling an arbitrary polygon of 4 vertices, while parametrization via x,y,w,h,angle covers only rotated rectangles and is therefore less error-prone. Based on what we want to achieve, we can select one of them. My personal take is x,y,w,h,angle for the internal representation, with conversion where needed.
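To make the "conversion where needed" part concrete, here is a minimal sketch of going from x,y,w,h,angle to the 8-value corner format. It is only an illustration, not existing albumentations code: the function names are hypothetical, and it assumes that x,y is the box center and that the angle is given in degrees, neither of which is settled in this thread.

```python
import math


def rotated_rect_to_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a rotated rectangle as [(x, y), ...].

    Assumptions (not from the thread): (cx, cy) is the box center, the
    angle is in degrees, and rotation uses the standard formula
    x' = x*cos(t) - y*sin(t), y' = x*sin(t) + y*cos(t).
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = w / 2.0, h / 2.0
    # Corner offsets from the center before rotation, in a fixed order.
    offsets = [
        (-half_w, -half_h),
        (half_w, -half_h),
        (half_w, half_h),
        (-half_w, half_h),
    ]
    return [
        (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
        for dx, dy in offsets
    ]


def flatten(corners):
    """Flatten [(x1, y1), ...] into [x1, y1, x2, y2, x3, y3, x4, y4]."""
    return [value for point in corners for value in point]


if __name__ == "__main__":
    print(flatten(rotated_rect_to_corners(10, 10, 4, 2, 30)))
```

Going in this direction always yields a valid rectangle, which is the sense in which the 5-value form is less error-prone. The reverse direction has to cope with 8 values that may no longer describe a rectangle.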

ricvo commented 3 years ago

I see your point, BloodAxe. I am not sure it would necessarily be more error-prone, but I agree that it shifts the problem to finding the desired bbox after every transformation in case the points get misaligned.

Actually, I just saw that one could maybe work around all of this by specifying the points as keypoints.

GridDistortion does not implement keypoint movement: NotImplementedError: Method apply_to_keypoint is not implemented in class GridDistortion

But imgaug.transforms.IAAPiecewiseAffine works well, for example. If a bbox gets too distorted during a transformation, one might even wonder whether it still makes sense, though... I think custom strategies could be implemented depending on the transformation, and this already covers at least some cases of interest.
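A rough sketch of the keypoint idea, in plain Python so that it does not depend on any particular transform: move the four corners as points, then try to read a rotated rectangle back and measure how far the result is from a true rectangle. All names here are hypothetical, and the horizontal flip is only a stand-in for whatever transform moves the points. With albumentations, the corners would instead be passed through the existing keypoints target (for example with A.KeypointParams(format="xy")).

```python
import math


def hflip_point(x, y, image_width):
    """Stand-in point transform: horizontal flip with x' = image_width - x."""
    return (image_width - x, y)


def corners_to_rotated_rect(corners):
    """Recover (cx, cy, w, h, angle_deg) from four corners given in order.

    Assumes the corners still form a rectangle. The angle is taken from the
    first edge and reduced modulo 180, because a flip reverses the corner
    order and would otherwise report the same box as rotated by 180 degrees.
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = corners
    cx = (x0 + x1 + x2 + x3) / 4.0
    cy = (y0 + y1 + y2 + y3) / 4.0
    w = math.hypot(x1 - x0, y1 - y0)
    h = math.hypot(x2 - x1, y2 - y1)
    angle_deg = math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180.0
    return cx, cy, w, h, angle_deg


def rectangle_error(corners):
    """Return 0 for a perfect rectangle, larger values for distorted quads.

    A quadrilateral is a rectangle exactly when its diagonals share a
    midpoint and have equal length, so the larger of the two gaps is used.
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = corners
    midpoint_gap = math.hypot((x0 + x2) / 2.0 - (x1 + x3) / 2.0,
                              (y0 + y2) / 2.0 - (y1 + y3) / 2.0)
    length_gap = abs(math.hypot(x2 - x0, y2 - y0) - math.hypot(x3 - x1, y3 - y1))
    return max(midpoint_gap, length_gap)


if __name__ == "__main__":
    box = [(8.0, 9.0), (12.0, 9.0), (12.0, 11.0), (8.0, 11.0)]
    flipped = [hflip_point(x, y, image_width=100) for x, y in box]
    print(corners_to_rotated_rect(flipped), rectangle_error(flipped))
```

A threshold on rectangle_error would be one possible custom strategy: keep the box when the error is small, and otherwise drop it or fall back to an enclosing rectangle.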

visbec commented 2 years ago

Is there any progress on this feature? This would be quite handy. ;)

RGring commented 1 year ago

+1

saschaglo commented 1 year ago

Since there are versions of YOLO that support OBB (e.g. https://github.com/hukaixuan19970627/yolov5_obb), it would be nice if someone could give this a little push. What do you think?

hytxx commented 11 months ago

Since there are versions of YOLO that support OBB (e.g. https://github.com/hukaixuan19970627/yolov5_obb), it would be nice if someone could give this a little push. What do you think?

Ultralytics OBB: they haven't achieved it yet.