albumentations-team / albumentations

Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
https://albumentations.ai
MIT License
14.2k stars 1.65k forks

Incorrect number of masks returned after augmentation #836

Open ZFTurbo opened 3 years ago

ZFTurbo commented 3 years ago

🐛 Bug

Describe the bug: An incorrect number of masks is returned after augmentation. It happens when some boxes go outside of the image: the number of boxes changes, but the number of masks stays the same.

To Reproduce

import cv2
import numpy as np

from albumentations import *
from albumentations import __version__ as ver

transform_generator = Compose([
        ShiftScaleRotate(p=1.0, shift_limit=0.1, scale_limit=0.2, rotate_limit=45, border_mode=cv2.BORDER_REFLECT),
    ], bbox_params={'format': 'pascal_voc',
                    'min_area': 0,
                    'min_visibility': 0.0,
                    'label_fields': ['labels']}, p=1.0)

def albu_bug():
    print('Albu ver: {}'.format(ver))
    image = np.zeros((500, 500, 3), dtype=np.uint8)
    mask1 = np.zeros((500, 500, 3), dtype=np.uint8)
    mask2 = np.zeros((500, 500, 3), dtype=np.uint8)
    mask3 = np.zeros((500, 500, 3), dtype=np.uint8)
    mask1[0:10, 0:10] = 255
    mask2[-10:, -10:] = 255
    mask3[250:260, 250:260] = 255

    ann = dict()
    ann['image'] = image.copy()
    ann['masks'] = [mask1, mask2, mask3]
    ann['labels'] = np.array([0, 0, 0])
    ann['bboxes'] = np.array([[0, 0, 10, 10], [490, 490, 500, 500], [250, 250, 260, 260]])

    for i in range(10):
        augm = transform_generator(**ann)
        print(len(ann['masks']), len(ann['labels']), len(ann['bboxes']))
        print(len(augm['masks']), len(augm['labels']), len(augm['bboxes']))

if __name__ == '__main__':
    albu_bug()

Expected behavior: Masks corresponding to removed boxes must also be removed.

Environment

Dipet commented 3 years ago

I am a little bit confused. Do you mean that if the whole mask is equal to 0, the library should remove it?

ZFTurbo commented 3 years ago

Let's say we have an instance segmentation problem. You have 20 boxes and 20 masks. After a rotation, box 7 and box 18 are removed. I expected mask 7 and mask 18 to be removed as well. I'm not sure how to do this in the current code to stay consistent between masks and boxes.

Dipet commented 3 years ago

Normally the library does not change the order of arguments. Try this:

masks = [i for i in masks if np.any(i)]
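If the masks, bboxes, and labels are parallel lists (one mask per box), the same filtering can keep all three consistent. A minimal sketch — the helper name is ours, not part of the library:

```python
import numpy as np

def filter_empty_instances(masks, bboxes, labels):
    # Indices of instances whose mask still has foreground pixels
    keep = [i for i, m in enumerate(masks) if np.any(m)]
    return ([masks[i] for i in keep],
            [bboxes[i] for i in keep],
            [labels[i] for i in keep])

masks = [np.zeros((4, 4), np.uint8), np.ones((4, 4), np.uint8)]
bboxes = [[0, 0, 1, 1], [1, 1, 3, 3]]
labels = [0, 1]
masks, bboxes, labels = filter_empty_instances(masks, bboxes, labels)
assert len(masks) == len(bboxes) == len(labels) == 1
```
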
ZFTurbo commented 3 years ago

There can also be a problem if the box remains but the mask becomes zero. )

Currently I avoid this problem by recalculating boxes based on masks after augmentation, but it's rather expensive...

Dipet commented 3 years ago

For me it is a slightly strange scenario to need to remove masks. In the common segmentation case the network wants a mask for each class, so you need to keep a mask for each class.

Unfortunately, right now the library does not have the ability to bind masks to bboxes. Did you find anything in the documentation saying that it works the way you describe?

ZFTurbo commented 3 years ago

In instance segmentation, each mask is connected to some bbox (Mask R-CNN, for example). I'm not sure whether albumentations supports instance segmentation. I just wanted to use albumentations for this scenario and found this problem...

Dipet commented 3 years ago

There can also be a problem if the box remains but the mask becomes zero

But if the mask is 0 and you still have a bbox, something is wrong.

Currently I avoid this problem by recalculating boxes based on masks after augmentation, but it's rather expensive...

Hmm. Do you have rotated bboxes, or why do you think it is expensive? I think checking masks after each transform is much more expensive than annotating a bbox from its mask. If you have only one bbox for each mask, it is very simple to get the bbox:

cols = np.where(np.any(arr, axis=0))[0]  # columns containing any foreground pixel
rows = np.where(np.any(arr, axis=1))[0]  # rows containing any foreground pixel

x_min, y_min, x_max, y_max = cols[0], rows[0], cols[-1], rows[-1]
ZFTurbo commented 3 years ago

It's possible for the mask inside a bbox to disappear while the bbox still exists. It's a rare case, but still possible. Check this image:

https://www.dropbox.com/s/l365ell59u2qfo2/BBox.png?dl=0
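The situation in the linked image can be reproduced with a synthetic example (a sketch, not from the thread): an L-shaped object whose bbox intersects a crop window even though none of its pixels do.

```python
import numpy as np

# An L-shaped object: its bbox is the full 100x100 square,
# but the top-right quadrant of that bbox contains no mask pixels.
mask = np.zeros((100, 100), dtype=np.uint8)
mask[:, :20] = 255   # vertical bar
mask[80:, :] = 255   # horizontal bar

# Crop to the top-right quadrant: the bbox (0, 0, 100, 100) still
# intersects the crop region, but the cropped mask is empty.
crop = mask[0:50, 50:100]
assert not np.any(crop)
```
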

Yes, I use similar code:

def bbox_for_mask(img):
    rows = np.any(img, axis=1)
    cols = np.any(img, axis=0)
    rmin, rmax = np.where(rows)[0][[0, -1]]
    cmin, cmax = np.where(cols)[0][[0, -1]]
    return cmin, rmin, cmax + 1, rmax + 1
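For reference, the helper above can be exercised like this (note that np.where(...)[0] is empty for an all-zero mask, so the indexing raises IndexError; callers should guard against empty masks):

```python
import numpy as np

def bbox_for_mask(img):
    rows = np.any(img, axis=1)  # rows containing any foreground pixel
    cols = np.any(img, axis=0)  # columns containing any foreground pixel
    rmin, rmax = np.where(rows)[0][[0, -1]]
    cmin, cmax = np.where(cols)[0][[0, -1]]
    # pascal_voc order: x_min, y_min, x_max, y_max (max bounds exclusive)
    return cmin, rmin, cmax + 1, rmax + 1

mask = np.zeros((100, 100), dtype=np.uint8)
mask[20:30, 40:60] = 255
assert bbox_for_mask(mask) == (40, 20, 60, 30)
```
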
Dipet commented 3 years ago

So, if you have a bbox like in the example image, it will be incorrect after a crop. For this reason I think it is better to augment only the masks, because they are much more reliable. Augmenting only the masks and then creating bboxes from them is better than writing awkward code to augment both bboxes and masks and check them after each transformation.

ChrisDAT20 commented 2 years ago

I face the same issue as @ZFTurbo: I have situations where the masks do not match the bboxes. The image has 25 annotations (bboxes and segmentations), which are converted to masks using the COCO API (coco.annToMask()). The image and targets are then passed through a few transforms: CenterCrop, Resize, Affine... The transformed dict only contains 9 boxes, but it still contains 25 masks. When I filter out masks that contain only zeros (masks = [i for i in masks if i.any()]), this works most of the time. But sometimes a bbox is still there while its mask is gone after I have filtered the "empty" masks.

...than writing awkward code to augment both bboxes and masks and check them after each transformation.

I thought that's what DualTransforms are for: applying the transforms to both image and target in the same way. But it seems that bbox and mask transforms can become inconsistent.

jveitchmichaelis commented 1 year ago

Is there a recommended solution here for instance segmentation?

A standard approach for model inputs is a list of bounding boxes and full-image masks for each object. I don't think this is strange; it's a different input expectation than either object detection or semantic segmentation. In this case each individual mask is explicitly tied to a bounding box.

@Dipet Is it possible to disable bounding box/object filtering? Then we could do a coordinate check on the returned list (e.g. coordinate < 0 or > image size) and ignore masks where that's true. This would enable relatively cheap filtering versus recomputing the bounding box for the new mask.

Dipet commented 1 year ago

Try check_each_transform=False for BboxParams

jveitchmichaelis commented 1 year ago

This doesn't seem to work; I still get a reduced set of bounding boxes out.

e.g.

import albumentations as A

transform = A.Compose([
    A.RandomCrop(width=1024, height=1024)
], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels'], check_each_transform=False))
# succeeds
res = transform(image=image, masks=masks, bboxes=boxes, labels=labels)

# fails
assert len(res['masks']) == len(res['bboxes'])

The check is here(?)

https://github.com/albumentations-team/albumentations/blob/87b1b7d009bcff12d9cc7a482c14cac0b1300ac8/albumentations/core/composition.py#L212-L213

which calls filter:

https://github.com/albumentations-team/albumentations/blob/87b1b7d009bcff12d9cc7a482c14cac0b1300ac8/albumentations/core/composition.py#L229

However, I think filtering is actually performed again in postprocessing, which is always called: https://github.com/albumentations-team/albumentations/blob/87b1b7d009bcff12d9cc7a482c14cac0b1300ac8/albumentations/core/composition.py#L216-L219

https://github.com/albumentations-team/albumentations/blob/87b1b7d009bcff12d9cc7a482c14cac0b1300ac8/albumentations/core/utils.py#L68-L76

tobiasvanderwerff commented 1 year ago

I agree with @jveitchmichaelis that being able to disable box filtering or being able to filter masks along with the bounding boxes would be a nice feature to have.

james-imi commented 8 months ago

Reopening this. This code:

    bbox_params=A.BboxParams(format="pascal_voc", min_visibility=0.3),

can produce a mask without a box, and even produces mismatched outputs: the bounding boxes have 1 entry while the masks have 6, and they are not even properly ordered.