facebookresearch / detectron2

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
https://detectron2.readthedocs.io/en/latest/
Apache License 2.0
30.48k stars 7.48k forks

handle annotations with inner holes #4196

Open amida47 opened 2 years ago

amida47 commented 2 years ago

🚀 Feature

I think you need to modify the way polygons are converted into bitmasks. I suppose the function responsible for this conversion is polygons_to_bitmask in detectron2/structures/masks.py, because it doesn't take the inner holes of an object into account.

Motivation & Examples

for example, this is the original mask image:

after converting this object into polygons and feeding them through that function, this is the reconstructed mask:


the solution I have found (up to you to decide if it's good) is to delete the

line 35   rle = mask_util.merge(rles)

and replace it with

if len(rles) > 1:
    # sum the decoded masks; pixels covered by more than one polygon
    # (i.e. the holes) end up with a value >= 2
    summed = np.add.reduce(mask_util.decode(rles), axis=2)
    mask = np.where(summed >= 2, 0, summed).astype(bool)
else:
    rle = mask_util.merge(rles)
    mask = mask_util.decode(rle).astype(bool)
return mask

my idea is: we check whether there are multiple masks, and if so we add them into one mask; overlaps between masks show up as values of 2 or greater, so we set those values to zero
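The overlap-cancelling idea can be sketched with plain NumPy on a toy pair of masks (a synthetic 8x8 example, not Detectron2 code):

```python
import numpy as np

# Toy sketch of the overlap-cancelling idea: an outer square mask plus a
# smaller "hole" mask inside it, each rasterized separately (as the decoded
# per-polygon masks would be).
outer = np.zeros((8, 8), dtype=np.uint8)
outer[1:7, 1:7] = 1          # outer polygon, filled (36 pixels)
hole = np.zeros((8, 8), dtype=np.uint8)
hole[3:5, 3:5] = 1           # inner polygon (the hole), also filled (4 pixels)

# Sum over the mask axis: pixels covered by both polygons get a value >= 2.
summed = np.add.reduce(np.stack([outer, hole], axis=2), axis=2)
mask = np.where(summed >= 2, 0, summed).astype(bool)

print(mask.sum())    # 32 foreground pixels: the 4 hole pixels are carved out
print(mask[3, 3])    # False (inside the hole)
```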

MjdMahasneh commented 1 year ago

Hi, I have been facing the same issue as you. Have you tested this solution at all?

I am puzzled as to why they would not account for this case; you would think it would be a typical scenario that's accounted for.

I realized today, after hours of debugging, that regardless of what you pass as input (i.e., polygons, RLEs, or binary masks), it will all be converted to RLE first, which is the root of the issue, since the conversion does not properly handle complex structures with multiple holes and polygons.

Documentation on this particular case is very limited to the best of my knowledge, and relevant issues don't provide much info either.

I found this comment in response to one of the issues below:

“You can represent masks with holes by union of multiple polygons, or you can use BitMasks as inputs to the model directly.“

and I have been trying to achieve that effect with no luck! Only to realize that all inputs get converted to an RLE representation, which is the root cause of the issue: regardless of how you handle the input, if the conversion is input -> RLE -> binary, it will be affected by the same semantic issue, since the conversion doesn't account for holes properly.

Related issues:

https://github.com/facebookresearch/detectron2/issues/489

https://github.com/facebookresearch/detectron2/issues/5042

MjdMahasneh commented 1 year ago

I have visualized the effect of the suggested modification and it seems to work fine, but I am not sure where to apply the changes to make sure this behavior is consistent across training/testing and visualization. Detectron2's visualizer uses the pycocotools backend (e.g., mask_util) for mask conversions, I guess, and this should also be considered when adopting the new approach, to ensure the visualized masks stay consistent with the training/testing masks.

MjdMahasneh commented 1 year ago

@ppwwyyxx any thoughts on this, please?

donghyeon commented 11 months ago

@MjdMahasneh

Here's my implementation. XOR over all polygons and holes can make the mask easily.

import numpy as np
from pycocotools import mask as mask_util

# Encode all polygons and holes to RLEs
rles = mask_util.frPyObjects(polygons_with_holes, height, width)
# Decode the RLEs to boolean masks
masks = mask_util.decode(rles).astype(bool)
# XOR sum over all masks and holes (this filters out the holes)
# Note that every hole should lie inside the mask, and holes should not overlap each other
mask = (masks.sum(axis=2) % 2).astype(bool)
# Encode the mask back to an RLE (encode expects a Fortran-ordered uint8 array)
rle = mask_util.encode(np.asfortranarray(mask.astype(np.uint8)))

MjdMahasneh commented 11 months ago

@donghyeon thank you for sharing!! what file/s did you update for this behavior to take effect? I struggled with locating parts to change and keep behavior consistent across training/validation/testing..

Many thanks for your input :)

ryouchinsa commented 11 months ago

Using the script general_json2yolo.py, you can convert an RLE mask with holes to the YOLO segmentation format.

The RLE mask is converted to a parent polygon and child polygons using cv2.findContours(). The parent polygon points are sorted in clockwise order, and the child polygon points are sorted in counterclockwise order. The nearest pair of points between the parent polygon and a child polygon is detected, and those two points are connected with two narrow lines, so that the polygon with a hole can be saved in the YOLO segmentation format.

import cv2
import numpy as np
from pycocotools import mask

def is_clockwise(contour):
    value = 0
    num = len(contour)
    for i, point in enumerate(contour):
        p1 = contour[i]
        if i < num - 1:
            p2 = contour[i + 1]
        else:
            p2 = contour[0]
        value += (p2[0][0] - p1[0][0]) * (p2[0][1] + p1[0][1])
    return value < 0

def get_merge_point_idx(contour1, contour2):
    idx1 = 0
    idx2 = 0
    distance_min = -1
    for i, p1 in enumerate(contour1):
        for j, p2 in enumerate(contour2):
            distance = pow(p2[0][0] - p1[0][0], 2) + pow(p2[0][1] - p1[0][1], 2)
            if distance_min < 0:
                distance_min = distance
                idx1 = i
                idx2 = j
            elif distance < distance_min:
                distance_min = distance
                idx1 = i
                idx2 = j
    return idx1, idx2

def merge_contours(contour1, contour2, idx1, idx2):
    contour = []
    for i in list(range(0, idx1 + 1)):
        contour.append(contour1[i])
    for i in list(range(idx2, len(contour2))):
        contour.append(contour2[i])
    for i in list(range(0, idx2 + 1)):
        contour.append(contour2[i])
    for i in list(range(idx1, len(contour1))):
        contour.append(contour1[i])
    contour = np.array(contour)
    return contour

def merge_with_parent(contour_parent, contour):
    if not is_clockwise(contour_parent):
        contour_parent = contour_parent[::-1]
    if is_clockwise(contour):
        contour = contour[::-1]
    idx1, idx2 = get_merge_point_idx(contour_parent, contour)
    return merge_contours(contour_parent, contour, idx1, idx2)

def mask2polygon(image):
    contours, hierarchies = cv2.findContours(image, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_TC89_KCOS)
    contours_approx = []
    for contour in contours:
        epsilon = 0.001 * cv2.arcLength(contour, True)
        contour_approx = cv2.approxPolyDP(contour, epsilon, True)
        contours_approx.append(contour_approx)

    contours_parent = []
    for i, contour in enumerate(contours_approx):
        parent_idx = hierarchies[0][i][3]
        if parent_idx < 0 and len(contour) >= 3:
            contours_parent.append(contour)
        else:
            contours_parent.append([])

    for i, contour in enumerate(contours_approx):
        parent_idx = hierarchies[0][i][3]
        if parent_idx >= 0 and len(contour) >= 3:
            contour_parent = contours_parent[parent_idx]
            if len(contour_parent) == 0:
                continue
            contours_parent[parent_idx] = merge_with_parent(contour_parent, contour)

    contours_parent_tmp = []
    for contour in contours_parent:
        if len(contour) == 0:
            continue
        contours_parent_tmp.append(contour)

    polygons = []
    for contour in contours_parent_tmp:
        polygon = contour.flatten().tolist()
        polygons.append(polygon)
    return polygons 

def rle2polygon(segmentation):
    if isinstance(segmentation["counts"], list):
        segmentation = mask.frPyObjects(segmentation, *segmentation["size"])
    m = mask.decode(segmentation) 
    m[m > 0] = 255
    polygons = mask2polygon(m)
    return polygons

The RLE mask.


The converted YOLO segmentation format.


To run the script, put the COCO JSON file coco_train.json into datasets/coco/annotations, then run:

python general_json2yolo.py

The converted YOLO txt files are saved in new_dir/labels/coco_train.


Edit use_segments and use_keypoints in the script.

if __name__ == '__main__':
    source = 'COCO'

    if source == 'COCO':
        convert_coco_json('../datasets/coco/annotations',  # directory with *.json
                          use_segments=True,
                          use_keypoints=False,
                          cls91to80=False)

To convert the COCO bbox format to YOLO bbox format.

use_segments=False,
use_keypoints=False,

To convert the COCO segmentation format to YOLO segmentation format.

use_segments=True,
use_keypoints=False,

To convert the COCO keypoints format to YOLO keypoints format.

use_segments=False,
use_keypoints=True,

This script originates from the Ultralytics JSON2YOLO repository. We hope it helps with your work.