ultralytics / JSON2YOLO

Convert JSON annotations into YOLO format.
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Convert the COCO RLE format to YOLOv5/v8 segmentation format. #38

Open ryouchinsa opened 1 year ago

ryouchinsa commented 1 year ago

Hi, thanks for your useful script.

We added rle2polygon() to general_json2yolo.py so that you can convert the COCO RLE format to YOLOv5/v8 segmentation format. Please let us know your opinion. https://github.com/ryouchinsa/Rectlabel-support/blob/master/general_json2yolo.py

if use_segments:
    if len(ann['segmentation']) == 0:
        segments.append([])
        continue
    if isinstance(ann['segmentation'], dict):
        ann['segmentation'] = rle2polygon(ann['segmentation'])
    if len(ann['segmentation']) > 1:
        s = merge_multi_segment(ann['segmentation'])
        s = (np.concatenate(s, axis=0) / np.array([w, h])).reshape(-1).tolist()

def is_clockwise(contour):
    # Shoelace-style orientation test over an OpenCV contour of shape (N, 1, 2)
    value = 0
    num = len(contour)
    for i in range(num):
        p1 = contour[i]
        p2 = contour[(i + 1) % num]
        value += (p2[0][0] - p1[0][0]) * (p2[0][1] + p1[0][1])
    return value < 0

def get_merge_point_idx(contour1, contour2):
    # Find the pair of indices with the smallest squared distance between the two contours
    idx1, idx2 = 0, 0
    distance_min = -1
    for i, p1 in enumerate(contour1):
        for j, p2 in enumerate(contour2):
            distance = (p2[0][0] - p1[0][0]) ** 2 + (p2[0][1] - p1[0][1]) ** 2
            if distance_min < 0 or distance < distance_min:
                distance_min = distance
                idx1, idx2 = i, j
    return idx1, idx2

def merge_contours(contour1, contour2, idx1, idx2):
    # Walk contour1 up to idx1, traverse all of contour2 starting at idx2,
    # then return to contour1 at idx1, creating two narrow connecting lines
    contour = []
    for i in range(0, idx1 + 1):
        contour.append(contour1[i])
    for i in range(idx2, len(contour2)):
        contour.append(contour2[i])
    for i in range(0, idx2 + 1):
        contour.append(contour2[i])
    for i in range(idx1, len(contour1)):
        contour.append(contour1[i])
    return np.array(contour)

def merge_with_parent(contour_parent, contour):
    if not is_clockwise(contour_parent):
        contour_parent = contour_parent[::-1]
    if is_clockwise(contour):
        contour = contour[::-1]
    idx1, idx2 = get_merge_point_idx(contour_parent, contour)
    return merge_contours(contour_parent, contour, idx1, idx2)

def mask2polygon(image):
    # RETR_CCOMP returns a two-level hierarchy: outer contours and their holes
    contours, hierarchies = cv2.findContours(image, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_TC89_KCOS)
    contours_approx = []
    for contour in contours:
        epsilon = 0.001 * cv2.arcLength(contour, True)
        contour_approx = cv2.approxPolyDP(contour, epsilon, True)
        contours_approx.append(contour_approx)

    # Keep top-level (parent) contours that have at least 3 points
    contours_parent = []
    for i, contour in enumerate(contours_approx):
        parent_idx = hierarchies[0][i][3]
        if parent_idx < 0 and len(contour) >= 3:
            contours_parent.append(contour)
        else:
            contours_parent.append([])

    # Merge each hole (child contour) into its parent contour
    for i, contour in enumerate(contours_approx):
        parent_idx = hierarchies[0][i][3]
        if parent_idx >= 0 and len(contour) >= 3:
            contour_parent = contours_parent[parent_idx]
            if len(contour_parent) == 0:
                continue
            contours_parent[parent_idx] = merge_with_parent(contour_parent, contour)

    polygons = []
    for contour in contours_parent:
        if len(contour) == 0:
            continue
        polygons.append(contour.flatten().tolist())
    return polygons

def rle2polygon(segmentation):
    # `mask` is pycocotools.mask
    if isinstance(segmentation["counts"], list):
        # Uncompressed RLE: convert to the compressed format first
        segmentation = mask.frPyObjects(segmentation, *segmentation["size"])
    m = mask.decode(segmentation)
    m[m > 0] = 255
    return mask2polygon(m)

collinmccarthy commented 1 year ago

Hi @ryouchinsa, I noticed you are approximating the contour in a different way than this answer here - https://github.com/cocodataset/cocoapi/issues/476#issuecomment-871804850

Why are you using this:

contours, _ = cv2.findContours(m, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_TC89_KCOS)
polygons = []
for contour in contours:
    epsilon = 0.001 * cv2.arcLength(contour, True)
    contour_approx = cv2.approxPolyDP(contour, epsilon, True)
    polygon = contour_approx.flatten().tolist()
    polygons.append(polygon)

instead of this, which produces significantly more polygon vertices/coordinates?

contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
polygons = []
for contour in contours:
    polygons.append(contour.astype(float).flatten().tolist())

I'm not saying your approach is wrong. I'm just curious if you chose a faster (but less accurate) method for your application, rather than a slower but more accurate method, or whether I'm misunderstanding something. Thanks.

ryouchinsa commented 1 year ago

Thanks for the detailed feedback.

Combining findContours() and approxPolyDP() can reduce the number of polygon points from, for example, 500 points to 50. When editing a polygon, 500 points are too many; we think tens of points are appropriate. If you want to preserve the mask shape as closely as possible during training, you can skip approxPolyDP().
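For illustration, the kind of simplification approxPolyDP() performs belongs to the Ramer-Douglas-Peucker family, which can be sketched in pure Python. The simplify() helper below is our own stand-in, not the script's code:

```python
import math

def simplify(points, epsilon):
    """Ramer-Douglas-Peucker sketch: drop points closer than epsilon to the chord."""
    if len(points) < 3:
        return points
    (x1, y1), (x2, y2) = points[0], points[-1]
    norm = math.hypot(x2 - x1, y2 - y1) or 1.0
    # Find the interior point farthest (perpendicular distance) from the start-end chord
    dmax, idx = 0.0, 0
    for i in range(1, len(points) - 1):
        x0, y0 = points[i]
        d = abs((x2 - x1) * (y1 - y0) - (x1 - x0) * (y2 - y1)) / norm
        if d > dmax:
            dmax, idx = d, i
    if dmax > epsilon:
        # Keep the farthest point and recurse on both halves
        left = simplify(points[: idx + 1], epsilon)
        right = simplify(points[idx:], epsilon)
        return left[:-1] + right
    # Everything is within epsilon of the chord: keep only the endpoints
    return [points[0], points[-1]]

# A jagged but nearly straight edge of 100 points collapses to its two endpoints
edge = [(x, 0.01 * (x % 2)) for x in range(100)]
print(len(simplify(edge, epsilon=0.5)))  # → 2
```

A larger epsilon discards more points; cv2.approxPolyDP uses the same idea, with epsilon scaled to a fraction of the contour's arc length as in the snippets above.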


glenn-jocher commented 1 year ago

Thanks for the detailed feedback.

We chose to use a combination of findContours() and approxPolyDP() to reduce the number of polygon points, optimizing for a decrease from 500 points to around 50. This approach balances accuracy with efficiency, ensuring a manageable number of points while retaining the essential shape of the mask.

If preserving the mask shape as closely as possible during training is a priority, it's not necessary to use approxPolyDP().

ryouchinsa commented 1 year ago

Using the script general_json2yolo.py, you can convert an RLE mask with holes to the YOLO segmentation format.

The RLE mask is converted to a parent polygon and a child polygon using cv2.findContours(). The parent polygon points are sorted in clockwise order and the child polygon points in counterclockwise order. The nearest pair of points between the parent and child polygons is detected, and the two polygons are connected through those two points with two narrow lines, so the polygon with a hole can be saved in the YOLO segmentation format.

def is_clockwise(contour):
    # Shoelace-style orientation test over an OpenCV contour of shape (N, 1, 2)
    value = 0
    num = len(contour)
    for i in range(num):
        p1 = contour[i]
        p2 = contour[(i + 1) % num]
        value += (p2[0][0] - p1[0][0]) * (p2[0][1] + p1[0][1])
    return value < 0

def get_merge_point_idx(contour1, contour2):
    # Find the pair of indices with the smallest squared distance between the two contours
    idx1, idx2 = 0, 0
    distance_min = -1
    for i, p1 in enumerate(contour1):
        for j, p2 in enumerate(contour2):
            distance = (p2[0][0] - p1[0][0]) ** 2 + (p2[0][1] - p1[0][1]) ** 2
            if distance_min < 0 or distance < distance_min:
                distance_min = distance
                idx1, idx2 = i, j
    return idx1, idx2

def merge_contours(contour1, contour2, idx1, idx2):
    # Walk contour1 up to idx1, traverse all of contour2 starting at idx2,
    # then return to contour1 at idx1, creating two narrow connecting lines
    contour = []
    for i in range(0, idx1 + 1):
        contour.append(contour1[i])
    for i in range(idx2, len(contour2)):
        contour.append(contour2[i])
    for i in range(0, idx2 + 1):
        contour.append(contour2[i])
    for i in range(idx1, len(contour1)):
        contour.append(contour1[i])
    return np.array(contour)

def merge_with_parent(contour_parent, contour):
    if not is_clockwise(contour_parent):
        contour_parent = contour_parent[::-1]
    if is_clockwise(contour):
        contour = contour[::-1]
    idx1, idx2 = get_merge_point_idx(contour_parent, contour)
    return merge_contours(contour_parent, contour, idx1, idx2)

def mask2polygon(image):
    # RETR_CCOMP returns a two-level hierarchy: outer contours and their holes
    contours, hierarchies = cv2.findContours(image, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_TC89_KCOS)
    contours_approx = []
    for contour in contours:
        epsilon = 0.001 * cv2.arcLength(contour, True)
        contour_approx = cv2.approxPolyDP(contour, epsilon, True)
        contours_approx.append(contour_approx)

    # Keep top-level (parent) contours that have at least 3 points
    contours_parent = []
    for i, contour in enumerate(contours_approx):
        parent_idx = hierarchies[0][i][3]
        if parent_idx < 0 and len(contour) >= 3:
            contours_parent.append(contour)
        else:
            contours_parent.append([])

    # Merge each hole (child contour) into its parent contour
    for i, contour in enumerate(contours_approx):
        parent_idx = hierarchies[0][i][3]
        if parent_idx >= 0 and len(contour) >= 3:
            contour_parent = contours_parent[parent_idx]
            if len(contour_parent) == 0:
                continue
            contours_parent[parent_idx] = merge_with_parent(contour_parent, contour)

    polygons = []
    for contour in contours_parent:
        if len(contour) == 0:
            continue
        polygons.append(contour.flatten().tolist())
    return polygons

def rle2polygon(segmentation):
    # `mask` is pycocotools.mask
    if isinstance(segmentation["counts"], list):
        # Uncompressed RLE: convert to the compressed format first
        segmentation = mask.frPyObjects(segmentation, *segmentation["size"])
    m = mask.decode(segmentation)
    m[m > 0] = 255
    return mask2polygon(m)

The attached screenshots show the RLE mask and the converted YOLO segmentation format.

To run the script, put the COCO JSON file coco_train.json into datasets/coco/annotations and run `python general_json2yolo.py`. The converted YOLO txt files are saved in new_dir/labels/coco_train.

Edit use_segments and use_keypoints in the script.

if __name__ == '__main__':
    source = 'COCO'

    if source == 'COCO':
        convert_coco_json('../datasets/coco/annotations',  # directory with *.json
                          use_segments=True,
                          use_keypoints=False,
                          cls91to80=False)

To convert the COCO bbox format to the YOLO bbox format:

use_segments=False,
use_keypoints=False,

To convert the COCO segmentation format to the YOLO segmentation format:

use_segments=True,
use_keypoints=False,

To convert the COCO keypoints format to the YOLO keypoints format:

use_segments=False,
use_keypoints=True,
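For reference, each line of a YOLO segmentation label file is a class index followed by the polygon's x y coordinates normalized by image width and height. A minimal sketch (the yolo_segment_line helper is our own, not part of the script):

```python
def yolo_segment_line(cls_id, polygon, img_w, img_h):
    """Format one object as a YOLO segmentation label line.

    polygon is a flat [x1, y1, x2, y2, ...] pixel-coordinate list,
    such as one entry returned by rle2polygon() above.
    """
    coords = []
    for i, v in enumerate(polygon):
        scale = img_w if i % 2 == 0 else img_h  # x scaled by width, y by height
        coords.append(f"{v / scale:.6f}")
    return " ".join([str(cls_id)] + coords)

print(yolo_segment_line(0, [320, 240, 480, 240, 480, 360], 640, 480))
# → 0 0.500000 0.500000 0.750000 0.500000 0.750000 0.750000
```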

This script originates from the Ultralytics JSON2YOLO repository. We hope it helps your work.

glenn-jocher commented 1 year ago

@ryouchinsa thanks for sharing the updated script and examples of the RLE mask and the converted YOLO segmentation format. Your efforts to enhance the functionality of the script are much appreciated. It's great to see the improvements you've made and how they translate into the YOLO segmentation format. Good job!

ryouchinsa commented 1 year ago

We updated the general_json2yolo.py script so that RLE masks with holes are converted to the YOLO segmentation format.

We believe this script would be beneficial for your company and users. Could you review the script before we make a PR?

glenn-jocher commented 1 year ago

@ryouchinsa thank you for the update and for considering our input. We appreciate your effort in enhancing the script to accommodate RLE masks with holes. We will review the script and provide feedback as soon as possible. Keep up the great work!

ryouchinsa commented 1 year ago

Thanks for reviewing our script. We checked whether YOLO can train on polygon masks with holes using a small dataset. Attached are donut images and the corresponding YOLO segmentation text files confirming that it can.

glenn-jocher commented 1 year ago

@ryouchinsa thank you for sharing the donut images and YOLO segmentation text files. We'll take a look and confirm that the YOLO model can effectively train polygon masks with holes using this dataset. Your contribution is valuable, and we appreciate your efforts in enhancing the YOLO functionality.

ryouchinsa commented 1 year ago

Hi @glenn-jocher, I submitted the PR about this update. https://github.com/ultralytics/JSON2YOLO/pull/61

Please let us know if there are any problems in the PR.

glenn-jocher commented 1 year ago

@ryouchinsa thanks for submitting the PR. I will review it and get back to you if there are any issues. Appreciate your contribution!

Harry-KIT commented 9 months ago

Hi @ryouchinsa, my question is similar to the others', but I have a labeled image like the one below:

{
  "version": "5.4.1",
  "flags": {},
  "shapes": [
    {
      "label": "food",
      "points": [[239.0, 196.0], [285.0, 297.0]],
      "group_id": null,
      "description": "",
      "shape_type": "mask",
      "flags": {},
      "mask": "iVBORw0KGgoAAAANSUhEUgAAAC8AAABmAQAAAABzC/WlAAAAxUlEQVR4nI2RwRGCMBBFX3YY5SYdSCfSlicpLSVYAiVw4MABEg8fhVUcPb15f7NJNoFjBIwyAnACMGiF5gAG9TUCp5w7DLSkgmq1BfXLyo8aUOyFC0wIa9gA7feGNzRCLVSuVr4s7rV3qxUhJGA2qRVCuWmwt/bq9wXrP2fAmo0lV5ucjULvwk6I2zDrBZM7aBB6YfnUKLQYMGvlJIzCHUKGAAZZmyU3w+RsdNY76z5nb58WAW55U+OSATgnhUI/ANhwB3gAGE8tVLpKku0AAAAASUVORK5CYII="
    },
    {
      "label": "food",
      "points": [[159.0, 137.0], [177.0, 166.0]],
      "group_id": null,
      "description": "",
      "shape_type": "mask",
      "flags": {},
      "mask": "iVBORw0KGgoAAAANSUhEUgAAABMAAAAeAQAAAADmtRi/AAAASElEQVR4nEXKsRFAQABE0XdLESJDJ9cZpdGJEoSyEzBE+//8ZRSzqAKaFGtCejKomUglKM0eDweWdny+vXv8PaenX++/a5TTDZA+DWjnicmcAAAAAElFTkSuQmCC"
    },
    .....
  ],
  "imagePath": "gray-a-0-0.jpg",
  "imageData": ......,
  "imageHeight": 640,
  "imageWidth": 640
}

I couldn't find any resources for converting its mask to the YOLO format.

ryouchinsa commented 9 months ago

Hi @Harry-KIT, could you convert the Labelme format to the COCO format? Then you can convert the COCO format to the YOLO format using this script, general_json2yolo.py.
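If it helps, the mapping for one Labelme "mask" shape can be sketched as below. All names here are ours, we read the shape's two points as opposite corners of the mask's bounding box, and decode_mask is a caller-supplied placeholder for decoding the base64 PNG in shape["mask"] into a 2-D 0/1 array (e.g. with Pillow); a real converter would also RLE-encode or polygon-trace the mask for the "segmentation" field:

```python
def labelme_shape_to_coco_ann(shape, ann_id, image_id, category_id, decode_mask):
    """Sketch: map one Labelme "mask" shape to a COCO-style annotation dict."""
    (x1, y1), (x2, y2) = shape["points"]  # two corners of the mask's bbox
    w, h = x2 - x1, y2 - y1
    m = decode_mask(shape["mask"])  # 2-D 0/1 array for the bbox region
    area = sum(sum(row) for row in m)  # count of foreground pixels
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [x1, y1, w, h],  # COCO bbox is [x, y, width, height]
        "area": area,
        "iscrowd": 0,
        # Left as None here; fill with an RLE (pycocotools.mask.encode on the
        # mask placed at x1, y1 in the full image) or traced polygons.
        "segmentation": None,
    }
```

This is only the per-annotation step; a full converter also needs the "images" and "categories" sections of the COCO JSON.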

Harry-KIT commented 9 months ago

Hi @ryouchinsa, thank you very much!

4o3F commented 8 months ago

Hi @ryouchinsa, have you tested your script on an RLE mask whose "iscrowd" is 1, meaning it contains multiple objects? Currently your script gives incorrect output, just like the case here: https://github.com/ultralytics/ultralytics/issues/2090#issuecomment-1517638691; it merges all objects into one single object.

ryouchinsa commented 8 months ago

Hi @4o3F, thanks for your detailed feedback.

You mean that converting RLE masks with "iscrowd": 1 to YOLO format might decrease the segmentation accuracy, correct?

However, another user told us that converting RLE masks with "iscrowd": 1 from COCO to YOLO format is necessary for their workflow.

"I am trying to convert the COCO1.0 annotation files generated in CVAT to Yolo format. The COCO json file created consists of segmentation masks in RLE format therefore 'iscrowd' variable is True across all annotations." https://github.com/ryouchinsa/Rectlabel-support/issues/241#issue-1991212851

So we added a skip_iscrowd_1 flag to the convert_coco_json() function in the general_json2yolo.py script. Please give us your feedback.
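The effect of such a flag amounts to an early skip in the annotation loop. A rough sketch of the behavior (the filter_annotations helper is ours, not the script's actual code):

```python
def filter_annotations(annotations, skip_iscrowd_1=True):
    """Sketch of the skip_iscrowd_1 behavior: optionally drop crowd RLE masks.

    With skip_iscrowd_1=True, annotations marked "iscrowd": 1 (one RLE mask
    covering several objects) are excluded, so they are not merged into a
    single YOLO polygon.
    """
    kept = []
    for ann in annotations:
        if skip_iscrowd_1 and ann.get("iscrowd", 0) == 1:
            continue  # skip crowd masks entirely
        kept.append(ann)
    return kept

anns = [{"id": 1, "iscrowd": 0}, {"id": 2, "iscrowd": 1}]
print([a["id"] for a in filter_annotations(anns)])                         # → [1]
print([a["id"] for a in filter_annotations(anns, skip_iscrowd_1=False)])   # → [1, 2]
```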

The attached screenshots compare the results with skip_iscrowd_1=True and skip_iscrowd_1=False.

glenn-jocher commented 8 months ago

Hi @ryouchinsa, thanks for bringing this to our attention! 👍

Indeed, handling RLE masks with "iscrowd": 1 can be tricky as it represents multiple objects as a single mask. To accommodate this, we've introduced a skip_iscrowd_1 flag in the conversion function. This allows for flexibility depending on the user's needs.

For datasets where iscrowd is significant, and individual object segmentation is required, setting skip_iscrowd_1=True will skip these masks, avoiding the merge of multiple objects into one. However, if preserving the semantic segmentation of crowded areas without distinguishing between individual objects is desired, you might opt to set skip_iscrowd_1=False.

Each approach has its use case, depending on the goal of your model. If incorrect merging is a concern in your context, I recommend experimenting with the flag to see which setting fits your needs best.

Your feedback and further observations on this would be highly appreciated!

4o3F commented 8 months ago


Thanks! This indeed fixed the problem.

glenn-jocher commented 8 months ago

Hi @4o3F, I’m thrilled to hear that! 🎉 Your feedback has been incredibly helpful in refining our approach to handling iscrowd flags for RLE masks. Don't hesitate to reach out if you have more insights or further questions. Cheers to improving together!

NicDionne commented 1 month ago

Hi @glenn-jocher,

> Indeed, handling RLE masks with "iscrowd": 1 can be tricky as it represents multiple objects as a single mask. To accommodate this, we've introduced a skip_iscrowd_1 flag in the conversion function. This allows for flexibility depending on the user's needs.

When did you add this to the JSON2YOLO repo?