ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

How to convert from COCO instance segmentation format to YOLOv5 instance segmentation Without Roboflow? #10621

Closed ichsan2895 closed 1 year ago

ichsan2895 commented 1 year ago

Search before asking

Question

Hello, is it possible to convert a custom COCO instance segmentation dataset to the YOLOv5 instance segmentation format (without Roboflow), or maybe to create one from scratch?

I have already checked the Train Custom Data tutorial and the Format of YOLO annotations tutorial.

Most tutorials only describe the bounding-box format and don't explain how to convert COCO to YOLO.

I can't find any tutorial for converting COCO to YOLOv5 without Roboflow.

Can somebody help me?

Thanks for sharing

Additional

No response

github-actions[bot] commented 1 year ago

👋 Hello @ichsan2895, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://ultralytics.com or email support@ultralytics.com.

Requirements

Python>=3.7.0 with all requirements.txt dependencies installed, including PyTorch>=1.7. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

ExtReMLapin commented 1 year ago

You can code it yourself in Python; just keep in mind that a COCO bbox is given relative to the top-left corner ([x_min, y_min, width, height] in pixels), while a YOLO bbox uses the box center ([x_center, y_center, width, height], normalized to 0-1).
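As a minimal sketch of that bbox conversion (the function name is my own, not from any library):

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert a COCO bbox [x_min, y_min, width, height] (pixels)
    to a YOLO bbox [x_center, y_center, width, height] (normalized to 0-1)."""
    x_min, y_min, w, h = bbox
    return [(x_min + w / 2) / img_w, (y_min + h / 2) / img_h, w / img_w, h / img_h]

# A 100x50 box with its top-left corner at (200, 100) in a 640x480 image:
print(coco_bbox_to_yolo([200, 100, 100, 50], 640, 480))
```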

ryouchinsa commented 1 year ago

Using this script, you can convert the COCO segmentation format to the YOLO segmentation format. https://github.com/ultralytics/JSON2YOLO

RectLabel is an offline image annotation tool for object detection and segmentation. Although this is not an open source program, with RectLabel you can import the COCO segmentation format and export to the YOLO segmentation format.

class_index x1 y1 x2 y2 x3 y3 ...
0 0.180027 0.287930 0.181324 0.280698 0.183726 0.270573 ...
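A label line like the one above can be produced from a COCO polygon by normalizing each vertex by the image size. A minimal sketch (the helper name coco_polygon_to_yolo_line is my own, not part of any library):

```python
def coco_polygon_to_yolo_line(class_index, polygon, img_w, img_h):
    """polygon is a flat COCO list [x1, y1, x2, y2, ...] in pixels;
    returns 'class x1 y1 x2 y2 ...' with coordinates normalized to 0-1."""
    coords = []
    for i in range(0, len(polygon), 2):
        coords.append(polygon[i] / img_w)      # x coordinate
        coords.append(polygon[i + 1] / img_h)  # y coordinate
    return ' '.join([str(class_index)] + [f'{c:.6f}' for c in coords])

print(coco_polygon_to_yolo_line(0, [100, 50, 200, 50, 150, 150], 400, 200))
# → "0 0.250000 0.250000 0.500000 0.250000 0.375000 0.750000"
```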


ichsan2895 commented 1 year ago

Using this script, you can convert the COCO segmentation format to the YOLO segmentation format. https://github.com/ultralytics/JSON2YOLO

RectLabel is an offline image annotation tool for object detection and segmentation. Although this is not an open source program, with RectLabel you can import the COCO segmentation format and export to the YOLO segmentation format. https://rectlabel.com/help#xml_to_yolo


Thanks I will check out JSON2YOLO script, I will report back if I see any trouble

Edohvin commented 1 year ago

Did you see any trouble? @ichsan2895

ichsan2895 commented 1 year ago

Did you see any trouble? @ichsan2895

Sorry for the slow response.

Yes, JSON2YOLO (https://github.com/ultralytics/JSON2YOLO) failed to work on my computer.

The log looks successful, but the label/annotation .txt files were not created. I'm not sure what happened.

The COCO dataset was made with the labelme annotator, so its directory layout is:

COCO_Project
|-> JPEGImages\
     |-> img_01.jpg
     |-> img_02.jpg
|-> annotations.txt

Fortunately, after one week of debugging, I created a Jupyter notebook that mixes code from JSON2YOLO and Stack Overflow to convert COCO to YOLO. You can download it here: https://drive.google.com/file/d/1xhBiWv_Y0HBZQHoWBwF7yjpRrDZhrk4f/view?usp=sharing

Just change the last cell to your desired output_path and json_file path. If you want bbox annotations, add use_segment=False. Then run all cells from start to finish.

# the annotations will be written as bboxes, suited to object detection
convert_coco_json_to_yolo_txt("yolo_from_Project_1st", "COCO_Project_1st/annotations.json", use_segment=False)

# the annotations will be written as polygons, suited to instance segmentation
convert_coco_json_to_yolo_txt("yolo_from_Project_1st", "COCO_Project_1st/annotations.json", use_segment=True)
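For readers who can't open the link, the core of such a converter is small. Below is a simplified sketch of what a convert_coco_json_to_yolo_txt-style function does, reconstructed from the COCO and YOLO formats rather than taken from the notebook; it assumes 1-based COCO category ids and takes only the first polygon of each annotation:

```python
import json
import os
from collections import defaultdict

def convert_coco_json_to_yolo_txt(output_path, json_file, use_segment=True):
    """Simplified sketch: write one YOLO label .txt per image from a COCO JSON."""
    os.makedirs(output_path, exist_ok=True)
    with open(json_file) as f:
        coco = json.load(f)
    images = {img['id']: img for img in coco['images']}
    anns_by_image = defaultdict(list)
    for ann in coco['annotations']:
        anns_by_image[ann['image_id']].append(ann)
    for img_id, anns in anns_by_image.items():
        img = images[img_id]
        w, h = img['width'], img['height']
        stem = os.path.splitext(os.path.basename(img['file_name']))[0]
        lines = []
        for ann in anns:
            cls = ann['category_id'] - 1  # assumes 1-based COCO ids; adjust for your dataset
            if use_segment and ann.get('segmentation'):
                poly = ann['segmentation'][0]  # first polygon only
                coords = [poly[i] / (w if i % 2 == 0 else h) for i in range(len(poly))]
            else:
                x, y, bw, bh = ann['bbox']
                coords = [(x + bw / 2) / w, (y + bh / 2) / h, bw / w, bh / h]
            lines.append(' '.join([str(cls)] + [f'{c:.6f}' for c in coords]))
        with open(os.path.join(output_path, stem + '.txt'), 'w') as f:
            f.write('\n'.join(lines) + '\n')
```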

iagorrr commented 1 year ago


The Python notebook worked perfectly for me, thank you!

kadirnar commented 1 year ago


Can you share a sample COCO JSON file? This code didn't work for me:

---> 85     line = *(segments[last_iter] if use_segments else bboxes[last_iter]),  # cls, box or segments
     86     f.write(('%g ' * len(line)).rstrip() % line + '\n')
     87 print("that images contains class:",len(bboxes),"objects")

IndexError: list index out of range
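One common cause of an IndexError like this is an annotation that yields a bbox but an empty segment (or vice versa), so the two lists end up with different lengths and indexing one with the other's counter runs past the end. A hedged sketch of a more defensive write loop (my own code, not the notebook's):

```python
def format_label_lines(bboxes, segments, use_segments=True):
    """Return YOLO label lines, skipping empty records instead of
    indexing past the end of the shorter list."""
    lines = []
    for line in (segments if use_segments else bboxes):
        if not line:  # empty segment/bbox: skip rather than crash
            continue
        lines.append(('%g ' * len(line)).rstrip() % tuple(line))
    return lines

print(format_label_lines(bboxes=[[0, 0.5, 0.5, 0.2, 0.2]], segments=[[]], use_segments=True))
# → []  (the empty segment is skipped instead of raising IndexError)
```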

ichsan2895 commented 1 year ago


Sure, I made this sample COCO JSON with the labelme library.

Please take a look: typical coco json to yolo segmentation.zip

kadirnar commented 1 year ago

Why are there negative values?

ichsan2895 commented 1 year ago

Why are there negative values?

Can you share your entire COCO dataset (images + coco_annot.json)? Negative values have never happened to me. For good results, I recommend creating the COCO annotations with labelme and then converting them to the YOLO format with my notebook.

almazgimaev commented 1 year ago

Hi @ichsan2895, you can do it in just a couple of clicks using apps from the Supervisely ecosystem:

  1. First, upload your COCO format data to Supervisely using the Import COCO app. You can also upload data in another format using one of the other import applications.

  2. Next, export the data from Supervisely in the YOLO v5/v8 format:

    • For polygons and masks (without internal cutouts), use the "Export to YOLOv8" app; the segmentation labels look like:
    class x1 y1 x2 y2 x3 y3 ...
    0 0.100417 0.654604 0.089646 0.662646 0.087561 0.666667 ...

    • For bounding boxes, the exported labels look like:
    class x_center y_center width height
    0 0.16713 0.787696 0.207783 0.287495

I'm sure there are many apps in the Supervisely ecosystem that can help solve your tasks.

github-actions[bot] commented 1 year ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the links below:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

SkalskiP commented 1 year ago

Hi 👋🏻 I'm probably late to the party, but you can convert between formats with supervision.

import supervision as sv

sv.DetectionDataset.from_coco(
    images_directory_path='...',
    annotations_path='...',
    force_masks=True
).as_yolo(
    images_directory_path='...',
    annotations_directory_path='...',
    data_yaml_path='...'
)

glenn-jocher commented 1 year ago

@SkalskiP thanks for sharing your solution! We appreciate your input and contribution to the YOLOv5 community. Your code snippet using the supervision library looks like a handy tool for converting between formats. It's great to see different approaches to tackling the problem. Keep up the good work!

lonngxiang commented 10 months ago


Does it support segmentation?

glenn-jocher commented 10 months ago

@lonngxiang yes, the supervision utility provides support for converting segmentation annotations in addition to detection annotations. You can use the force_masks=True argument in the from_coco method to ensure that masks are enforced during the conversion process. This enables seamless conversion between different annotation formats, allowing you to work with segmentation annotations as well.

ryouchinsa commented 10 months ago

We updated our general_json2yolo.py script so that the RLE mask with holes can be converted to the YOLO segmentation format correctly. https://github.com/ultralytics/ultralytics/issues/917#issuecomment-1821375321

glenn-jocher commented 10 months ago

@ryouchinsa thank you for sharing the update to the general_json2yolo.py script! 💻 It's great to see the community working together to improve the conversion process for RLE masks with holes to the YOLO segmentation format. Your contribution will definitely benefit others who are facing similar challenges. Keep up the great work! If you have any further improvements or insights, feel free to share them.

lonngxiang commented 10 months ago


Thanks. I tried using this dataset, which has 1 label: https://universe.roboflow.com/naumov-igor-segmentation/car-segmetarion

But when I convert COCO to YOLO with this script, I get 2 labels, and I don't know why:

import supervision as sv

sv.DetectionDataset.from_coco(
    images_directory_path= r"C:\Users\loong\Downloads\Car\valid",
    annotations_path=r"C:\Users\loong\Downloads\Car\valid\_annotations.coco.json",
    force_masks=True
).as_yolo(
    images_directory_path=r"C:\Users\loong\Downloads\Car_yolo\val\images",
    annotations_directory_path=r"C:\Users\loong\Downloads\Car_yolo\val\labels",
    data_yaml_path=r"C:\Users\loong\Downloads\Car_yolo\data.yaml"
)


glenn-jocher commented 10 months ago

@lonngxiang it looks like the issue might be related to the conversion process. One possibility is that the COCO dataset includes multiple categories, leading to the creation of multiple labels during the conversion. You may want to review the original COCO annotations and ensure that only the desired category (in this case, "car") is included.

Additionally, you might want to inspect the annotations.coco.json file to confirm the structure and contents of the annotations. This can help identify any unexpected data that might be causing the extra labels to appear during the conversion process.

Feel free to reach out if you have further questions or need additional assistance!
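A quick way to run that check (a sketch; summarize_coco is my own helper name, and the path is a placeholder):

```python
import json
from collections import Counter

def summarize_coco(json_path):
    """Print each category with its annotation count to spot unexpected extra classes."""
    with open(json_path) as f:
        coco = json.load(f)
    counts = Counter(ann['category_id'] for ann in coco.get('annotations', []))
    for cat in coco.get('categories', []):
        print(cat['id'], cat['name'], '->', counts.get(cat['id'], 0), 'annotations')
    return counts

# summarize_coco(r"C:\Users\loong\Downloads\Car\valid\_annotations.coco.json")
```

One thing worth checking (an assumption, not something confirmed in this thread): some COCO exports include a parent/supercategory entry in the categories list with no annotations of its own, which can surface as an extra class after conversion.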

lonngxiang commented 10 months ago


Thanks, but how do I use this script with this downloaded dataset: https://universe.roboflow.com/naumov-igor-segmentation/car-segmetarion?

glenn-jocher commented 10 months ago

@lonngxiang i understand your question, but as an open-source contributor, I am unable to guide you on using specific third-party datasets, such as the one from Roboflow, as I am not associated with them. I recommend referencing the documentation or support resources provided by Roboflow for guidance on using their datasets with the supervision library for conversion. If you encounter any specific issues related to YOLOv5 or general conversion processes, I am here to assist. Additionally, feel free to consult the YOLOv5 documentation for further insights on dataset conversion.

lonngxiang commented 10 months ago


Thanks. I used the supervision library to convert the COCO segmentation format to the YOLO format. However, when I ran the Ultralytics command, the results were not as expected:

yolo segment train model=yolov8m-seg.yaml data=/mnt/data/loong/segmetarion/Car_yolo/data.yaml epochs=100

SkalskiP commented 10 months ago

Hi @lonngxiang 👋🏻, I'm the creator of Supervision. Have you been able to solve your conversion problem?

lonngxiang commented 10 months ago


Yes, but sv.DetectionDataset.from_coco().as_yolo() did not work for the YOLO segmentation format. I finally fixed it by using this method: https://github.com/ultralytics/JSON2YOLO/blob/master/general_json2yolo.py

glenn-jocher commented 10 months ago

@lonngxiang it’s great to hear that you found a solution! If you have any other questions or encounter more issues in the future, feel free to ask. We’re here to help. Good luck with your project!

ryouchinsa commented 9 months ago

Hi @SkalskiP, can supervision convert an annotation with multiple polygons in the COCO format to the YOLO segmentation format?

"annotations": [
{
    "area": 594425,
    "bbox": [328, 834, 780, 2250],
    "category_id": 1,
    "id": 1,
    "image_id": 1,
    "iscrowd": 0,
    "segmentation": [
        [495, 987, 497, 984, 501, 983, 500, 978, 498, 962, 503, 937, 503, 926, 532, 877, 569, 849, 620, 834, 701, 838, 767, 860, 790, 931, 803, 963, 802, 972, 846, 970, 896, 969, 896, 977, 875, 982, 847, 984, 793, 987, 791, 1001, 783, 1009, 785, 1022, 791, 1024, 787, 1027, 795, 1041, 804, 1059, 811, 1072, 810, 1081, 800, 1089, 788, 1092, 783, 1098, 784, 1115, 780, 1120, 774, 1123, 778, 1126, 778, 1136, 775, 1140, 767, 1140, 763, 1146, 767, 1164, 754, 1181, 759, 1212, 751, 1264, 815, 1283, 839, 1303, 865, 1362, 880, 1442, 902, 1525, 930, 1602, 953, 1640, 996, 1699, 1021, 1773, 1039, 1863, 1060, 1920, 1073, 1963, 1089, 1982, 1102, 2013, 1107, 2037, 1107, 2043, 1099, 2046, 1097, 2094, 1089, 2123, 1074, 2137, 1066, 2153, 1033, 2172, 1024, 2166, 1024, 2166, 1023, 2129, 1019, 2093, 1004, 2057, 996, 2016, 1000, 1979, 903, 1814, 860, 1727, 820, 1647, 772, 1547, 695, 1637, 625, 1736, 556, 1854, 495, 1986, 459, 2110, 446, 1998, 449, 1913, 401, 1819, 362, 1720, 342, 1575, 328, 1440, 335, 1382, 348, 1330, 366, 1294, 422, 1248, 437, 1222, 450, 1190, 466, 1147, 482, 1107, 495, 1076, 506, 1019, 497, 1016],
        [878, 2293, 868, 2335, 855, 2372, 843, 2413, 838, 2445, 820, 2497, 806, 2556, 805, 2589, 809, 2622, 810, 2663, 807, 2704, 793, 2785, 772, 2866, 742, 2956, 725, 3000, 724, 3013, 740, 3024, 757, 3029, 778, 3033, 795, 3033, 812, 3032, 812, 3046, 803, 3052, 791, 3063, 771, 3069, 745, 3070, 733, 3074, 719, 3077, 702, 3075, 680, 3083, 664, 3082, 631, 3072, 601, 3061, 558, 3058, 553, 3039, 558, 3023, 566, 3001, 568, 2983, 566, 2960, 572, 2912, 571, 2859, 567, 2781, 572, 2698, 576, 2643, 583, 2613, 604, 2568, 628, 2527, 637, 2500, 636, 2468, 629, 2445, 621, 2423, 673, 2409, 726, 2388, 807, 2344, 878, 2293]
    ]
}],

YoungjaeDev commented 7 months ago

@ryouchinsa I analyzed the json2yolo conversion code and am curious about the principle. Why do you split it into the k=0 and k=1 passes? In the end, the first and last points of each unit polygon are the same, so we can check which instance it is, but can you tell me why we need to do a forward and a backward pass over a unit polygon?

for k in range(2):
    # forward connection
    if k == 0:
        # idx_list: [[5], [12, 0], [7]]
        for i, idx in enumerate(idx_list):
            # middle segments have two indexes;
            # reverse the index of middle segments
            # (all segments except the first and the last have two indexes:
            # idx_list = [[p], [p, q], [p, q], ..., [q]])
            if len(idx) == 2 and idx[0] > idx[1]:
                idx = idx[::-1]
                # segments[i]: (N, 2)
                segments[i] = segments[i][::-1, :]

            segments[i] = np.roll(segments[i], -idx[0], axis=0)
            segments[i] = np.concatenate([segments[i], segments[i][:1]])
            # deal with the first segment and the last one
            if i in [0, len(idx_list) - 1]:
                s.append(segments[i])
            else:
                idx = [0, idx[1] - idx[0]]
                s.append(segments[i][idx[0] : idx[1] + 1])

    else:
        # backward connection
        for i in range(len(idx_list) - 1, -1, -1):
            if i not in [0, len(idx_list) - 1]:
                idx = idx_list[i]
                nidx = abs(idx[1] - idx[0])
                s.append(segments[i][nidx:])
return s

YoungjaeDev commented 7 months ago

@ryouchinsa

Well, in the end it depends on how the training code parses it, but I'm curious whether that method is efficient.

ryouchinsa commented 7 months ago

Hi @youngjae-avikus,

For example, say we are merging 3 polygons into a single polygon. To connect the 3 polygons with zero-width connecting lines, the code makes both a forward and a backward scan to append all the points.

k == 0 (forward): append points [0, 1, 2, 0] of the left polygon, [0, 1, 2] of the center polygon, and [0, 1, 2, 3, 4, 0] of the right polygon.

k == 1 (backward): append points [2, 3, 0] of the center polygon.

If you have any questions, please let us know.
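The rotation step in that merge relies on np.roll: each polygon is rotated so its connection point comes first, then the first point is re-appended to close the ring. A small demo (the shape and connection index are illustrative, not from the actual script):

```python
import numpy as np

# A square's 4 vertices; suppose the connection point is at index 2.
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]])
idx = 2
rolled = np.roll(square, -idx, axis=0)          # rotate so vertex 2 comes first
closed = np.concatenate([rolled, rolled[:1]])   # re-append the first point to close the ring
print(closed.tolist())
# → [[1, 1], [0, 1], [0, 0], [1, 0], [1, 1]]
```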


glenn-jocher commented 6 months ago

@ryouchinsa looks correct to me

ryouchinsa commented 6 months ago

@glenn-jocher, thanks for reviewing my explanation.

glenn-jocher commented 6 months ago

@ryouchinsa you're welcome! If you have any more questions or need further assistance, feel free to reach out. Happy to help!