ultralytics / ultralytics

Ultralytics YOLO11 🚀
https://docs.ultralytics.com
GNU Affero General Public License v3.0

YOLOv8 with SAM for auto segmentation task #2075

Closed akashAD98 closed 1 year ago

akashAD98 commented 1 year ago

Search before asking

Question

I have developed a script that loads a custom object detection model and generates segmentations automatically; this auto-segmentation can save a lot of time.

here is the repo: https://github.com/akashAD98/YOLOV8_SAM

Additional

No response

akashAD98 commented 1 year ago

Basically, I'm getting the segmentation mask output and have tried to convert it into normalized YOLO format, but I'm hitting one problem:

I'm getting extra annotation lines.


I opened it in Roboflow to visualize the mask.

The script I'm using is:

from PIL import Image
import numpy as np

# define the segmentation mask

#mask = [631, 1280, 630, 1281, 629, 1281, 628, 1282, 626, 1282, 625, 1283, 622, 1283, 621, 1284, 619, 1284, 618, 1285, 615, 1285, 614, 1286, 612, 1286, 611, 1287, 609, 1287, 608, 1288, 607, 1288, 606, 1289, 604, 1289, 603, 1290, 602, 1290, 601, 1291, 599, 1291, 598, 1292, 596, 1292, 595, 1293, 593, 1293, 592, 1294, 590, 1294, 589, 1295, 587, 1295, 586, 1296, 584, 1296, 583, 1297, 579, 1297, 578, 1298, 576, 1298, 575, 1299, 574, 1299, 573, 1300, 571, 1300, 570, 1301, 569, 1301, 568, 1302, 566, 1302, 565, 1303, 563, 1303, 562, 1304, 561, 1304, 560, 1305, 558, 1305, 557, 1306, 555, 1306, 554, 1307, 552, 1307, 551, 1308, 548, 1308, 547, 1309, 544, 1309, 543, 1310, 541, 1310, 540, 1311, 538, 1311, 537, 1312, 535, 1312, 534, 1313, 533, 1313, 532, 1314, 530, 1314, 529, 1315, 526, 1315, 525, 1316, 524, 1316, 523, 1317, 522, 1317, 521, 1318, 520, 1318, 519, 1319, 518, 1319, 516, 1321, 515, 1321, 514, 1322, 512, 1322, 511, 1323, 510, 1323, 509, 1324, 507, 1324, 506, 1325, 504, 1325, 503, 1326, 502, 1326, 501, 1327, 497, 1327, 496, 1328, 492, 1328, 491, 1329, 490, 1329, 489, 1330, 484, 1330, 483, 1331, 482, 1331, 479, 1334, 479, 1335, 478, 1336, 478, 1346, 479, 1347, 479, 1349, 480, 1350, 480, 1352, 482, 1354, 486, 1354, 487, 1355, 492, 1355, 493, 1354, 496, 1354, 497, 1353, 499, 1353, 501, 1351, 506, 1351, 507, 1352, 510, 1352, 511, 1351, 514, 1351, 515, 1350, 517, 1350, 518, 1349, 519, 1349, 520, 1348, 521, 1348, 522, 1347, 523, 1347, 524, 1346, 526, 1346, 527, 1345, 528, 1345, 529, 1344, 532, 1344, 533, 1343, 536, 1343, 537, 1342, 539, 1342, 540, 1341, 542, 1341, 543, 1340, 545, 1340, 546, 1339, 548, 1339, 549, 1338, 550, 1338, 551, 1337, 553, 1337, 554, 1336, 556, 1336, 557, 1335, 558, 1335, 559, 1334, 561, 1334, 562, 1333, 564, 1333, 565, 1332, 567, 1332, 568, 1331, 570, 1331, 571, 1330, 573, 1330, 574, 1329, 575, 1329, 576, 1328, 579, 1328, 580, 1327, 582, 1327, 583, 1326, 584, 1326, 585, 1325, 587, 1325, 588, 1324, 589, 1324, 590, 1323, 591, 1323, 592, 1322, 595, 1322, 
596, 1321, 598, 1321, 599, 1320, 601, 1320, 602, 1319, 605, 1319, 606, 1318, 609, 1318, 610, 1317, 612, 1317, 613, 1316, 616, 1316, 617, 1315, 618, 1315, 619, 1314, 621, 1314, 622, 1313, 623, 1313, 625, 1311, 626, 1311, 627, 1310, 628, 1310, 629, 1309, 631, 1309, 632, 1308, 634, 1308, 635, 1307, 638, 1307, 639, 1306, 641, 1306, 642, 1305, 644, 1305, 645, 1304, 646, 1304, 647, 1303, 648, 1303, 650, 1301, 651, 1301, 652, 1300, 653, 1300, 654, 1299, 655, 1299, 658, 1296, 658, 1293, 659, 1292, 659, 1290, 656, 1287, 655, 1287, 654, 1286, 653, 1286, 652, 1285, 651, 1285, 650, 1284, 649, 1284, 648, 1283, 646, 1283, 645, 1282, 644, 1282, 643, 1281, 641, 1281, 640, 1280]
# load the image

mask = segmentation  # flat [x1, y1, x2, y2, ...] polygon list from SAM (see the commented example above)
img = Image.open("/content/Cigaretee_i_Weapon_Pragati137.jpg")
width, height = img.size

# convert mask to numpy array of shape (N,2)
mask = np.array(mask).reshape(-1,2)

# normalize the pixel coordinates
mask_norm = mask / np.array([width, height])

# compute the bounding box
xmin, ymin = mask_norm.min(axis=0)
xmax, ymax = mask_norm.max(axis=0)
bbox_norm = np.array([xmin, ymin, xmax, ymax])

# concatenate bbox and mask to obtain YOLO format
yolo = np.concatenate([bbox_norm, mask_norm.reshape(-1)])
print(yolo)
glenn-jocher commented 1 year ago

@akashAD98 it seems that you are getting extra annotation lines when converting a segmentation mask to YOLO format with the script provided. One possibility is that the segmentation mask contains duplicated points that are causing the extra lines. It could also be a formatting issue, so I recommend checking the formatting requirements for YOLOv8. Additionally, you may find relevant information in the Ultralytics YOLOv8 documentation on how to convert segmentation masks to YOLO format.
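One way to test the duplicated-points possibility mentioned above is to drop consecutive repeats before conversion. A minimal sketch (the function name is illustrative, not an Ultralytics API):

```python
import numpy as np

def drop_repeated_points(mask):
    """Remove consecutive duplicate (x, y) pairs from a flat polygon list."""
    pts = np.array(mask, dtype=float).reshape(-1, 2)
    keep = np.ones(len(pts), dtype=bool)
    # True wherever a point differs from its predecessor
    keep[1:] = np.any(pts[1:] != pts[:-1], axis=1)
    return pts[keep].reshape(-1).tolist()

print(drop_repeated_points([5, 5, 5, 5, 8, 5, 8, 8]))  # [5.0, 5.0, 8.0, 5.0, 8.0, 8.0]
```

If the output is shorter than the input, the mask did contain duplicated points.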

akashAD98 commented 1 year ago

@glenn-jocher thanks for the reply. I looked through the docs, but I'm not finding any script for the conversion; it would be a great help if you could provide a reference.

I'm getting exactly the coordinates required for the YOLOv8 segmentation format. Thanks.

glenn-jocher commented 1 year ago

@akashAD98, you're welcome. I'm glad that the information in our documentation helped you. Converting segmentation masks to YOLO format involves normalizing the pixel coordinates, computing the bounding box, and concatenating the normalized bounding box and mask coordinates into a single array. It's important to ensure that the coordinates are properly formatted and that there are no extra annotation lines. As always, I encourage you to reach out to the Ultralytics community for any further assistance.

akashAD98 commented 1 year ago

@glenn-jocher can you tell me whether my bounding-box computation is correct? What exactly does YOLO expect?

Segmentation: [934, 162, 933, 163, 932, 163, 932, 164, 931, 165, 931, 168, 930, 169, 930, 172, 929, 173, 929, 176, 928, 177, 928, 178, 927, 179, 927, 180, 926, 181, 926, 183, 925, 184, 925, 186, 924, 187, 924, 190, 923, 191, 923, 192, 922, 193, 922, 195, 921, 196, 921, 198, 920, 199, 920, 201, 919, 202, 919, 204, 918, 205, 918, 207, 917, 208, 917, 210, 916, 211, 916, 214, 915, 215, 915, 217, 914, 218, 914, 221, 913, 222, 913, 225, 912, 226, 912, 229, 911, 230, 911, 232, 910, 233, 910, 236, 909, 237, 909, 238, 908, 239, 908, 240, 906, 242, 906, 243, 905, 244, 905, 245, 903, 247, 903, 248, 902, 249, 902, 251, 900, 253, 900, 254, 899, 255, 899, 256, 898, 257, 898, 263, 897, 264, 897, 271, 896, 272, 896, 275, 895, 276, 895, 279, 894, 280, 894, 282, 893, 283, 893, 286, 892, 287, 892, 290, 891, 291, 891, 301, 892, 302, 892, 304, 893, 305, 893, 306, 894, 307, 894, 308, 896, 310, 896, 311, 903, 318, 904, 318, 905, 319, 907, 319, 909, 317, 909, 316, 911, 314, 911, 313, 912, 312, 912, 311, 913, 310, 913, 309, 914, 308, 914, 307, 915, 306, 915, 305, 917, 303, 917, 302, 918, 301, 918, 299, 919, 298, 919, 295, 920, 294, 920, 292, 921, 291, 921, 289, 922, 288, 922, 286, 923, 285, 923, 284, 924, 283, 924, 281, 925, 280, 925, 278, 926, 277, 926, 274, 927, 273, 927, 271, 928, 270, 928, 269, 929, 268, 929, 267, 930, 266, 930, 264, 931, 263, 931, 262, 932, 261, 932, 259, 933, 258, 933, 256, 934, 255, 934, 252, 935, 251, 935, 248, 936, 247, 936, 246, 937, 245, 937, 244, 938, 243, 938, 242, 939, 241, 939, 239, 940, 238, 940, 236, 941, 235, 941, 233, 942, 232, 942, 231, 943, 230, 943, 229, 944, 228, 944, 227, 945, 226, 945, 224, 946, 223, 946, 220, 947, 219, 947, 217, 948, 216, 948, 214, 949, 213, 949, 210, 950, 209, 950, 207, 951, 206, 951, 204, 952, 203, 952, 201, 953, 200, 953, 198, 954, 197, 954, 193, 955, 192, 955, 191, 956, 190, 956, 188, 957, 187, 957, 183, 958, 182, 958, 170, 957, 169, 957, 168, 953, 164, 952, 164, 951, 163, 948, 163, 947, 162]
    mask = segmentation  # the flat polygon list shown above

    # load the image
    #width, height = image_path.size
    img = Image.open(image_path)
    width, height = img.size

    # convert mask to numpy array of shape (N,2)
    mask = np.array(mask).reshape(-1,2)

    # normalize the pixel coordinates
    mask_norm = mask / np.array([width, height])

    # compute the bounding box
    xmin, ymin = mask_norm.min(axis=0)
    xmax, ymax = mask_norm.max(axis=0)
    bbox_norm = np.array([xmin, ymin, xmax, ymax])

    # concatenate bbox and mask to obtain YOLO format
    yolo = np.concatenate([bbox_norm, mask_norm.reshape(-1)])
akashAD98 commented 1 year ago

@glenn-jocher @Laughing-q @AyushExel can you please tell me the correct way to convert into the YOLO mask format?

Why am I getting extra bboxes and lines?

    mask = [934, 162, 933, 163, 932, 163, 932, 164, 931, 165, 931, 168, 930, 169, 930, ...]  # truncated

    # load the image
    #width, height = image_path.size
    img = Image.open(image_path)
    width, height = img.size

    # convert mask to numpy array of shape (N,2)
    mask = np.array(mask).reshape(-1,2)

    # normalize the pixel coordinates
    mask_norm = mask / np.array([width, height])

    # compute the bounding box
    xmin, ymin = mask_norm.min(axis=0)
    xmax, ymax = mask_norm.max(axis=0)
    bbox_norm = np.array([xmin, ymin, xmax, ymax])

    # concatenate bbox and mask to obtain YOLO format
    yolo = np.concatenate([bbox_norm, mask_norm.reshape(-1)])

    # write the yolo values to a text file
    with open('yolo_cigaNEW_137_MASKw.txt', 'w') as f:
        for val in yolo:
            f.write("{:.6f} ".format(val))
AyushExel commented 1 year ago

You can use this file. Just run it on the COCO-format dataset and it'll give you the YOLO-trainable dataset: https://github.com/ultralytics/JSON2YOLO/blob/master/general_json2yolo.py

akashAD98 commented 1 year ago

@AyushExel

    mask = [321, 123, 431, 123, ...]

    # load the image
    #width, height = image_path.size
    img = Image.open(image_path)
    width, height = img.size

    # convert mask to numpy array of shape (N,2)
    mask = np.array(mask).reshape(-1,2)

    # normalize the pixel coordinates
    mask_norm = mask / np.array([width, height])

    # compute the bounding box from the *pixel* coordinates; using
    # mask_norm here and then dividing by width/height again would
    # normalize twice and collapse the values toward zero
    xmin, ymin = mask.min(axis=0)
    xmax, ymax = mask.max(axis=0)
    bbox_norm = np.array([xmin, ymin, xmax - xmin, ymax - ymin], dtype=float)

    # normalize the bounding box by width and height
    bbox_norm[[0, 2]] /= width   # normalize x and w by width
    bbox_norm[[1, 3]] /= height  # normalize y and h by height

    # concatenate bbox and mask to obtain YOLO format
    yolo = np.concatenate([bbox_norm, mask_norm.reshape(-1)])

I tried a few parts of that JSON2YOLO script and checked; I'm still getting this issue, the last point is not touching the object.

AyushExel commented 1 year ago

@akashAD98 okay, not sure what's happening here, but the easiest solution would be to just remove the last segment point manually, because there is always one extra point that is causing the problem, right?
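The manual fix suggested above can also be scripted. A minimal sketch (the function name is mine) that drops a trailing point when it duplicates the first one, so the polygon has no zero-length closing segment:

```python
import numpy as np

def drop_duplicate_closing_point(mask):
    """Drop the last (x, y) pair if it repeats the first one.

    `mask` is a flat [x1, y1, x2, y2, ...] polygon list, as produced
    by the SAM-to-polygon step discussed in this thread.
    """
    pts = np.array(mask, dtype=float).reshape(-1, 2)
    if len(pts) > 1 and np.array_equal(pts[0], pts[-1]):
        pts = pts[:-1]  # remove the redundant closing point
    return pts.reshape(-1).tolist()

# a square whose last point repeats the first
square = [0, 0, 10, 0, 10, 10, 0, 10, 0, 0]
print(drop_duplicate_closing_point(square))  # [0.0, 0.0, 10.0, 0.0, 10.0, 10.0, 0.0, 10.0]
```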

akashAD98 commented 1 year ago

@AyushExel I tried it and I'm still getting this line, even though you can clearly see the mask of the object is complete here.

I also tried removing more points; still the same. Is it a Roboflow problem or a problem with my code? The whole mask is drawn correctly, but the last point is causing the issue. How can I verify whether my annotations are correct?

Is there any way to close this mask? The code is also small; can you check it in your free time?

glenn-jocher commented 1 year ago

@akashAD98, it seems that the issue with the segmentation mask is that the last point is not connected to the first point. This could be causing the extra line in the YOLOv8 format. One solution is to manually remove the last point or try to connect the last point to the first point such that the mask is a closed polygon. Alternatively, you can verify your annotations by plotting the segmentation mask on top of the original image and manually inspecting the mask for accuracy. You can also try using other annotation tools or re-annotate the object to see if the issue persists.
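The visual check described above can be done with PIL's ImageDraw, which closes the polygon outline automatically, so a gap between the last and first point shows up immediately. A hedged sketch (file names are placeholders; a blank canvas stands in for the real image):

```python
from PIL import Image, ImageDraw

def overlay_polygon(img, mask, outline=(255, 0, 0)):
    """Draw a flat [x1, y1, x2, y2, ...] polygon onto a copy of `img`."""
    out = img.copy().convert("RGB")
    draw = ImageDraw.Draw(out)
    pts = list(zip(mask[0::2], mask[1::2]))  # [(x1, y1), (x2, y2), ...]
    draw.polygon(pts, outline=outline)
    return out

# stand-in for Image.open(image_path) from the snippets above
canvas = Image.new("RGB", (100, 100), "black")
preview = overlay_polygon(canvas, [10, 10, 90, 10, 50, 80])
preview.save("mask_check.png")  # inspect this file by eye
```

If the overlaid outline matches the object boundary with no stray segment, the annotation itself is fine and the problem is in the label formatting.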

akashAD98 commented 1 year ago

@glenn-jocher thanks. Is my code correct or not? Is the problem with the code or with the SAM output?

glenn-jocher commented 1 year ago

@akashAD98 the code that you have provided seems to be correct and following the standard steps for converting a segmentation mask to YOLO format. However, there might be an issue with the segmentation mask where the last point is not connected to the first point, which is causing the problem. It is advisable to verify your annotations for accuracy and completeness, and if needed, manually remove the last point or try to connect the last point with the first point to make the mask a closed polygon. This will ensure that the YOLO format is generated without any extra lines or issues.

Laughing-q commented 1 year ago

Is there any way to close this mask? The code is also small; can you check it in your free time?

@akashAD98 can you share a complete, reproducible code block we can run directly by copy-paste? i.e., the code that saves the mask and how you plot it.

akashAD98 commented 1 year ago

@Laughing-q here is all the code in the Colab: https://github.com/akashAD98/YOLOV8_SAM/blob/main/YOLOV8_SAM_Single_MultiObject.ipynb

Please let me know what's wrong with it. Thanks a lot for the help.

akashAD98 commented 1 year ago

Tested on multiple images:

glenn-jocher commented 1 year ago

@akashAD98 your results from the multiple images look promising! As for the issue with the segmentation masks, it seems that the problem lies with the creation of the mask itself rather than the code for the conversion of the mask to YOLO format. To debug this, you can try manually inspecting the mask and see where the problem lies. You can also try using a different annotation tool or re-annotate the object to see if the issue persists. Once you have verified the accuracy and completeness of the mask, you can use the code you have provided for converting the mask to YOLO format. Keep up the good work!

akashAD98 commented 1 year ago

@glenn-jocher apart from Roboflow, is there any other open-source tool that supports YOLO annotations?

I tried to verify, and I'm getting zero normalized coordinates, which is causing this issue.

Laughing-q commented 1 year ago

@akashAD98 hey, I think that's because you're also saving the bbox in your mask annotations, which we do not need. It should be:

yolo = mask_norm.reshape(-1)
akashAD98 commented 1 year ago

@Laughing-q Sorry, I'm not following you.


You want me to do it like this? I tried it, and the results were very weird:

glenn-jocher commented 1 year ago

@akashAD98 my apologies for the confusion. I was trying to say that while converting the mask to the YOLO format, it is important to extract the normalized pixel coordinates of the object from the segmentation mask without including unnecessary information such as bbox coordinates. The bbox coordinates are not required for the YOLO format and may affect the accuracy of the conversion. You can try to remove the bbox coordinates from your segmentation mask and then extract the normalized pixel coordinates for the YOLO format. Also, there are several open source annotation tools available such as LabelBox, VoTT, and LabelImg that support YOLO annotations.

akashAD98 commented 1 year ago

@glenn-jocher without the bbox I'm not getting correct output, which I shared in an earlier comment. I tried only:

yolo = mask_norm.reshape(-1)
glenn-jocher commented 1 year ago

@akashAD98, while creating segmentation masks for the YOLO format, you should only include the pixel coordinates of the object of interest and remove any additional information such as bounding box coordinates or annotations. The YOLO format requires normalized pixel coordinates and including unwanted information may cause issues with the conversion process. However, it is essential to make sure that the segmentation masks are accurate and complete to ensure high-quality results. Additionally, there are several open source annotation tools available that support YOLO annotations, such as LabelImg, LabelBox, and VoTT.
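The advice in the last few comments can be sketched end to end. A minimal, hedged example (the class index and coordinates are illustrative) that writes one YOLOv8 segmentation label line, `class x1 y1 x2 y2 ...` with normalized coordinates and no bbox values:

```python
import numpy as np

def mask_to_yolo_line(mask, width, height, class_id=0):
    """Return a YOLO segmentation label line: 'class x1 y1 x2 y2 ...'.

    Coordinates are normalized to [0, 1]; no bbox values are written,
    since YOLOv8 derives the box from the polygon itself.
    """
    pts = np.array(mask, dtype=float).reshape(-1, 2)
    pts_norm = pts / np.array([width, height])
    coords = " ".join(f"{v:.6f}" for v in pts_norm.reshape(-1))
    return f"{class_id} {coords}"

line = mask_to_yolo_line([100, 50, 200, 50, 150, 150], width=200, height=200)
with open("label.txt", "w") as f:  # one object per line in the label file
    f.write(line + "\n")
print(line)  # 0 0.500000 0.250000 1.000000 0.250000 0.750000 0.750000
```

This matches `yolo = mask_norm.reshape(-1)` from the comment above, with the class index prepended as the label format requires.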

SkalskiP commented 1 year ago

@akashAD98 👋🏻 here is the video I was talking about: https://youtu.be/oEQYStnF2l8 I finally made it 🔥

AyushExel commented 1 year ago

@akashAD98 I opened an issue on the repo: https://github.com/akashAD98/YOLOV8_SAM/issues/2. Just point me to the part that's causing problems. I'll take a look tomorrow, and let's try to get the entire labeler working over the weekend.

github-actions[bot] commented 1 year ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐