ultralytics / yolov5

YOLOv5 πŸš€ in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com

YOLOv5 Segmentation Overlapping Objects Annotation #11326

Closed · rukshankr closed this issue 1 year ago

rukshankr commented 1 year ago

Search before asking

Question

I am training YOLOv5 on a food-item dataset, but most food items largely overlap each other in the images.

Therefore I annotated the occluded food items with their full original shape, like this: [image]

Is this approach correct? I want to get segmentation masks that are as close to the original shape as possible. What else can I do to improve the accuracy of the predictions?

Additional

No response

github-actions[bot] commented 1 year ago

πŸ‘‹ Hello @rukshankr, thank you for your interest in YOLOv5 πŸš€! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a πŸ› Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.7.0 with all requirements.txt dependencies installed, including PyTorch>=1.7. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Notebooks with free GPU
Google Cloud Deep Learning VM
Amazon Deep Learning AMI
Docker Image

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 πŸš€

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 πŸš€!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

pip install ultralytics
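
A minimal usage sketch of the YOLOv8 Python API (the ultralytics package exposes a YOLO class; the model file and image path below are placeholder examples):

from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")  # load a pretrained YOLOv8 segmentation model
results = model("path/to/image.jpg")  # run inference; returns a list of Results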
glenn-jocher commented 1 year ago

@rukshankr, if you want to get segmentation masks that are as close to the original shape as possible, annotating occluded food images to their original shape is a valid approach. However, if the items are overlapping too much, you may want to consider separating these items into their own bounding boxes for better accuracy.

You may also want to try experimenting with different augmentation strategies during training, such as mixup or mosaic augmentation, to improve the model's ability to detect objects in overlapping situations. Additionally, it may be helpful to try training on a smaller subset of the dataset first to find the best model architecture and hyperparameters before training on the entire dataset.
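
As a hedged sketch of the augmentation suggestion: mosaic and mixup are standard keys in the YOLOv5 hyperparameter YAMLs (e.g. data/hyps/hyp.scratch-low.yaml, where mixup defaults to 0.0), so one way to experiment is to copy a hyp file, adjust those values, and pass it to training via --hyp. The file names and values below are illustrative:

import yaml

# Load a default hyperparameter file and enable mixup
with open("data/hyps/hyp.scratch-low.yaml") as f:
    hyp = yaml.safe_load(f)
hyp["mosaic"] = 1.0  # probability of applying mosaic augmentation
hyp["mixup"] = 0.1   # probability of applying mixup augmentation
with open("hyp.overlap.yaml", "w") as f:
    yaml.safe_dump(hyp, f)

Training then picks up the modified values with python segment/train.py --hyp hyp.overlap.yaml.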

rukshankr commented 1 year ago

@glenn-jocher thank you so much for the advice. I will try the methods that you have mentioned.

glenn-jocher commented 1 year ago

You're welcome, @rukshankr! Please let us know if you have any further questions or run into any issues during your training process. We wish you good luck with your project!

vtyw commented 1 year ago

@glenn-jocher I'm doing something very similar to this and running into problems. What we're doing is equivalent to labeling donuts like rukshankr's example, but adding an additional object which is the hole of the donut. The detected donut masks exclude the donut hole even though they are annotated to include everything. There is a similar issue with overlap: the masks produced generally do not overlap with other detected masks of different classes, even though the training data has them overlapping.

It seems that some part of yolov5 mask detection prevents masks from overlapping. I suspect this occurs both at training time (the raw detections already give me donut masks with holes, so the network must have been trained on masks with the holes punched out) and additionally in post-processing of detections, perhaps due to NMS.

vtyw commented 1 year ago

This is the same issue as #10433, about whether yolov5 can include the same pixel as part of more than one object when that's how it's annotated in the training data. I don't believe there is such a restriction in the network itself; it can definitely detect masks that overlap with other objects.

glenn-jocher commented 1 year ago

@vtyw hello! Thank you for bringing up your concerns about YOLOv5. You are correct that YOLOv5 is fundamentally capable of detecting masks that overlap with other objects. However, it is possible that the post-processing steps in the detection pipeline, such as NMS, may be causing the issue you are seeing.

You mentioned that objects with holes are being detected but the holes are being excluded from the mask. This could be because the network detects the hole region, but post-processing such as NMS removes some of the masks of the overlapping object. One possible solution you could try is to adjust the NMS threshold in the detection pipeline to allow for more overlap between masks.
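
For example (a sketch; --iou-thres is the NMS IoU threshold flag exposed by the YOLOv5 predict and val scripts, and the weights/source paths are placeholders):

python segment/predict.py --weights yolov5s-seg.pt --source path/to/images --iou-thres 0.7  # default is 0.45

A higher IoU threshold makes NMS less aggressive, so more overlapping detections survive.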

As you mentioned, there is an existing Issue #10433 in the YOLOv5 repo regarding this topic so I would suggest following any further discussion or developments in that thread as well.

Please let us know if you have any further questions or if there's anything more we can do to assist you!

vtyw commented 1 year ago

@glenn-jocher NMS operates per class, so it cannot be the cause of this issue.

I did some more exploring and here's what I found:

glenn-jocher commented 1 year ago

Hello @vtyw, thank you for your further exploration and sharing your findings about the YOLOv5 detection pipeline. We appreciate you taking the time to investigate and report your results in detail.

Based on your findings, it's clear that there are a few things to be aware of when training and evaluating a YOLOv5 model for datasets with overlapping object instances. Specifically, it's necessary to use the --no-overlap flag when training to allow the network to learn overlapping parts of objects. However, doing so may cause the mask metrics reported during training to be inflated by a few percent, which can be corrected by running val.py separately to obtain the true performance. It's also interesting to note that setting --mask-ratio to 1 can actually decrease the final mask metrics in certain situations.
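
For reference, a sketch of the commands involved (flag names as exposed by YOLOv5's segment/train.py and segment/val.py; the dataset and weights paths are placeholders):

python segment/train.py --data dataset.yaml --weights yolov5s-seg.pt --no-overlap --mask-ratio 1  # per-instance masks may overlap
python segment/val.py --data dataset.yaml --weights runs/train-seg/exp/weights/best.pt  # re-validate for unbiased mask metrics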

We appreciate you sharing your insights with the community and we hope that your findings will help others to achieve better results when using YOLOv5. If you have any further questions or if there's anything else we can do to assist you, please don't hesitate to let us know!

ryouchinsa commented 11 months ago

Using the script general_json2yolo.py, you can convert an RLE mask with holes to the YOLO segmentation format.

The RLE mask is converted to a parent polygon and a child polygon using cv2.findContours(). The parent polygon points are sorted in clockwise order and the child polygon points in counterclockwise order. The nearest point in the parent polygon and the nearest point in the child polygon are detected and connected by two narrow lines, so that the polygon with a hole can be saved in the YOLO segmentation format.

import cv2
import numpy as np
from pycocotools import mask

def is_clockwise(contour):
    # Shoelace-style sum over the contour edges; a negative sum is treated as clockwise here.
    value = 0
    num = len(contour)
    for i, point in enumerate(contour):
        p1 = contour[i]
        if i < num - 1:
            p2 = contour[i + 1]
        else:
            p2 = contour[0]
        value += (p2[0][0] - p1[0][0]) * (p2[0][1] + p1[0][1])
    return value < 0

def get_merge_point_idx(contour1, contour2):
    # Find the pair of indices (one point per contour) with the smallest squared distance.
    idx1 = 0
    idx2 = 0
    distance_min = -1
    for i, p1 in enumerate(contour1):
        for j, p2 in enumerate(contour2):
            distance = pow(p2[0][0] - p1[0][0], 2) + pow(p2[0][1] - p1[0][1], 2)
            if distance_min < 0 or distance < distance_min:
                distance_min = distance
                idx1 = i
                idx2 = j
    return idx1, idx2

def merge_contours(contour1, contour2, idx1, idx2):
    # Splice the child contour into the parent at the nearest points,
    # connecting them with two coincident edges (the narrow "bridge").
    contour = []
    for i in range(0, idx1 + 1):
        contour.append(contour1[i])
    for i in range(idx2, len(contour2)):
        contour.append(contour2[i])
    for i in range(0, idx2 + 1):
        contour.append(contour2[i])
    for i in range(idx1, len(contour1)):
        contour.append(contour1[i])
    return np.array(contour)

def merge_with_parent(contour_parent, contour):
    # Ensure the parent is clockwise and the child counterclockwise before merging.
    if not is_clockwise(contour_parent):
        contour_parent = contour_parent[::-1]
    if is_clockwise(contour):
        contour = contour[::-1]
    idx1, idx2 = get_merge_point_idx(contour_parent, contour)
    return merge_contours(contour_parent, contour, idx1, idx2)

def mask2polygon(image):
    # RETR_CCOMP gives a two-level hierarchy: outer contours and their holes.
    contours, hierarchies = cv2.findContours(image, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_TC89_KCOS)
    contours_approx = []
    for contour in contours:
        epsilon = 0.001 * cv2.arcLength(contour, True)
        contours_approx.append(cv2.approxPolyDP(contour, epsilon, True))

    # Keep top-level (parent) contours; empty placeholders keep indices aligned with the hierarchy.
    contours_parent = []
    for i, contour in enumerate(contours_approx):
        parent_idx = hierarchies[0][i][3]
        if parent_idx < 0 and len(contour) >= 3:
            contours_parent.append(contour)
        else:
            contours_parent.append([])

    # Merge each hole (child contour) into its parent contour.
    for i, contour in enumerate(contours_approx):
        parent_idx = hierarchies[0][i][3]
        if parent_idx >= 0 and len(contour) >= 3:
            contour_parent = contours_parent[parent_idx]
            if len(contour_parent) == 0:
                continue
            contours_parent[parent_idx] = merge_with_parent(contour_parent, contour)

    polygons = []
    for contour in contours_parent:
        if len(contour) == 0:
            continue
        polygons.append(contour.flatten().tolist())
    return polygons

def rle2polygon(segmentation):
    # Uncompressed RLE (counts stored as a list) must first be compressed with frPyObjects.
    if isinstance(segmentation["counts"], list):
        segmentation = mask.frPyObjects(segmentation, *segmentation["size"])
    m = mask.decode(segmentation)
    m[m > 0] = 255
    return mask2polygon(m)
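
As a quick self-contained check of rle2polygon, one can build a toy mask with a hole, encode it with pycocotools, and convert it (a hedged sketch; the mask shape and values are arbitrary):

import numpy as np
from pycocotools import mask

# Toy binary mask: a filled square with a smaller square hole
m = np.zeros((20, 20), dtype=np.uint8)
m[2:18, 2:18] = 1  # outer square
m[8:12, 8:12] = 0  # punch a hole
rle = mask.encode(np.asfortranarray(m))  # compressed RLE, as stored in COCO annotations
print(rle2polygon(rle))  # one polygon tracing the square and its hole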

The RLE mask.

[screenshot]

The converted YOLO segmentation format.

[screenshot]

To run the script, put the COCO JSON file coco_train.json into datasets/coco/annotations, then run:

python general_json2yolo.py

The converted YOLO txt files are saved in new_dir/labels/coco_train.

[screenshot]

Edit use_segments and use_keypoints in the script.

if __name__ == '__main__':
    source = 'COCO'

    if source == 'COCO':
        convert_coco_json('../datasets/coco/annotations',  # directory with *.json
                          use_segments=True,
                          use_keypoints=False,
                          cls91to80=False)

To convert the COCO bbox format to YOLO bbox format:

use_segments=False,
use_keypoints=False,

To convert the COCO segmentation format to YOLO segmentation format:

use_segments=True,
use_keypoints=False,

To convert the COCO keypoints format to YOLO keypoints format:

use_segments=False,
use_keypoints=True,

This script originates from the Ultralytics JSON2YOLO repository. We hope this script helps with your projects.

glenn-jocher commented 11 months ago

@ryouchinsa thank you for sharing the detailed script and instructions for converting RLE masks to YOLO segmentation format. The method you've outlined for converting the masks using general_json2yolo.py seems comprehensive and should be helpful for those working with YOLO segmentation format in their projects.

It's great to see the community innovating to find solutions for specific use cases, and being able to convert data into the required format for YOLOv5 training is broadly useful.

It's important to note that while the provided script originated from the Ultralytics JSON2YOLO repository, the specific implementation for converting RLE masks with holes to the YOLO segmentation format, as described here, appears to be a custom addition.

Your contribution to the community is appreciated, and we hope others find the script useful for their projects. If you have any further insights or updates, please feel free to share with the community.

Thank you for your contribution!