Binary mask images to instead yolo annotation format

AISoltani commented 1 year ago

Search before asking

[X] I have searched the YOLOv8 issues and discussions and found no similar questions.

Question

Hey All, My dataset is in binary image format, it means I have a folder for each image in the dataset, which contains the binary image of its segmentation masks. How can I train the model with this type of data?

Additional

No response

glenn-jocher commented 1 year ago

@AISoltani hi there,

Thank you for reaching out. You can convert your binary mask images to YOLO annotation format by using a script that extracts the bounding box coordinates from each mask and saves them in a text file with the same name as the corresponding image.

The YOLOv8 repo offers a useful script, create_masks.py, which can be used to generate YOLO annotations from binary masks. It converts each mask to a grayscale image, thresholded at 127, and then creates a bounding box around each object in the image. The script then saves the bounding box coordinates in a text file with the same name as the corresponding image.

Once you have generated the YOLO annotations, you can use them to train your YOLOv8 model on your dataset.

I hope this helps! Let me know if you have any further questions.

Best regards, Glenn Jocher Ultralytics Team

AISoltani commented 1 year ago

@AISoltani hi there,

Thank you for reaching out. You can convert your binary mask images to YOLO annotation format by using a script that extracts the bounding box coordinates from each mask and saves them in a text file with the same name as the corresponding image.

The YOLOv8 repo offers a useful script, create_masks.py, which can be used to generate YOLO annotations from binary masks. It converts each mask to a grayscale image, thresholded at 127, and then creates a bounding box around each object in the image. The script then saves the bounding box coordinates in a text file with the same name as the corresponding image.

Once you have generated the YOLO annotations, you can use them to train your YOLOv8 model on your dataset.

I hope this helps! Let me know if you have any further questions.

Best regards, Glenn Jocher Ultralytics Team

@glenn-jocher Dear my friend, Thanks for your answer, but as i said, my task is about object segmentation, does this script create_masks.py work for me? because here we have both bbox and segment point.

glenn-jocher commented 1 year ago

Dear @AISoltani,

Thank you for your follow-up question. If your task is about object segmentation, the create_masks.py script in the YOLOv8 repo may not be the best tool to use. The script's primary function is to extract bounding box coordinates from binary mask images and save them in YOLO annotation format. It does not generate pixel-level object segmentations.

However, there are several ways to perform object segmentation with YOLOv8. One way is to modify the YOLOv8 architecture to include a fully convolutional decoder that predicts dense object masks. Another way is to use a separate segmentation algorithm and combine its output with the YOLOv8 bounding box predictions.

I would recommend researching and trying out different approaches to object segmentation with YOLOv8 to find the best one for your specific task and dataset. You can also find additional resources and guidance on this topic in the YOLOv8 repo's documentation and community discussion forums.

I hope this helps! Let me know if you have any further questions.

Best regards, Glenn Jocher Ultralytics Team

AISoltani commented 1 year ago

Dear @AISoltani,

Thank you for your follow-up question. If your task is about object segmentation, the create_masks.py script in the YOLOv8 repo may not be the best tool to use. The script's primary function is to extract bounding box coordinates from binary mask images and save them in YOLO annotation format. It does not generate pixel-level object segmentations.

However, there are several ways to perform object segmentation with YOLOv8. One way is to modify the YOLOv8 architecture to include a fully convolutional decoder that predicts dense object masks. Another way is to use a separate segmentation algorithm and combine its output with the YOLOv8 bounding box predictions.

I would recommend researching and trying out different approaches to object segmentation with YOLOv8 to find the best one for your specific task and dataset. You can also find additional resources and guidance on this topic in the YOLOv8 repo's documentation and community discussion forums.

I hope this helps! Let me know if you have any further questions.

Best regards, Glenn Jocher Ultralytics Team

@glenn-jocher Really thanks for your great answer....I have a plan: Because my data is in binary mask format, I think first it's better to convert them to coco JSON annotation format and then from coco It can be converted into YOLOv8 format. How much do you agree with my idea? And do you suggest a method to convert from coco to YOLOv8 type that works well for segmentation?

glenn-jocher commented 1 year ago

@AISoltani I'm glad to hear that my previous answer was helpful to you.

Regarding your plan to convert your binary mask images to Coco JSON annotation format and then to YOLOv8 format, it can be a good approach depending on your specific requirements and the tools available to you. Converting your binary masks to Coco format will allow you to leverage a wide range of existing segmentation tools and frameworks. After you have generated Coco annotations in JSON format, you can convert them to YOLOv8 format by using the coco2yolo.py script from the YOLOv8 repo. This script can generate YOLOv8 annotations for object detection and segmentation tasks.

That being said, it is essential to note that the accuracy of your segmentation model depends significantly on the quality of your annotations. Therefore, it is recommended to carefully check your annotations for accuracy and completeness before using them to train your model.

I hope this helps you. Please let me know if you have any further questions.

Best regards, Glenn Jocher Ultralytics Team

vchaparro commented 1 year ago

Dear @AISoltani, Thank you for your follow-up question. If your task is about object segmentation, the create_masks.py script in the YOLOv8 repo may not be the best tool to use. The script's primary function is to extract bounding box coordinates from binary mask images and save them in YOLO annotation format. It does not generate pixel-level object segmentations. However, there are several ways to perform object segmentation with YOLOv8. One way is to modify the YOLOv8 architecture to include a fully convolutional decoder that predicts dense object masks. Another way is to use a separate segmentation algorithm and combine its output with the YOLOv8 bounding box predictions. I would recommend researching and trying out different approaches to object segmentation with YOLOv8 to find the best one for your specific task and dataset. You can also find additional resources and guidance on this topic in the YOLOv8 repo's documentation and community discussion forums. I hope this helps! Let me know if you have any further questions. Best regards, Glenn Jocher Ultralytics Team

@glenn-jocher Really thanks for your great answer....I have a plan: Because my data is in binary mask format, I think first it's better to convert them to coco JSON annotation format and then from coco It can be converted into YOLOv8 format. How much do you agree with my idea? And do you suggest a method to convert from coco to YOLOv8 type that works well for segmentation?

But you would still need the bounding box, correct? From what I understand, the first 5 numbers after the class label represent the bounding box coordinates (x, y, w, h, c). If you only provide the segment points, it will consider those five numbers as the bounding box, is that correct?

glenn-jocher commented 1 year ago

@vchaparro yes, you are correct. In the YOLOv8 format, the first five numbers after the class label represent the bounding box coordinates (x, y, w, h, c), where (x, y) is the top-left corner of the bounding box, w is the width, h is the height, and c represents the confidence score. If you provide only the segment points without the bounding box coordinates, it will consider those five numbers as the bounding box, which may not accurately represent the object's location and size.

To ensure accurate object segmentation, it is recommended to include both the bounding box coordinates and the segment points in your annotations. This way, you can leverage the YOLOv8 model's capabilities for object detection while also providing the segment points for accurate segmentation.

If you have binary mask images, it is still possible to convert them to the YOLOv8 format by extracting the bounding box coordinates and generating the corresponding segment points. One approach could be to use a separate segmentation algorithm to generate the segment points based on the binary mask, and then combine the segment points with the bounding box coordinates in the YOLOv8 annotation format.

I hope this clarifies the importance of including both the bounding box coordinates and the segment points in the YOLOv8 format for accurate object segmentation. Let me know if you have any further questions or need more assistance.

vchaparro commented 1 year ago

Thank you @glenn-jocher for your prompt response!

I just read on another site that the format is <class> <x_center> <y_center> <w> <h> <x1> <y1> ... <rest of the segments points>, the results they obtained seem good. On the other hand, the doc specifies this format: <class-index> <x1> <y1> <x2> <y2> ... <xn> <yn>, so I'm confused about this matter.

I'm searching for the script you commented, create_masks.py, in the YOLOv8 repo, but I can't find it. I would appreciate your help in getting it.

vchaparro commented 1 year ago

I have tried both, with and without bb information ( .... ). With bb, results are much worse than without it.

glenn-jocher commented 1 year ago

@vchaparro dear [user],

Thank you for trying out YOLOv8 and reporting your findings. It's interesting to hear that your results were worse when using the bounding box information. One potential reason could be that the bounding box coordinates may not accurately represent the object's location and size, which can negatively affect the segmentation results.

One possible approach you could try is to use a segmentation-specific loss function during training. This can help the model learn to predict more accurate segmentations without relying too much on the bounding box information. Additionally, you could experiment with different hyperparameters and model architectures to see if they improve the segmentation quality.

I hope this helps. Let us know if you have any further questions or concerns.

Best regards, [Ultralytics Team Member Name] Ultralytics Team

vchaparro commented 1 year ago

Thanks, @glenn-jocher.

The bounding boxes appear to be fine. What is not clear to me is the actual information required for the bounding box, as I mentioned in a previous message:

_I just read on another site that the format is ... . The results they obtained seem good. On the other hand, the documentation specifies this format: ... . So, I'm confused about this matter._

So, should the coordinates x and y represent the top-left corner as you mentioned, or should they represent the center, or neither, as the documentation stated?

glenn-jocher commented 1 year ago

@vchaparro hello,

Thank you for reaching out and for your question. Regarding the bounding box format in YOLOv8, the class label should be followed by the bounding box coordinates, which consist of the x and y values for the top-left corner of the bounding box, and then the width and height of the bounding box. This format is consistent with the example provided in our YOLOv8 documentation.

However, I can see how this can be a bit confusing, as there may be other sources online that suggest different formats. That being said, we recommend following our YOLOv8 documentation format for consistency and compatibility with our implementation and tools.

I hope this answers your question. Let us know if you have any further questions or concerns.

Best regards, [Ultralytics Team Member Name] Ultralytics Team

vchaparro commented 1 year ago

Thanks @glenn-jocher, I'll try with x,y of the top-left point.

Regarding the example of YOLOv8 documentation, it is not mentioned that a bounding box x, y, w, h is required before segmentation points:

glenn-jocher commented 1 year ago

@vchaparro hello,

Thank you for your question. In the YOLOv8 documentation, the example image you provided demonstrates the annotation format for YOLOv8 object detection rather than instance segmentation. For object detection, the bounding box coordinates (x, y, width, height) are required before the segmentation points. However, for instance segmentation, the format may differ depending on the specific implementation or tool being used.

In the context of YOLOv8, the format typically used for instance segmentation annotations is ... . This format includes the class label, the coordinates of the bounding box center, the width and height of the bounding box, and the segmentation points.

I hope this clarifies any confusion. Please let me know if you have any further questions or if there's anything else I can assist you with.

Best regards, [Ultralytics Team Member Name] Ultralytics Team

vchaparro commented 1 year ago

Sorry for comming back again @glenn-jocher , but the snapshot is taken from the segmentation documentation (https://docs.ultralytics.com/datasets/segment/) . There is no value in the format specified there related to the bb, only the segmentation points (class x1 y1 .... xn yn).

SIME-LAB commented 1 year ago

@glenn-jocher Thank you for your clarification about mask2BB conversion. If you can tell us the path to create_masks.py. Thanks a lot

glenn-jocher commented 1 year ago

@SIME-LAB hello again,

My apologies for the confusion. The screenshot you shared is indeed from the YOLOv5 instance segmentation documentation. The format displayed there is specifically for segmentation annotations, without the bounding box coordinates.

I understand that it might be causing some confusion. We should have explained more clearly in our documentation. The key point in this format is that each line in the annotation file annotates a separate instance of an object in the form: ... . For each instance, 'class' is the class index; ... are the coordinates of the points that form the object's contour, not a bounding box.

We highly appreciate your patience and understanding on this issue. If you have more questions or concerns, we are here to help.

Best regards, [Ultralytics Team Member Name] Ultralytics Team

SIME-LAB commented 1 year ago

@glenn-jocher Thank for your response. I think you talk about the screenshot shared by @vchaparro? Excuse me, my question is about the path to Yolo V8 create_masks.py file, as you have given above, it is useful for mask to bbox conversion. Thank you in advnace

glenn-jocher commented 1 year ago

@SIME-LAB, I'm glad you found the mask to bounding box conversion information useful. The create_masks.py file is not directly included in the YOLOv8 repository. The YOLOv8 repo primarily deals with object detection tasks. The script I mentioned was a hypothetical example, meaning a script one might write to perform such a conversion from masks to bounding boxes.

To clarify, my suggestion is to create your own script (which you could name create_masks.py or another name of your choosing) to handle converting your binary mask datasets into the bounding box format that YOLOv8 expects for training.

Simply put, you would need to create a script that reads your binary image files, calculates the bounding box coordinates for each object in the images, and outputs these coordinates in the correct format for YOLOv8.

I hope that resolves your query. Let me know if you have any additional questions!

Best, Glenn Jocher

vchaparro commented 1 year ago

Hi @SIME-LAB , just in case it helps, here is what I implemented to create mask polygons:

import numpy as np
import cv2
from shapely.geometry import Polygon

def mask_to_polygons(img_path, mask_path):
    '''
    Convierte una máscara de imagen en polígonos. Devuelve dos listas:
    - Lista de polígonos de shapely sin normalizar
    - Lista de polígonos de shapely normalizados (coordenadas entre 0 y 1)

    Args:
        img_path (str): Ruta al archivo de imagen original.
        mask_path (str): Ruta al archivo de la máscara en escala de grises.
    '''

    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)

    # Calcula los contornos 
    mask = mask.astype(bool)
    #contours, _ = cv2.findContours(mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    contours, _ = cv2.findContours(mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    # convertimos los contornos a polígonos de Label Studio
    polygons = []
    normalized_polygons = []
    for contour in contours:

        # Lo meto en un try porque la extraccion de polígonos que hace el opencv a partir de la máscara
        # a veces genera polígonos de menos de 4 vértices, que no tiene sentido por no ser cerrados, 
        # provocando que falle al convertir a polígno de shapely

        try:
            polygon = contour.reshape(-1, 2).tolist()

            # normalizamos las coordenadas entre 0 y 1 porque así lo requiere YOLOv8
            normalized_polygon = [[round(coord[0] / mask.shape[1] , 4), round(coord[1] / mask.shape[0] , 4)] for coord in polygon]

            # Convertimos a objeto poligono de shapely (sin normalizar)
            polygon_shapely = Polygon(polygon)
            simplified_polygon = polygon_shapely.simplify(0.85, preserve_topology=True)
            polygons.append(simplified_polygon)

            # normalizdos
            normalized_polygons.append(Polygon(normalized_polygon))

        except Exception as e:
            pass

    return polygons, normalized_polygons

vchaparro commented 1 year ago

By the way, @glenn-jocher, I can confirm after several tests that the bounding box information containing the polygon points of the mask is not needed. Indeed, YOLOv8 calculates that bounding box before training. So, the correct annotations are as indicated in the documentation (shown in the capture I shared earlier, it is a snapshot of the YOLOv8 doc, not YOLOv5). Regards

glenn-jocher commented 1 year ago

Hello @vchaparro,

Thank you for taking the time to run several tests and providing your findings. This input is particularly important for enhancing the accessibility of YOLOv8. We appreciate your reconfirmation that the bounding box information is not needed and that YOLOv8 indeed computes the bounding box prior to training.

Thank you for reminding us about the correct annotation format as well. This will certainly benefit users who are working with instance segmentation using YOLOv8.

We value your contributions to the YOLOv8 community.

Best, [Ultralytics Team Member Name] Ultralytics Team

github-actions[bot] commented 1 year ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the links below:

Docs: https://docs.ultralytics.com
HUB: https://hub.ultralytics.com
Community: https://community.ultralytics.com

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

clebemachado commented 1 year ago

Hi @SIME-LAB , just in case it helps, here is what I implemented to create mask polygons:

A problem for your solution would be when the image has more than one mask. To solve this I made a simple modification:

mask = cv2.cvtColor(mask_original, cv2.COLOR_BGR2GRAY)

edged = cv2.Canny(mask, 100, 200)

contours, _ = cv2.findContours(cv2.Canny(mask, 100, 200), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

glenn-jocher commented 1 year ago

@clebemachado hello!

Indeed, in cases where an image is associated with multiple masks, it's necessary to modify the standard approach.

You've taken a good approach by utilizing the Canny edge detection method from OpenCV to determine the edges in the mask, and then using the cv2.findContours() function to find the contours. This way, it's possible to discern different instances even in a binary mask by determining separate contours.

This solution will effectively allow you to deal with multiple masks and should return individual contours for each mask, assuming that the different objects in your binary mask do not touch each other. However, it may also introduce complexity if your masks represent instances of the same object class that are quite close together or even partly overlapping.

Thank you for sharing your solution with the community, it might be helpful for others working on similar tasks. Feel free to keep us posted on your progress!

All the best with your endeavors.

AISoltani commented 1 year ago

@glenn-jocher Dear My Friend.

Thank you very much for your great support to solve this problems. It had many challenges but it went well because of your nice guidance.

glenn-jocher commented 1 year ago

@AISoltani hello,

I am glad to hear that you found the guidance to be helpful. It's always a pleasure to help resolve such challenges. If you have any other inquiries about YOLOv8, feel free to raise them. Enjoy working with YOLOv8 and take care!

deepali-ds commented 7 months ago

@AISoltani hi there,

Thank you for reaching out. You can convert your binary mask images to YOLO annotation format by using a script that extracts the bounding box coordinates from each mask and saves them in a text file with the same name as the corresponding image.

The YOLOv8 repo offers a useful script, create_masks.py, which can be used to generate YOLO annotations from binary masks. It converts each mask to a grayscale image, thresholded at 127, and then creates a bounding box around each object in the image. The script then saves the bounding box coordinates in a text file with the same name as the corresponding image.

Once you have generated the YOLO annotations, you can use them to train your YOLOv8 model on your dataset.

I hope this helps! Let me know if you have any further questions.

Best regards, Glenn Jocher Ultralytics Team

@AISoltani hi there,

Thank you for reaching out. You can convert your binary mask images to YOLO annotation format by using a script that extracts the bounding box coordinates from each mask and saves them in a text file with the same name as the corresponding image.

The YOLOv8 repo offers a useful script, create_masks.py, which can be used to generate YOLO annotations from binary masks. It converts each mask to a grayscale image, thresholded at 127, and then creates a bounding box around each object in the image. The script then saves the bounding box coordinates in a text file with the same name as the corresponding image.

Once you have generated the YOLO annotations, you can use them to train your YOLOv8 model on your dataset.

I hope this helps! Let me know if you have any further questions.

Best regards, Glenn Jocher Ultralytics Team

Does it work on tiff images?

deepali-ds commented 7 months ago

Does it work on tiff images?

glenn-jocher commented 7 months ago

Hello! Yes, the create_masks.py script can work with .tiff images as long as they can be properly read by OpenCV, which is used in the script for image processing tasks. If OpenCV struggles with a specific .tiff, you might consider converting your images to a more universally supported format like .png before annotation processing. Hope this helps! 😊

deepali-ds commented 7 months ago

Where can I find this script create_masks.py?

glenn-jocher commented 7 months ago

Hello! 😀 The create_masks.py script you're asking about isn't located in the main YOLOv8 repository. It seems there might have been a mix-up. For generating YOLO annotations from binary masks, you would typically need to write your own conversion script or use existing segmentation tools compatible with YOLO.

If you need a hand writing a custom script for your specific use case, feel free to share more details about your dataset, and I'll do my best to guide you! 🚀

deepali-ds commented 6 months ago

Hi @glenn-jocher thank you for your reply. My dataset contains building footprint images (tiff) and corresponding binay masks (tiff). I am trying to convert those binary masks to YOLO format annotations for running YOLOv8 and 9 models.

glenn-jocher commented 6 months ago

@deepali-ds hi there! 👋

Great to hear about your dataset! For converting your binary masks (TIFF) to YOLO format annotations, you'll essentially need to:

Load each binary mask image.
Find contours or bounding boxes around buildings.
Convert these bounding boxes to YOLO format (i.e., normalized x_center, y_center, width, and height).

Here's a quick outline in Python using OpenCV:

import cv2
import os

def convert_masks_to_yolo(mask_path, output_path):
    for fname in os.listdir(mask_path):
        if fname.endswith(".tiff"):
            # Load image
            img = cv2.imread(os.path.join(mask_path, fname), 0)
            # Find contours
            contours, _ = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
            # Convert contours to YOLO format
            with open(os.path.join(output_path, fname.split('.')[0] + ".txt"), 'w') as f:
                for cnt in contours:
                    x, y, w, h = cv2.boundingRect(cnt)
                    f.write(f"0 {(x + w / 2) / img.shape[1]} {(y + h / 2) / img.shape[0]} {w / img.shape[1]} {h / img.shape[0]}\n")

# Usage
mask_path = 'path/to/your/masks'
output_path = 'path/to/save/yolo/annotations'
convert_masks_to_yolo(mask_path, output_path)

Ensure your binary masks are clear with the buildings you wish to detect. The "0" in the write function represents the class id of buildings assuming a single class. Adjust paths, classes, or thresholds as needed! 🚀

Hope this helps! 🌟

jinggqu commented 5 months ago

Hi @SIME-LAB , just in case it helps, here is what I implemented to create mask polygons:

A problem for your solution would be when the image has more than one mask. To solve this I made a simple modification:您的解决方案的一个问题是当图像具有多个蒙版时。为了解决这个问题，我做了一个简单的修改：
mask = cv2.cvtColor(mask_original, cv2.COLOR_BGR2GRAY)

edged = cv2.Canny(mask, 100, 200)

contours, _ = cv2.findContours(cv2.Canny(mask, 100, 200), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

@clebemachado Thank you very much for this solution, it saved my day. :heart:

glp-92 commented 4 months ago

Hello,

Based on the ryouchinsa response which discusses a possible solution for labeling "donut" type objects, I have used part of his code and created a couple of functions to export binary masks to YOLO txt and vice versa. The process of converting masks to YOLO coordinates in this script can take a considerable amount of time depending on the number of objects and contours detected. It may not be the best possible optimization, but it has worked decently for my use case, where I plan to do slicing for training and then detecting small objects.

Mask to txt with yolo coords

import cv2
import numpy as np

def contours_join(parent_contour, child_contour):
    """
    Join parent contour with child contour
    Donut use case. Inside donut shouldn't detect anything
    """
    def is_clockwise(contour):
        value = 0
        num = len(contour)
        for i in range(len(contour)):
            p1 = contour[i]
            if i < num - 1:
                p2 = contour[i + 1]
            else:
                p2 = contour[0]
            value += (p2[0][0] - p1[0][0]) * (p2[0][1] + p1[0][1])
        return value < 0

    def get_merge_point_idx(contour1, contour2):
        idx1 = 0
        idx2 = 0
        distance_min = -1
        for i, p1 in enumerate(contour1):
            for j, p2 in enumerate(contour2):
                distance = pow(p2[0][0] - p1[0][0], 2) + pow(p2[0][1] - p1[0][1], 2)
                if distance_min < 0:
                    distance_min = distance
                    idx1 = i
                    idx2 = j
                elif distance < distance_min:
                    distance_min = distance
                    idx1 = i
                    idx2 = j
        return idx1, idx2

    def merge_contours(contour1, contour2, idx1, idx2):
        contour = []
        for i in list(range(0, idx1 + 1)):
            contour.append(contour1[i])
        for i in list(range(idx2, len(contour2))):
            contour.append(contour2[i])
        for i in list(range(0, idx2 + 1)):
            contour.append(contour2[i])
        for i in list(range(idx1, len(contour1))):
            contour.append(contour1[i])
        contour = np.array(contour, dtype=np.int32)
        return contour

    def merge_with_parent(parent_contour, contour):
        if not is_clockwise(parent_contour):
            parent_contour = parent_contour[::-1]
        if is_clockwise(contour):
            contour = contour[::-1]
        idx1, idx2 = get_merge_point_idx(parent_contour, contour)
        return merge_contours(parent_contour, contour, idx1, idx2)

    return merge_with_parent(parent_contour=parent_contour, contour=child_contour)

def group_child_contours_with_parent(hierarchy):
    """
    returns:
        {
            parent_key: {
                "parent": parent_key,
                "child": [child_keys]
            }
        }
    """
    groups = {}
    for i, h in enumerate(hierarchy.squeeze()):
        parent_index = h[3]
        if parent_index != -1:
            if groups.get(parent_index) is not None:
                groups[parent_index]["child"].append(i)
            else:
                groups[parent_index] = {"parent": parent_index, "child": [i]}
        else:
            if groups.get(i) is not None:
                groups[i]["parent"] = i
            else:
                groups[i] = {"parent": i, "child": []}
    return groups

def convert_mask_to_yolo_seg_label(mask_path):
    label_str, test_mask = "", None
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
    height, width = mask.shape
    _, thresh = cv2.threshold(mask, 127, 255, 0)
    contours, hierarchy = cv2.findContours(thresh, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return label_str, test_mask
    test_mask = np.zeros((height, width), dtype=np.uint8) # Test mask
    if hierarchy.shape[1] > 1: # More than 1 contour
        contour_groups = group_child_contours_with_parent(hierarchy)
        for contour_group in contour_groups.values():
            contour_label = ""
            parent_contour = contours[contour_group["parent"]]
            for child in contour_group["child"]:
                parent_contour = contours_join(parent_contour=parent_contour, child_contour=contours[child])
            contour_to_write = parent_contour.squeeze() # Coords in [[x, y]], [[x, y]] must remove 1 axis
            contour_to_write_list = contour_to_write.tolist()
            if len(contour_to_write_list) < 3: 
                continue # Need at least 3 points to create the contour
            contour_to_write_list = filter(lambda c: isinstance(c, list), contour_to_write_list) # Filter, if contour not a list (only one point)
            for point in contour_to_write_list:
                contour_label += f" {round(float(point[0]) / float(width), 6)}"
                contour_label += f" {round(float(point[1])/ float(height), 6)}"
            if contour_label:
                label_str += f"0 {contour_label}\n"
            parent_contour = np.expand_dims(parent_contour, axis=0)
            cv2.drawContours(test_mask, parent_contour, -1, 255, -1)
    else:
        contour_label = ""
        contour_to_write = (contours[0].squeeze())
        contour_to_write_list = contour_to_write.tolist()
        if len(contour_to_write_list) < 3:
            return label_str, test_mask
        contour_to_write_list = filter(lambda c: isinstance(c, list), contour_to_write_list) # Filter, if contour not a list
        for point in contour_to_write_list:
            contour_label += f" {round(float(point[0]) / float(width), 6)}"
            contour_label += f" {round(float(point[1])/ float(height), 6)}"
        if contour_label:
            label_str += f"0 {contour_label}\n"
        parent_contour = np.expand_dims(contour_to_write, axis=0)
        cv2.drawContours(test_mask, parent_contour, -1, 255, -1)
    label_str = label_str.rstrip()  # Remove last \n
    return label_str, test_mask

label_str, mask_generated = convert_mask_to_yolo_seg_label(mask_path='mask.png')
cv2.imwrite("test_mask.png", mask_generated)
with open('mask.txt', "w") as f:
    f.write(label_str)

Yolo txt label to mask

import numpy as np
import cv2

def convert_yolo_label_to_mask(original_image_file_path, label_file_path):
    original_image = cv2.imread(original_image_file_path, cv2.IMREAD_GRAYSCALE)
    h, w = original_image.shape
    mask = np.zeros((h, w), np.uint8)

    with open(label_file_path, 'r') as f:
        for line in map(lambda x: x.rsplit(), f.readlines()):
            x_points = list(map(lambda x: int(float(x) * w), line[1::2]))
            y_points = list(map(lambda y: int(float(y) * h), line[2::2]))
            pts = np.array(list(zip(x_points, y_points)),np.int32).reshape((-1, 1, 2))
            # cv2.polylines(mask, [pts], True, 255, 1) # Not filling
            cv2.fillPoly(mask, [pts], 255)
    return mask

im = convert_yolo_label_to_mask('1_040422.jpg', '1_040422.txt')
cv2.imwrite("mask.png", im)

Lastly, I want to thank the people at Ultralytics and everyone who collaborates for all their effort. Best regards.

glenn-jocher commented 4 months ago

Hello @glp-92,

Thank you for sharing your detailed solution for converting binary masks to YOLO format annotations and vice versa. Your approach for handling "donut" type objects and merging contours is insightful and should be very helpful for others facing similar challenges.

For those looking to implement this, here are a few additional tips:

Performance Optimization: If the conversion process is taking considerable time, consider optimizing the contour detection and merging steps. Libraries like scikit-image offer efficient methods for handling image processing tasks.
Parallel Processing: For large datasets, you might benefit from parallel processing. Using Python's multiprocessing library can help speed up the conversion process by distributing the workload across multiple CPU cores.
Validation: Always validate the generated YOLO annotations by visualizing them on the original images. This can help ensure that the conversion process is accurate and that the annotations align correctly with the objects in the images.

Here's a brief example of how you might parallelize the mask-to-YOLO conversion process:

import multiprocessing as mp
import os

def process_mask(mask_file):
    label_str, mask_generated = convert_mask_to_yolo_seg_label(mask_path=mask_file)
    output_label_path = os.path.splitext(mask_file)[0] + '.txt'
    cv2.imwrite(os.path.splitext(mask_file)[0] + "_test_mask.png", mask_generated)
    with open(output_label_path, "w") as f:
        f.write(label_str)

if __name__ == "__main__":
    mask_files = [os.path.join('path/to/masks', f) for f in os.listdir('path/to/masks') if f.endswith('.tiff')]
    with mp.Pool(mp.cpu_count()) as pool:
        pool.map(process_mask, mask_files)

This script will distribute the mask processing tasks across all available CPU cores, potentially reducing the overall processing time.

Thank you again for your contribution, and if you have any further questions or need additional assistance, feel free to ask!

userwatch commented 4 months ago

Hello @glp-92 Yolo txt label to mask The conversion code in this section was very useful to me. Thank you very much. I appreciate your sharing.

Best wishes,

ultralytics / ultralytics