ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

How to generate my yolov5-seg segmentation result(mask) as a png file? #10637

Closed Slayer-swjtu closed 1 year ago

Slayer-swjtu commented 1 year ago

Search before asking

Question

Thanks for your work. I ran segment/predict.py and got a prediction PNG with the mask and detection box drawn on it. I want to generate a binary image that shows my prediction result with the mask separated out. What should I do? I tried the --save-txt flag, but I cannot understand how to deal with that data.

Additional

No response

github-actions[bot] commented 1 year ago

👋 Hello @Slayer-swjtu, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://ultralytics.com or email support@ultralytics.com.

Requirements

Python>=3.7.0 with all requirements.txt installed including PyTorch>=1.7. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

iura77 commented 1 year ago

Hi all!

Have the same question!

Slayer-swjtu commented 1 year ago

Here are two solutions I thought of:

My method: use the data in the txt files, then use the cv2.fillPoly() function to draw my area of interest onto a blank canvas.

An easier method: hide the detection labels, turn off the detection boxes, and subtract the original image with cv::Mat: auto mask = result_png - ori_png;

iura77 commented 1 year ago

I am using cv2.drawContours() right now, but I don't think that is the easiest way this should work with YOLO. E.g., in PyTorch's deeplabv3_resnet150 it is almost an out-of-the-box option.

IamWu555 commented 1 year ago

Hi, did you solve the problem with a better solution?

glenn-jocher commented 1 year ago

@IamWu555 hello, thank you for reaching out!

To create a binary image of your predicted mask, you can try the following approach:

  1. First, threshold the predicted mask to create a binary image:
binary_mask = cv2.threshold(predicted_mask, 0.5, 1, cv2.THRESH_BINARY)[1]

Here, a threshold value of 0.5 is used. Pixels with values above 0.5 will be set to 1 and those below 0.5 will be set to 0.

  2. Next, multiply this binary mask with the original image to show only the pixels of interest:
binary_image = binary_mask * original_image

This will create an image where the predicted mask is separated from the original image, with the rest of the image blacked out.

I hope this solution works for you! Please let me know if you have any additional questions or concerns.

devendraswamy commented 1 year ago

@glenn-jocher @IamWu555 Can anyone help explain what "predicted_mask" is? I assume "predicted_mask" is the model output corresponding to each bounding box's respective mask.

glenn-jocher commented 1 year ago

@devendraswamy, thank you for your question!

In YOLOv5, "predicted_mask" refers to the segmentation mask generated by the model for each predicted object bounding box. The model predicts both the bounding box coordinates and a corresponding segmentation mask for each detected object.

The mask represents the pixel-level segmentation of the object, where each pixel is assigned a value (usually between 0 and 1) indicating the likelihood of that pixel belonging to the object. This predicted mask can be used to separate and extract the object of interest from the original image.

To obtain the predicted mask, you can use the output of the segmentation model in YOLOv5 and apply a binary threshold (e.g., using cv2.threshold) to convert the mask into a binary image, where pixels above a certain confidence threshold are considered part of the object and pixels below the threshold are considered background.

I hope this explanation clarifies the concept of the "predicted_mask". Let me know if you have any further questions or need more assistance!

ziyuuuuuu commented 1 year ago

Hi, I have to raise my hand here for help. I used this approach to binarize my YOLOv5 predicted images, but after binarization my picture turned out pure black. Here is my code snippet:

import cv2
import numpy as np

image_path = 'path to my predicted image'
image = cv2.imread(image_path)
mask_color = [255, 56, 56]  # FF3838 in RGB
binary_mask = np.all(image == mask_color, axis=-1).astype(np.uint8) * 255
thresholded_mask = cv2.threshold(binary_mask, 127, 255, cv2.THRESH_BINARY)[1]
ori_image_path = 'original image without prediction'
original_image = cv2.imread(ori_image_path)
masked_image = cv2.bitwise_and(original_image, original_image, mask=thresholded_mask)
cv2.imwrite('masked_output_image.png', masked_image)
from google.colab.patches import cv2_imshow
cv2_imshow(masked_image)

Thank you for helping!

ziyuuuuuu commented 1 year ago

If I need to first remove the bounding box and just keep the mask, maybe that would help, or maybe it is not necessary, because I have used another piece of code (shown below):

import cv2
import numpy as np

image_path = 'my image path'
image = cv2.imread(image_path)
mask_color = (int('38', 16), int('38', 16), int('FF', 16))
binary_mask = np.all(image == mask_color, axis=-1).astype(np.uint8)
cv2.imwrite('binary_mask.jpg', binary_mask * 255)
from google.colab.patches import cv2_imshow
cv2_imshow(binary_mask * 255)

The result is shown below: [error image] It seems like just the bounding box area is binarized, not the real mask area.

glenn-jocher commented 1 year ago

@ziyuuuuuu to remove the bounding box and keep only the segmentation mask in the YOLOv5 output, you can follow these steps:

  1. First, convert the color image to grayscale:
grayscale_img = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

  2. Next, apply a binary threshold to create a binary mask:
_, binary_mask = cv2.threshold(grayscale_img, 1, 255, cv2.THRESH_BINARY)

Here, we use a threshold value of 1 to separate the object from the background. You can adjust this threshold value based on your specific requirements.

  3. Finally, you can apply this binary mask to the original image to visualize the mask area:
masked_image = cv2.bitwise_and(image, image, mask=binary_mask)

The resulting masked_image will show only the pixels that correspond to the segmentation mask while the rest will be blacked out.

By following these steps, you should be able to remove the bounding box and generate a binary image that represents the actual mask area. Let me know if you have any further questions!

iura77 commented 1 year ago

Hi, I have to raise my hand here for help. I used this approach to binarize my YOLOv5 predicted images, but after binarization my picture turned out pure black

Hi ziyuuuuuu. Nice avatar!

Can explain how I have resolved this problem. First of all I have used yolo segmentation. After images were segmented, I read the segmentation coordinates and built the mask using the segmentation points. Here is my code:

import numpy as np
import cv2 as cv
import os
import shutil
from matplotlib import pyplot as plt

def read_jpg(jpg):
## Reading the image and taking the shape of it. 
    img = cv.imread(rf'/folder with images/{jpg}')  # f-string needed so {jpg} is substituted
    height = img.shape[0]
    width = img.shape[1]

    pts=[] ## empty list for mask

    jpg_num=jpg.split(".")[0] ## splitting the name of the image to make filename of the detected label
    with open(rf'/yolo folder/runs/predict-seg/exp/labels/{jpg_num}.txt') as txt:  # f-string needed here too
        txt_lines = txt.readlines()
        for line in txt_lines:
            try:
                line = line.split(" ")
                line=list(filter(lambda i: "." in i,line))
                pts_part = [] ## an empty list for partial mask
                for i in range(0, len(line), 2):
                    pts_part.append([int(float(line[i]) * width), int(float(line[i + 1]) * height)]) ## adding points from detection to partial list
                pts.append(pts_part) ## adding the partial list of points to the main list
            except:
                # if odd number of points
                pass
    return pts,img ##returning the image and the list of points for mask

## list of images
jpgs=os.listdir(r"/folder with images/")

for jpg in jpgs:
    pts,img=read_jpg(jpg)
    ## making a list of contour arrays (polygons can have different lengths,
    ## so keep a Python list rather than one ragged np.array)
    pts = [np.asarray(n, dtype=np.int32) for n in pts]

    ## making the mask
    mask = np.zeros(img.shape[:2], np.uint8)
    cv.drawContours(mask, pts, -1, (255, 255, 255), -1, cv.LINE_AA)

    ## applying the mask
    dst = cv.bitwise_and(img, img, mask=mask)

    ## write masked image in a folder
    cv.imwrite(os.path.join(r'/new folder/', jpg), dst)

    ## write the mask in a separate image
    cv.imwrite(os.path.join(r'/mask folder/', jpg), mask)

glenn-jocher commented 1 year ago

@iura77 hi ziyuuuuuu!

To generate binary masks from YOLOv5 predicted images, you can try the following approach:

  1. Read the prediction coordinates from the YOLOv5 labels. For each image, you can parse the corresponding .txt file to extract the coordinates of the detected objects.

  2. Build the mask. Use the coordinates to create a mask using OpenCV's cv2.drawContours function. This function allows you to draw the contours of each detected object on a blank image, effectively creating a mask.

  3. Apply the mask to the original image. Use the cv2.bitwise_and function to apply the mask to the original image. This will give you the masked image, where the masked area is preserved, and the rest is blacked out.

  4. Save the masked image. You can save the masked image to a separate folder using cv2.imwrite.

  5. Save the mask separately. If you want to save the mask as a separate image, you can do so by using cv2.imwrite and saving it to another folder.

You can find an example code snippet in my previous response that demonstrates how to implement the above steps. Make sure you adjust the file paths to match the locations of your images and labels.

I hope this helps! Let me know if you have any further questions.

ziyuuuuuu commented 1 year ago

Hi, thank you for the answer, but it seems like after turning my image into grayscale, it became harder to remove the bounding box from my image. I realize that I hadn't explained my purpose for doing this work. My goal is to remove the bounding box, then binarize the mask area into white and the other area into black, or vice versa. I have tried the code here and played around with the threshold, but the bounding box is quite 'strong', so it is still there. If possible, can we remove the bounding box before turning the picture gray? I have also tried contours, but it doesn't help much. Here is the code I use:

grayscale_img = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
from google.colab.patches import cv2_imshow
cv2_imshow(grayscale_img)

Thanks in advance

iura77 commented 1 year ago

@ziyuuuuuu

Are you sure you are using segmentation and NOT object detection?

glenn-jocher commented 1 year ago

@iura77

Based on your code and the images you provided, it seems like you are actually performing object detection rather than segmentation. In YOLOv5, the bounding box you are seeing is a fundamental part of object detection, and removing it completely while preserving the mask area can be challenging.

If your goal is to generate a binary mask with the object area as white and the background as black, you can try the following approach:

  1. Convert the original image to grayscale: grayscale_img = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

  2. Apply a binary threshold to create a binary mask: _, binary_mask = cv2.threshold(grayscale_img, 80, 255, cv2.THRESH_BINARY)

  3. To remove the bounding box from the binary mask, you can consider using image processing techniques such as morphological operations (e.g., dilation or erosion) or contour detection to refine the mask and eliminate the bounding box.

  4. If the bounding box is not completely removed, you might need to experiment with different techniques and parameters to achieve the desired result. Thresholding alone may not be sufficient to remove the bounding box entirely.

Keep in mind that complete removal of the bounding box while preserving the mask area can be challenging, especially if the bounding box is "strong" and overlaps with the mask region. You might need more advanced techniques or models specifically designed for segmentation tasks to obtain accurate results.

I hope this explanation clarifies the challenge you are facing and provides some guidance on how to approach it. Let me know if you have any further questions or need additional assistance!

iura77 commented 1 year ago

@glenn-jocher sorry bro but this time you've missed it.

@ziyuuuuuu could you please share with us your command which starts predictions? Like:

python ./detect.py ......

or

python ./segment/predict.py ....

ziyuuuuuu commented 1 year ago

Hi, I am pretty sure that I used the segmentation model for YOLOv5. Here is the training command (!python segment/train.py) and the prediction command (!python segment/predict.py). I think, as @glenn-jocher said, it might be quite challenging to remove the bounding box part, especially in my case where the segmented area is connected to the bounding box.

ziyuuuuuu commented 1 year ago

@glenn-jocher @iura77 (quoting @glenn-jocher's reply above)

Thank you for the quick reply and helping. I probably will try to use erosion and contour again to remove as much as possible since it is quite important for me to keep just the segmented part for the next process. Thank you so much for both of you!

iura77 commented 1 year ago

Hi, I am pretty sure that I used the segmentation model for YOLOv5. Here is the training command (!python segment/train.py) and the prediction command (!python segment/predict.py). I think, as @glenn-jocher said, it might be quite challenging to remove the bounding box part, especially in my case where the segmented area is connected to the bounding box.

You don't have to remove bounding boxes as long as you have original images. Just add "--save-txt" in your predict.py command and look in the "labels" folder in the folder with results. Can you please show us one of the txt files you will find?

glenn-jocher commented 1 year ago

@ziyuuuuuu hello,

You can generate the predicted bounding box coordinates in YOLOv5 by adding the "--save-txt" flag to your predict.py command. This will save the predicted coordinates in a ".txt" file located in the "labels" folder within the results folder.

To help you further, can you please provide an example of one of the ".txt" files you found in the "labels" folder? This way, we can assist you better in understanding the format and contents of these files.

Thank you.

ziyuuuuuu commented 1 year ago

Hi guys @glenn-jocher @iura77, thank you for the quick reply! Here is one line from the label file I generated:

0 0.865625 0.9875 0.8625 0.990625 0.8625 0.996875 0.91875 0.996875 0.91875 0.990625 0.9125 0.990625 0.909375 0.9875

iura77 commented 1 year ago

@glenn-jocher thanks man.

@ziyuuuuuu

Look, the line format is: <class-index> <x1> <y1> <x2> <y2> ... <xn> <yn>. We need pairs of coordinates to build the contour. Here is the code for making the list of coordinates:

pts_part = []
line = line.split(" ")
for i in range(1, len(line), 2):
    pts_part.append([int(float(line[i]) * width), int(float(line[i + 1]) * height)])

You might be scared of this expression, but in fact it's very simple: int(float(line[i]) * width). As you can see, your coordinates are not in pixels; they are float numbers normalized between 0 and 1. To get the coordinates in pixels, we multiply each normalized coordinate by the width or height. That is done here: float(line[i]) * width

I used float() to convert the value from string to a numerical type. And as there can't be a fractional pixel coordinate, it is then converted from float to int. In other words, the code does this: pts_part.append([xi, yi])
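To make this concrete, here is the sample label line you posted above pushed through that loop (the 640x480 image size is just an assumed value for illustration):

```python
# Convert the normalized polygon coordinates from the posted label line
# into pixel coordinates (the 640x480 image size is an assumption).
line = ("0 0.865625 0.9875 0.8625 0.990625 0.8625 0.996875 0.91875 "
        "0.996875 0.91875 0.990625 0.9125 0.990625 0.909375 0.9875")
width, height = 640, 480

parts = line.split(" ")
pts_part = []
for i in range(1, len(parts), 2):  # skip the class index at parts[0]
    pts_part.append([int(float(parts[i]) * width),
                     int(float(parts[i + 1]) * height)])
```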

After you have added all the coordinates in a list of lists, you can create the initial mask by generating an array in the shape of original image and fill it with zeroes, which usually displayed as black: mask = np.zeros(img.shape[:2], np.uint8)

All you have to do after that is to draw the contour and fill it with 255 (white): cv.drawContours(mask, pts, -1, (255, 255, 255), -1, cv.LINE_AA)

And finally take the original image without bounding boxes apply the mask and save the new image: dst = cv.bitwise_and(img, img, mask=mask) cv.imwrite(os.path.join(r'/new folder/', jpg), dst)

Please feel free to ask any questions.

glenn-jocher commented 1 year ago

@ziyuuuuuu

The line you provided from the label file represents the predicted bounding box coordinates. To create a binary mask from these coordinates, you can follow the steps below:

  1. Parse the line by splitting it using a space delimiter:

    line = line.split(" ")
  2. Iterate through the line starting from index 1 (since the class index is at index 0) and create a list of coordinate pairs:

    pts_part = []
    for i in range(1, len(line), 2):
        pts_part.append([int(float(line[i]) * width), int(float(line[i + 1]) * height)])

    Here, width and height represent the dimensions of the original image.

  3. Create an initial mask array with zeros, matching the shape of the original image:

    mask = np.zeros(img.shape[:2], np.uint8)
  4. Draw the contour using the coordinate pairs and fill it with 255 (white) in the mask array:

    cv2.drawContours(mask, [np.array(pts_part)], -1, (255, 255, 255), -1, cv2.LINE_AA)
  5. Finally, apply the mask to the original image using the bitwise AND operation and save the resulting image:

    dst = cv2.bitwise_and(img, img, mask=mask)
    cv2.imwrite(os.path.join('/new folder/', jpg), dst)

    Replace /new folder/ with the desired output folder path.

If you have any further questions or need clarification, please feel free to ask.

ziyuuuuuu commented 1 year ago

Thank you so much, it works now! I also realized I can use cv2.fillPoly(mask, [pts_part], color=(255)) to draw the mask area. Thank you both for helping!

glenn-jocher commented 1 year ago

@ziyuuuuuu hi,

I'm glad to hear that the solution worked for you! Using cv2.fillPoly(mask, [pts_part], color=(255)) is indeed another way to draw the mask area. Thank you for sharing that alternative approach.

If you have any further questions or need assistance with anything else, please don't hesitate to ask.

Thank you for using YOLOv5 and for being a part of the community!

Best regards,