ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

how to get polygon mask in yolov5 #12701

Closed. Snehagupta1907 closed this issue 8 months ago.

Snehagupta1907 commented 9 months ago

Search before asking

Question

Hello, I have trained the YOLOv5 segmentation model on a custom dataset, and now I want to get the exact polygon boundary. I annotated the dataset (pictures of oranges) using Roboflow, and I want to estimate the size of each orange, so I need the boundary polygon.

In YOLOv8 we can directly use:

results = model(image)
result = results[0]
print(result.masks)

The output is:

ultralytics.engine.results.Boxes object with attributes:

cls: tensor([0.])
conf: tensor([0.8229])
data: tensor([[171.2862, 73.5749, 457.6823, 350.5975, 0.8229, 0.0000]])
id: None
is_track: False
orig_shape: (447, 640)
shape: torch.Size([1, 6])
xywh: tensor([[314.4843, 212.0862, 286.3962, 277.0226]])
xywhn: tensor([[0.4914, 0.4745, 0.4475, 0.6197]])
xyxy: tensor([[171.2862, 73.5749, 457.6823, 350.5975]])
xyxyn: tensor([[0.2676, 0.1646, 0.7151, 0.7843]])

How can I do this in YOLOv5?

Additional

No response

github-actions[bot] commented 9 months ago

👋 Hello @Snehagupta1907, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.8.0 with all requirements.txt installed including PyTorch>=1.8. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 🚀

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 🚀!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

pip install ultralytics

glenn-jocher commented 9 months ago

@Snehagupta1907 hello! In YOLOv5, the segmentation model provides a bitmap mask rather than a polygon. However, you can post-process the bitmap mask to approximate a polygon boundary. You can use OpenCV or similar libraries to find contours on the mask, which can then be simplified to a polygon using functions like cv2.approxPolyDP.

Here's a rough outline of the steps you might take after you have your model's predictions:

  1. Get the bitmap mask from the model's output.
  2. Use OpenCV's findContours to detect contours on the mask.
  3. Simplify the contour to a polygon with approxPolyDP.
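
For example, here is a minimal sketch of steps 2 and 3, assuming binary_mask is a (H, W) uint8 array with values 0/1 that you have already extracted from the model's output:

import cv2

# A minimal sketch, assuming 'binary_mask' is a (H, W) uint8 array with values 0/1
contours, _ = cv2.findContours(binary_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for contour in contours:
    epsilon = 0.01 * cv2.arcLength(contour, True)  # tolerance: 1% of the contour perimeter
    polygon = cv2.approxPolyDP(contour, epsilon, True)  # (N, 1, 2) array of polygon vertices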

For more details on how to work with the outputs of YOLOv5 and post-processing techniques, please refer to our documentation at https://docs.ultralytics.com/yolov5/.

Keep in mind that the exact implementation will depend on your specific requirements and the programming environment you're using. Good luck with your size estimation project! 🍊😊

Snehagupta1907 commented 9 months ago

@glenn-jocher Hi, thanks for the quick response. I am not sure what you mean. Where can I make this change? In the predict.py file under the segment folder?

Snehagupta1907 commented 9 months ago

From one of your responses to a previous question:

binary_mask = cv2.threshold(predicted_mask, 0.5, 1, cv2.THRESH_BINARY)[1]

Where can I get this predicted_mask? On running predict.py, the output I get is just the image with the bounding box and mask drawn over it. I am not interested in the bounding box; I would rather get an output like a black background with a white pixel mask showing the shape of the object, and then the predicted polygon boundary.

glenn-jocher commented 9 months ago

@Snehagupta1907 apologies for any confusion. The predicted_mask is the output you get from the segmentation model, which is a grayscale image where pixel values represent the likelihood of belonging to a certain class. To extract this mask, you would typically access the model's outputs after running inference.

Here's a simplified example of how you might modify the predict.py or your inference script:

  1. Run your model to get the predictions.
  2. Extract the mask from the predictions. This will be a 2D array if you have a single class or a 3D array with an axis for each class.
  3. Apply a threshold to binarize the mask.
  4. Use OpenCV functions to find contours and approximate the polygon.

import cv2
import numpy as np

# Assuming 'model' is your loaded YOLOv5 model and 'img' is your input image
results = model(img)

# Extract the mask for the class of interest, here index '0' for the first class
predicted_mask = results.pred[0][..., 0].sigmoid().cpu().numpy()  # This might vary based on your model's output

# Threshold the mask to get a binary image
binary_mask = cv2.threshold(predicted_mask, 0.5, 1, cv2.THRESH_BINARY)[1]

# Find contours on the binary mask
contours, _ = cv2.findContours(binary_mask.astype(np.uint8), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

# Approximate contours to polygons (epsilon sets the approximation tolerance)
polygons = []
for contour in contours:
    epsilon = 0.01 * cv2.arcLength(contour, True)  # 1% of the contour perimeter
    polygons.append(cv2.approxPolyDP(contour, epsilon, True))

# Now 'polygons' contains the approximate polygon boundaries

Please adjust the code above to fit the exact shape and format of your model's output. The epsilon parameter in cv2.approxPolyDP controls the approximation accuracy; the snippet above sets it to 1% of each contour's perimeter, and a smaller value results in a polygon closer to the original contour.
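
Since your end goal is size estimation, here is a hedged follow-on sketch: once you have the contours, cv2.contourArea gives each object's enclosed area in pixels, which you can convert to real-world units with a calibration factor of your own (pixels_per_cm below is a hypothetical value you would need to measure for your camera setup):

# Estimate object size from each contour (pixel area -> real-world area)
pixels_per_cm = 10.0  # hypothetical calibration; measure this for your setup
for contour in contours:
    area_px = cv2.contourArea(contour)         # enclosed area in pixels
    area_cm2 = area_px / (pixels_per_cm ** 2)  # convert via your calibration
    print(f'Area: {area_px:.0f} px^2, approx. {area_cm2:.2f} cm^2')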

Remember to check the YOLOv5 documentation for more details on handling model outputs and post-processing. Good luck with your project! 🌟

Snehagupta1907 commented 9 months ago

I did this but am getting an error at:

--> 17 results = model(img)
    18
    19 fig, ax = plt.subplots(figsize=(16, 12))

2 frames
/content/yolov5/models/common.py in forward(self, im, augment, visualize)
    542     def forward(self, im, augment=False, visualize=False):
    543         # YOLOv5 MultiBackend inference
--> 544         b, ch, h, w = im.shape  # batch, channel, height, width
    545         if self.fp16 and im.dtype != torch.float16:
    546             im = im.half()  # to FP16

AttributeError: 'str' object has no attribute 'shape'

Snehagupta1907 commented 9 months ago

Using cache found in /root/.cache/torch/hub/ultralytics_yolov5_master
YOLOv5 🚀 v7.0-283-g875d9278 Python-3.10.12 torch-2.1.0+cu121 CUDA:0 (Tesla T4, 15102MiB)

Fusing layers...
Model summary: 165 layers, 7406513 parameters, 0 gradients, 25.7 GFLOPs
WARNING ⚠️ YOLOv5 SegmentationModel is not yet AutoShape compatible. You will not be able to run inference with this model.

glenn-jocher commented 9 months ago

@Snehagupta1907 it appears that you're trying to use the YOLOv5 SegmentationModel, which, as the warning message you received indicates, is not yet compatible with AutoShape. This means you cannot directly pass an image path (a string) to the model for inference as you would with the standard YOLOv5 detection models.

Instead, you'll need to manually handle the image preprocessing steps before passing the image tensor to the model. Here's a general outline of what you need to do:

  1. Load the image using a library like OpenCV.
  2. Preprocess the image to match the input format expected by the model (e.g., resize, normalize).
  3. Convert the image to a PyTorch tensor.
  4. Add a batch dimension to the tensor.
  5. Pass the tensor to the model for inference.

Here's a code snippet to guide you through these steps:

import cv2
import torch
from torchvision.transforms import functional as F

# Load your image
img_path = 'path_to_your_image.jpg'  # Replace with your image path
image = cv2.imread(img_path)  # Read image with OpenCV
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Convert BGR to RGB

# Preprocess the image
image = F.to_tensor(image)  # Convert to PyTorch tensor
image = image.unsqueeze(0)  # Add batch dimension

# Load your model (assuming 'model' is already loaded)
# Make sure the model is in evaluation mode
model.eval()

# Disable gradients for inference
with torch.no_grad():
    results = model(image)  # Run inference

# Continue with the post-processing steps from the previous messages

Make sure to replace 'path_to_your_image.jpg' with the actual path to your image. The preprocessing steps (resizing, normalization) should match the preprocessing used during model training.

After you have the results, you can proceed with the thresholding and contour detection as previously described. Remember that the exact preprocessing steps and how you access the segmentation mask from results may vary based on your specific model and setup.
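
For reference, here is a hedged sketch of how the repo's own segment/predict.py turns the raw SegmentationModel output into per-detection binary masks. It assumes you are running from inside the cloned yolov5 repo so its utils are importable, and the function names reflect the v7.0 codebase:

from utils.general import non_max_suppression
from utils.segment.general import process_mask

# Raw head output: detections plus mask prototypes
pred, proto = model(image)[:2]

# NMS with nm=32 mask coefficients per detection
det = non_max_suppression(pred, conf_thres=0.25, iou_thres=0.45, nm=32)[0]  # first image

if len(det):
    # Combine prototypes and coefficients into (n, H, W) float masks at input resolution
    masks = process_mask(proto[0], det[:, 6:], det[:, :4], image.shape[2:], upsample=True)
    binary_masks = (masks > 0.5).cpu().numpy().astype('uint8')  # threshold to 0/1

Each slice of binary_masks is then the black-background, white-object mask you described, ready for the contour steps above.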

Snehagupta1907 commented 9 months ago

On running results = model(image), I get:

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 64 but got size 63 for tensor number 1 in the list.

Snehagupta1907 commented 9 months ago

Input tensor shape: torch.Size([1, 3, 500, 500])

glenn-jocher commented 9 months ago

@Snehagupta1907 the error you're encountering suggests that there's a mismatch in the expected input size for one of the layers within the model. YOLOv5 models typically expect the height and width of the input image to be multiples of a specific stride (commonly 32 or 64). If your input image size is 500x500, it's not a multiple of the model's stride, which is likely causing the issue.

To resolve this, you should resize your image to dimensions that are multiples of the model's stride. Here's how you can modify the preprocessing to ensure the input tensor has the correct shape:

import cv2
import torch
from torchvision.transforms import functional as F

# Load your image
img_path = 'path_to_your_image.jpg'  # Replace with your image path
image = cv2.imread(img_path)  # Read image with OpenCV
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Convert BGR to RGB

# Resize image to be a multiple of the model's stride (e.g., 32 or 64)
stride = 32  # Replace with the stride of your model
new_size = (image.shape[1] - image.shape[1] % stride, image.shape[0] - image.shape[0] % stride)
image = cv2.resize(image, new_size)

# Preprocess the image
image = F.to_tensor(image)  # Convert to PyTorch tensor
image = image.unsqueeze(0)  # Add batch dimension

# Load your model (assuming 'model' is already loaded)
# Make sure the model is in evaluation mode
model.eval()

# Disable gradients for inference
with torch.no_grad():
    results = model(image)  # Run inference

# Continue with the post-processing steps from the previous messages

In this snippet, new_size is calculated to ensure that the width and height are multiples of the model's stride. Make sure to replace stride with the actual stride used by your model. After resizing and preprocessing the image, you should be able to pass it through the model without encountering the size mismatch error.

If you're not sure about the stride of your model, you can check the configuration file used during training or the model's architecture details to find the correct stride value.
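
As a convenience, here is a hedged sketch: hub-loaded YOLOv5 models generally expose a stride attribute, and the repo's letterbox utility in utils/augmentations.py resizes and pads an image to a stride multiple in one step (again assuming you can import from the cloned repo):

import cv2
import numpy as np
import torch

from utils.augmentations import letterbox  # available inside the yolov5 repo

# Read the stride directly from the model, falling back to 32
s = getattr(model, 'stride', 32)
stride = int(s.max()) if isinstance(s, torch.Tensor) else int(s)

img = cv2.imread('path_to_your_image.jpg')  # BGR image
padded, ratio, (dw, dh) = letterbox(img, new_shape=640, stride=stride, auto=True)
padded = padded[..., ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
tensor = torch.from_numpy(np.ascontiguousarray(padded)).float().div(255).unsqueeze(0)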

github-actions[bot] commented 8 months ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see https://docs.ultralytics.com.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐