👋 Hello @atmilatos, thank you for your interest in YOLOv8 🚀! We recommend a visit to the YOLOv8 Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.
If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.
Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.
Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled).
If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.
Here is another, clearer example. The bottom insulator has the correct bounding box, and YOLO has cropped the correct image part, but its mask is "masked" by another object, hence it's not returned.
@atmilatos box and mask predictions are completely separate; they may or may not align. The box primarily serves to crop mask predictions.
In the case of overlapping objects, predictions in general may perform more poorly.
If you are interested in improved masks at the expense of some speed, you can use retina_masks=True.
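For reference, a minimal sketch of enabling this (the checkpoint and image path below are placeholder assumptions):

from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")  # any YOLOv8 segmentation checkpoint
# retina_masks=True returns masks at native image resolution instead of
# low-resolution mask prototypes upsampled and cropped by the box
results = model.predict("image.jpg", retina_masks=True)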
Thank you for the reply (and sorry for the post label error).
I already use retina masks.
It is strange that the image produced by predict has all the mask pixels, but the actual returned mask object does not.
Do you think that if I overlaid each bounding-box crop at its correct coordinates on a blank image and passed that back into predict, I could get the "actual" masks?
@atmilatos generally, box predictions and mask predictions are dissociated processes in YOLOv8, which means they might not always align. The bounding box primarily functions to crop the mask predictions. However, when you have overlapping objects, predictions might perform more poorly.
You mentioned that predict produces an image with all the mask pixels even though the returned mask object doesn't. This discrepancy arises because the visualization process uses the thresholded mask on the crop defined by the bounding box; that crop is supposed to contain all the mask pixels.
About predicting on each bounding box overlaid onto a blank image: this could potentially work in theory, especially if the objects are easily separable and within the boundaries of the image. But it's not guaranteed to improve the results, because of the nature of how YOLOv8 performs detection, which is highly dependent on the context of the surrounding image area.
These are inherent limitations in YOLOv8 and similar architectures, and while we continue to improve these aspects in ongoing research, some of these issues might still be present. We appreciate your understanding and patience.
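A minimal sketch of inspecting the two outputs side by side, assuming the standard Ultralytics results API (checkpoint and image path are placeholders):

from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")  # placeholder segmentation checkpoint
results = model.predict("image.jpg", retina_masks=True)

for r in results:
    if r.masks is None:
        continue
    # boxes and masks live in separate attributes and need not agree exactly
    for box, mask in zip(r.boxes.xyxy, r.masks.data):
        x1, y1, x2, y2 = box.int().tolist()
        inside = int(mask[y1:y2, x1:x2].sum())  # mask pixels inside the box
        total = int(mask.sum())                 # mask pixels anywhere
        print(f"box=({x1},{y1},{x2},{y2}) mask_in_box={inside} mask_total={total}")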
@glenn-jocher thank you for the response.
I have been exploring the possibility of running predict to get the masks, then isolating each of them (by blurring or covering the rest of the masks), re-running the prediction, and finally combining the masks for each object. So far the results are promising, but I have to validate more thoroughly.
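A rough sketch of that iterative idea, assuming retina-resolution masks and using a neutral gray fill as the covering strategy (both are illustrative assumptions, not a validated recipe):

import cv2
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")  # placeholder segmentation checkpoint
img = cv2.imread("image.jpg")

first = model.predict(img, retina_masks=True)[0]
if first.masks is None:
    raise SystemExit("no detections to isolate")
masks = first.masks.data.cpu().numpy().astype(bool)  # (N, H, W) at image resolution

refined = []
for i in range(len(masks)):
    covered = img.copy()
    for j in range(len(masks)):
        if j != i:
            covered[masks[j]] = 127  # hide every other object under neutral gray
    # re-run prediction with only object i visible
    second = model.predict(covered, retina_masks=True)[0]
    if second.masks is not None and len(second.masks.data) > 0:
        # taking the first detection is a simplification; matching it back to
        # object i (e.g. by IoU with masks[i]) is left out for brevity
        refined.append(second.masks.data[0].cpu().numpy())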
@atmilatos, it's great to hear you're exploring innovative workarounds! The iterative approach you've described sounds clever: it essentially involves masking out detected objects and re-running detection for possibly occluded items. It aligns with some strategies used in instance segmentation tasks to handle occlusions.
There are, however, a few considerations for this method. If your results are promising, it might indeed be a viable strategy for your use case. Keep in mind that edge cases and variations in object appearances under different conditions could impact its effectiveness. Continuous validation over diverse datasets will be key.
We are always delighted to see community members push the limits of what's possible and find new ways to optimize their workflows. Your feedback and experience could be very valuable for others facing similar challenges. Keep up the excellent work!
👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.
For additional resources and information, please see the links below:
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLO 🚀 and Vision AI ⭐
Hi @glenn-jocher, if I don't want the box to crop mask predictions, what should I do?
Hi @WLi0777, if you prefer not to have the bounding box crop the mask predictions, you might consider adjusting the confidence and IoU thresholds to fine-tune the predictions. However, the current architecture of YOLOv8 inherently uses bounding boxes to crop masks, and for now there isn't a built-in option to disable this behavior. The approach discussed above of re-running predictions on isolated regions is one creative workaround, and we encourage you to explore such methods.
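A minimal sketch of adjusting those thresholds (the values shown are the library defaults, used here only for illustration):

from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")  # placeholder segmentation checkpoint
# conf filters out low-confidence detections; iou sets the NMS overlap threshold
results = model.predict("image.jpg", conf=0.25, iou=0.7, retina_masks=True)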
Search before asking
YOLOv8 Component
Predict
Bug
Hello,
I am using YOLOv8 for segmentation purposes. I have found that in some cases the resulting bounding box is larger than the corresponding mask. The bounding box result is the correct one, while the mask is missing some data. I have validated this by looking at the saved image after the prediction, as well as the crops written by predict. The mask polygons (xyxy) also correspond to the (incorrect) masks.
I have written all three to per-mask images and attached one of them.
Is there something I am missing regarding the relationship between the mask and the bounding box?
Thank you in advance.
Environment
Ultralytics YOLOv8.0.157 Python-3.11.6 torch-2.1.0.dev20230722+cu121 CUDA:0 (NVIDIA GeForce RTX 4090, 24564MiB)
Minimal Reproducible Example
Here is a code sample that produces the error.
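A minimal sketch of the kind of check described (the original sample is not reproduced here; the weights file and image path below are placeholder assumptions):

from ultralytics import YOLO

model = YOLO("best.pt")  # assumed custom segmentation weights
results = model.predict("example.jpg", retina_masks=True, save=True, save_crop=True)

r = results[0]
for box, mask in zip(r.boxes.xyxy, r.masks.data):
    x1, y1, x2, y2 = box.int().tolist()
    ys, xs = mask.nonzero(as_tuple=True)
    # compare each mask's extent with its box: the report describes masks
    # whose extent falls short of the box
    print("box:", (x1, y1, x2, y2),
          "mask extent:", (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())))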
Additional
No response
Are you willing to submit a PR?