ultralytics / ultralytics

Ultralytics YOLO11 🚀
https://docs.ultralytics.com
GNU Affero General Public License v3.0
30.69k stars 5.93k forks

how to deal with object detection in high resolution images with small objects #2402

Closed IAGirl-dev closed 1 year ago

IAGirl-dev commented 1 year ago

Search before asking

Question

Hello, I am using YOLOv8 for helmet detection, but my images have a high resolution (2664x1998). When I run detection on the whole image, no objects are detected, but when I slice the image into smaller sub-images, objects are detected in each sub-image. Is there a way to do this automatically, or a way to improve the model so that very small objects such as helmets are detected even in a high-resolution image?

Additional

No response

github-actions[bot] commented 1 year ago

👋 Hello @IAGirl-dev, thank you for your interest in YOLOv8 🚀! We recommend a visit to the YOLOv8 Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Install

Pip install the ultralytics package including all requirements in a Python>=3.7 environment with PyTorch>=1.7.

pip install ultralytics

Environments

YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

Ultralytics CI

If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

chenzx2 commented 1 year ago
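                # Fragment from inside a method. Assumes `import numpy as np`, an image
                # array `frame` of size height x width, and result lists
                # `boxes`, `classes` and `scores` created earlier.
                img_resized_temp = []
                boxes_temp = []
                classes_temp = []
                scores_temp = []
                cutHight = 2  # number of tile rows
                cutWidth = 3  # number of tile columns
                h = int(height / cutHight)  # tile height in pixels
                w = int(width / cutWidth)  # tile width in pixels
                for i in range(cutHight):
                    for j in range(cutWidth):
                        # Crop tile (row i, column j) from the full frame.
                        resultImg = frame[i * h:(i + 1) * h, j * w:(j + 1) * w]
                        if self.isRK3588:
                            img_resized_temp, boxes_temp, classes_temp, scores_temp = rknnLite.infer(resultImg)
                        else:
                            img_resized_temp, boxes_temp, classes_temp, scores_temp = intelOpenVino.infer(resultImg)
                        # ---------- enhanced mode: stitch tile results ----------
                        # Only stitch when this tile produced at least one detection.
                        if scores_temp is not None and len(scores_temp) > 0:
                            boxes_temp = np.array(boxes_temp)

                            # The mapping appears to assume boxes in a 640x640 per-tile space:
                            # shift by the tile index, then rescale into one 640x640 space for the frame.
                            y = np.copy(boxes_temp)
                            y[:, 0] = (boxes_temp[:, 0] + 640 * j) / cutWidth  # x1
                            y[:, 1] = (boxes_temp[:, 1] + 640 * i) / cutHight  # y1
                            y[:, 2] = (boxes_temp[:, 2] + 640 * j) / cutWidth  # x2
                            y[:, 3] = (boxes_temp[:, 3] + 640 * i) / cutHight  # y2
                            boxes_temp1 = y

                            boxes.extend(boxes_temp1)
                            classes.extend(classes_temp)
                            scores.extend(scores_temp)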
chenzx2 commented 1 year ago

You can cut the source image (srcImg) into tiles and train on the tiles, then predict as in the code above.

glenn-jocher commented 1 year ago

Hi @chenzx2. Thank you for your question. It is possible to break down a high-resolution image into several small images and perform detection on each. However, this is not automatic and requires some additional coding effort to implement.

In the code that you shared, the image is divided into small rectangular parts, and object detection is performed on each. The output boxes and scores of each of the results are then concatenated as the final result.
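As a rough illustration of the "concatenate the results" step, here is a minimal sketch that shifts per-tile boxes back into full-image pixel coordinates. It assumes boxes are reported in tile-local pixels (unlike the 640-based mapping in the shared code), and the helper name `tile_boxes_to_image` and the sample detections are made up for the example.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def tile_boxes_to_image(boxes: List[Box], x_offset: int, y_offset: int) -> List[Box]:
    """Shift boxes from tile-local pixels to full-image pixels."""
    return [
        (x1 + x_offset, y1 + y_offset, x2 + x_offset, y2 + y_offset)
        for x1, y1, x2, y2 in boxes
    ]


# A 2664x1998 image cut into 2 rows x 3 columns gives 888x999 tiles.
tile_w, tile_h = 2664 // 3, 1998 // 2

# Hypothetical detections per tile, keyed by (row, col).
per_tile: Dict[Tuple[int, int], List[Box]] = {
    (0, 0): [(10.0, 20.0, 50.0, 60.0)],
    (1, 2): [(100.0, 200.0, 140.0, 240.0)],
}

# Concatenate all tile results in full-image coordinates.
all_boxes: List[Box] = []
for (row, col), boxes in per_tile.items():
    all_boxes.extend(tile_boxes_to_image(boxes, col * tile_w, row * tile_h))

print(all_boxes)
```

Objects that straddle a tile border can show up twice or be cut in half, so in practice overlapping tiles plus a de-duplication step such as non-maximum suppression are usually added on top of this.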

Another way to improve object detection on high-resolution images is to adjust the hyperparameters of the YOLOv8 model. You can try increasing the input resolution, decreasing the image augmentation or using a smaller grid size. All these options help the model perceive small objects more effectively.
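To see why input resolution matters here, a small back-of-the-envelope calculation may help. It assumes the long side of the image is resized to the inference size (`imgsz`) and that a helmet spans about 40 pixels in the original image; the 40-pixel figure is purely hypothetical.

```python
def scaled_size(object_px: float, image_long_side: int, imgsz: int) -> float:
    """Approximate object size (in pixels) after the long side is resized to imgsz."""
    return object_px * imgsz / image_long_side


helmet_px = 40  # hypothetical helmet size in the original 2664x1998 image

for imgsz in (640, 1280, 2560):
    size = scaled_size(helmet_px, 2664, imgsz)
    print(f"imgsz={imgsz}: helmet is roughly {size:.1f} px wide")
```

Under these assumptions the helmet shrinks to under 10 pixels at an inference size of 640, which is consistent with detections disappearing on the full image but reappearing on slices.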

Please let me know if you have any more questions.

chenzx2 commented 1 year ago

Maybe doing both is a good choice. With limited GPU memory, you can still train YOLOv8 by cutting the source images into tiles.

chenzx2 commented 1 year ago

Thank you for the suggestion, I'm going to try it

glenn-jocher commented 1 year ago

You're welcome, @chenzx2! I'm glad I could help. Don't hesitate to reach out if you have any more questions or need further assistance. Good luck with your project!

IAGirl-dev commented 1 year ago

Hello @glenn-jocher, thank you for your response. I generated a new dataset of high-resolution images (2664x1998) and tried adjusting the hyperparameters to train the model. I am using Google Colab Pro for training, and I am now running into memory limits. I am currently trying to segment small objects in high-resolution images. Is it possible to use tiled inference with YOLOv8 for segmentation to resolve this issue, and if so, how can I use it?

glenn-jocher commented 1 year ago

Hi @IAGirl-dev, thank you for your question.

Yes, it is possible to use tiled inference with YOLOv8 to perform segmentation on high-resolution images. Tiled inference involves breaking up the input image into smaller pieces and processing them separately. After processing these smaller images, the resulting segmented outputs are combined to obtain the final output for the full-size image. This approach helps you stay within memory limits when dealing with high-resolution images.
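The "combine the segmented outputs" step could look roughly like the sketch below. It assumes each tile yields a binary mask (nested lists of 0/1) plus the tile's top-left offset in the full image, and it merges overlapping regions with a logical OR. The function name `stitch_masks` is illustrative, and merging separate object instances across tiles would need extra logic not shown here.

```python
from typing import List, Tuple

Mask = List[List[int]]  # binary mask indexed as mask[y][x]


def stitch_masks(full_h: int, full_w: int, tiles: List[Tuple[int, int, Mask]]) -> Mask:
    """Paste tile masks into a full-size mask; overlaps are merged with OR.

    Each entry in `tiles` is (x0, y0, mask), where (x0, y0) is the tile's
    top-left corner in full-image coordinates.
    """
    full = [[0] * full_w for _ in range(full_h)]
    for x0, y0, mask in tiles:
        for dy, row in enumerate(mask):
            for dx, value in enumerate(row):
                y, x = y0 + dy, x0 + dx
                if 0 <= y < full_h and 0 <= x < full_w:
                    full[y][x] |= value
    return full


# Two 3-pixel-wide tiles covering a 4-pixel-wide image, overlapping by 2 columns.
tile_a = (0, 0, [[1, 0, 0], [0, 0, 1]])
tile_b = (1, 0, [[0, 0, 1], [0, 1, 0]])
stitched = stitch_masks(2, 4, [tile_a, tile_b])
print(stitched)
```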

To use tiled inference with YOLOv8, you need to modify the inference function in yolov8.py to accept tiled input images and perform segmentation on each tile. You can also implement a function to stitch the resulting tiles of segmentation back into a single output image.

You would also need to ensure that your data pipeline splits your high-resolution images into tiles before passing them to YOLOv8 for inference.
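The splitting step can be sketched as below: a minimal example that computes overlapping tile windows covering a whole image. The 640-pixel tile size and 128-pixel overlap are assumptions chosen for illustration, and `tile_starts` and `tile_windows` are hypothetical helper names.

```python
from typing import List, Tuple


def tile_starts(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets so that tiles of size `tile` cover [0, length)."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than the tile size")
    if tile >= length:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the image edge
    return starts


def tile_windows(width: int, height: int, tile: int, overlap: int) -> List[Tuple[int, int, int, int]]:
    """Return (x0, y0, x1, y1) crop windows, row by row."""
    xs = tile_starts(width, tile, overlap)
    ys = tile_starts(height, tile, overlap)
    return [
        (x0, y0, min(x0 + tile, width), min(y0 + tile, height))
        for y0 in ys
        for x0 in xs
    ]


windows = tile_windows(2664, 1998, tile=640, overlap=128)
print(len(windows), windows[0], windows[-1])
```

Each window can then be cropped from the image array (for example `image[y0:y1, x0:x1]`) and passed to the model, with `(x0, y0)` kept as the offset for mapping results back.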

I hope this helps. If you have any more questions, feel free to ask. Good luck with your project!

github-actions[bot] commented 1 year ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the links below:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

jehan88 commented 1 year ago

@glenn-jocher How can I use tiled inference? Could you please provide the code or a link?

glenn-jocher commented 1 year ago

@jehan88 hello! Thank you for your question. To implement tiled inference in YOLOv8, you'll need to modify the inference function in the yolov8.py file. This involves:

You'll also want to implement a stitching function to combine the segmented outputs from each tile into a single output image. This way, you can maintain the quality of the segmentation across the entire high-resolution image.

Unfortunately, I cannot provide the specific code or a direct link to implement tiled inference. However, you can find code examples and tutorials for tiled inference in YOLOv8 by searching online or referring to relevant papers and articles. This will help you understand the necessary modifications and the stitching process.

I hope this guidance helps! If you have any further questions, please feel free to ask. Good luck with your project!

iraadit commented 1 year ago

FYI, SAHI can be used now: https://docs.ultralytics.com/guides/sahi-tiled-inference/

iraadit commented 1 year ago

SAHI does exactly that: it is ready to use and works with YOLOv8 (or other networks) without requiring any change to YOLOv8, as described on your own (Ultralytics) website, which is where the link I posted comes from.

Here is a Google Colab using it with yolov5: https://colab.research.google.com/github/obss/sahi/blob/main/demo/inference_for_yolov5.ipynb

From SAHI's GitHub: https://github.com/obss/sahi

For yolov8, it suffices to follow the guide on your website (https://docs.ultralytics.com/guides/sahi-tiled-inference/). It would be easy to update the previously linked Google Colab from the content of the guide.
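For reference, a minimal sketch of sliced prediction with SAHI might look like this. The function name `run_sliced_prediction`, the slice size, overlap ratio and confidence threshold are illustrative assumptions, and the `model_type` string may differ between SAHI releases, so check the guide for the version you have installed.

```python
def run_sliced_prediction(image_path: str, weights_path: str,
                          slice_size: int = 640, overlap_ratio: float = 0.2):
    """Run SAHI sliced inference and return its list of object predictions."""
    # Imported lazily so this module can be loaded without SAHI installed.
    from sahi import AutoDetectionModel
    from sahi.predict import get_sliced_prediction

    detection_model = AutoDetectionModel.from_pretrained(
        model_type="yolov8",  # assumption: newer SAHI releases may expect "ultralytics"
        model_path=weights_path,
        confidence_threshold=0.3,  # illustrative value
        device="cpu",
    )
    result = get_sliced_prediction(
        image_path,
        detection_model,
        slice_height=slice_size,
        slice_width=slice_size,
        overlap_height_ratio=overlap_ratio,
        overlap_width_ratio=overlap_ratio,
    )
    return result.object_prediction_list


# Example call (paths are placeholders):
# predictions = run_sliced_prediction("helmets.jpg", "yolov8n.pt")
```

SAHI takes care of slicing, mapping boxes back to the full image, and merging overlapping detections, so none of that needs to be written by hand.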

Thanks for the great work on yolo.

NawoyaSarah commented 9 months ago

@glenn-jocher Hello. Where do I find yolov8.py? Is it the same as train.py?

glenn-jocher commented 9 months ago

Hello @NawoyaSarah! Apologies for the confusion earlier. There's no need to modify any files like yolov8.py or train.py. For tiled inference with YOLOv8, you can use the SAHI tool, which is designed to handle high-resolution images and small-object detection without any code changes to YOLOv8.

Please refer to the Ultralytics documentation guide on SAHI tiled inference for detailed instructions on how to set it up with YOLOv8. This will guide you through the process and help you get started with tiled inference right away.

If you have any further questions or need more assistance, feel free to reach out. Happy to help!

jasonpp commented 1 month ago

Nice, got it.

glenn-jocher commented 1 month ago

Great to hear! If you have any more questions or need further assistance, feel free to ask.