ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Size of bounding box in Output Tensor is of shape 6 instead of 4 #11137

Closed. iammohit1311 closed this issue 1 year ago.

iammohit1311 commented 1 year ago

Search before asking

Question

Screenshot from 2023-03-09 01-34-06

I have trained a YOLOv5 model for custom object detection and then exported it to .tflite format. As you can see in the image, the output tensor is a location tensor, but its last dimension is 6. It needs to be 4, since [x, y, w, h] is all that is required to draw the bounding box. I don't understand why it is 6, because I never specified that anywhere. How do I modify it to be 4?

Also, refer to this link to understand why I need the output tensor's shape to be 4: https://www.tensorflow.org/lite/inference_with_metadata/task_library/object_detector#model_compatibility_requirements

Additional

No response

github-actions[bot] commented 1 year ago

👋 Hello @iammohit1311, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.7.0 with all dependencies in requirements.txt installed, including PyTorch>=1.7. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 🚀

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 🚀!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

pip install ultralytics
iammohit1311 commented 1 year ago

@glenn-jocher

glenn-jocher commented 1 year ago

YOLOv5 outputs are generally [xywh, objectness, class confidences i.e. 0-79 for COCO]
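
For context, a minimal sketch of that layout (shapes and the candidate count are illustrative, not the exact export code), showing why a single-class custom model ends up with a last dimension of 6:

import numpy as np

# YOLOv5 prediction tensor: (batch, num_candidates, 5 + num_classes)
# last dim = [x, y, w, h, objectness, class_0, ..., class_{nc-1}]
num_classes = 1                               # single-class custom model
pred = np.zeros((1, 25200, 5 + num_classes), dtype=np.float32)

boxes = pred[..., :4]                         # xywh, shape (1, 25200, 4)
objectness = pred[..., 4]                     # confidence that a box contains any object
class_conf = pred[..., 5:]                    # per-class scores, a single column here

print(pred.shape[-1])                         # 6 for one class, 85 for COCO's 80 classes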

iammohit1311 commented 1 year ago

YOLOv5 outputs are generally [xywh, objectness, class confidences i.e. 0-79 for COCO]

Is it possible to modify the Output Tensor somehow to always exclude objectness & class confidence ?

glenn-jocher commented 1 year ago

@iammohit1311 sure, this is open-source code, you can modify anything you want.

iammohit1311 commented 1 year ago

@iammohit1311 sure, this is open-source code, you can modify anything you want.

Thanks! Can you please help me locate the file I should modify in your repo for this purpose ?

Burhan-Q commented 1 year ago

Maybe this is a silly question, but if you have a Tensor with shape 6, why not just select what you need from that Tensor?

tensor[:, :4]

To only select the bounding box coordinates?

Modifying the source code would probably be in the Detections class of yolov5/models/common.py
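
A minimal sketch of that slicing applied to the exported TFLite model (the model path is a placeholder, and the dummy input stands in for a real preprocessed image):

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="best-fp16.tflite")  # placeholder path
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

image = np.zeros(inp["shape"], dtype=inp["dtype"])   # replace with a real preprocessed image
interpreter.set_tensor(inp["index"], image)
interpreter.invoke()

pred = interpreter.get_tensor(out["index"])          # (1, N, 6) for a single-class model
boxes = pred[..., :4]                                # keep only xywh, shape (1, N, 4)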

iammohit1311 commented 1 year ago

Maybe this is a silly question, but if you have a Tensor with shape 6, why not just select what you need from that Tensor?

tensor[:, :4]

To only select the bounding box coordinates?

Modifying the source code would probably be in the Detections class of yolov5/models/common.py

Yes, I tried slicing, but Android did not accept that trick. I will look into models/common.py though.

glenn-jocher commented 1 year ago

@iammohit1311, it's great to hear that you already tried that approach; I'm sorry it didn't work out for you.

Modifying the Detections class in yolov5/models/common.py would likely be the solution to your issue. However, please keep in mind that modifying the source code has implications and could break compatibility with the original implementation. It would be best to create a new function with a different name and a new signature that works for your use case.
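
As an illustration only (a hypothetical sketch, not existing YOLOv5 code): one way to leave the original export untouched is to wrap the Keras model that the TensorFlow export builds internally (via models/tf.py) so the converted graph itself emits only the box coordinates:

import tensorflow as tf

def wrap_boxes_only(keras_model):
    # keras_model: the Keras model produced during YOLOv5's TF export (assumed here)
    pred = keras_model.outputs[0]          # (1, N, 5 + num_classes)
    boxes_only = pred[..., :4]             # keep only x, y, w, h
    return tf.keras.Model(keras_model.inputs, boxes_only)

# converter = tf.lite.TFLiteConverter.from_keras_model(wrap_boxes_only(keras_model))
# tflite_model = converter.convert()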

If you need further assistance with modifying the code, the YOLOv5 community on GitHub and Ultralytics team would be more than happy to help you out. We recommend following the YOLOv5 Contributing Guidelines to ensure that your changes are in line with the project's vision and quality standards.

Good luck with your implementation, and please let us know if there's anything else we can do to help!

iammohit1311 commented 1 year ago

@iammohit1311, it's great to hear that you already tried that approach; I'm sorry it didn't work out for you.

Modifying the Detections class in yolov5/models/common.py would likely be the solution to your issue. However, please keep in mind that modifying the source code has implications and could break compatibility with the original implementation. It would be best to create a new function with a different name and a new signature that works for your use case.

If you need further assistance with modifying the code, the YOLOv5 community on GitHub and Ultralytics team would be more than happy to help you out. We recommend following the YOLOv5 Contributing Guidelines to ensure that your changes are in line with the project's vision and quality standards.

Good luck with your implementation, and please let us know if there's anything else we can do to help!

@glenn-jocher Thank you so much for guiding me on this! I would like to work on making YOLOv5 100% compatible with Android implementations.

I would like to add a new function with a different name and a new signature that exports a YOLOv5 custom model in TensorFlow Lite format (.tflite) where the output tensor has a last dimension of 4, i.e. [x, y, w, h], by default. This will help the native Android development community a lot, since the function that draws a bounding box on Android needs the output tensor in the above-mentioned format. Please let me know how I can contribute to your repository. Thank you!

glenn-jocher commented 1 year ago

@iammohit1311, that's fantastic to hear! We're always happy to have contributions from the community.

Thank you for considering adding a new function to YOLOv5 that exports the model in TensorFlow Lite format with a shape of 4 for the Output Tensor by default. This will certainly be useful to the Android community.

To contribute to YOLOv5, please follow the steps in our Contributing Guidelines. Make sure to include a detailed description of your changes, including the new function and its signature, and test your code thoroughly before submitting a pull request.

We appreciate your willingness to contribute, and we're looking forward to seeing your changes! Please don't hesitate to reach out if you have any questions or concerns along the way.

github-actions[bot] commented 1 year ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the links below:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

muhammadriyannnn commented 1 year ago

@iammohit1311 Hello sir, I have the same problem. I want to deploy the YOLOv5 TFLite model, but the TFLite model has no metadata. I tried training with YOLOv8, and the output has only one tensor, not four. Can you help me?

glenn-jocher commented 1 year ago

Hi @mmmriyan,

Thank you for reaching out. I understand that you are encountering a problem with deploying the YOLOv5 TensorFlow Lite (TFLite) model and are missing the metadata. If you trained the model with YOLOv8, it could explain the different shape of the output tensor.

To help you better, I need some additional information:

Please provide these details, and I'll do my best to assist you further.

Thank you!

muhammadriyannnn commented 1 year ago

hello sir @glenn-jocher thanks for your response.

I successfully converted my YOLOv8 model to TFLite format, but its output tensors are different from other models, which have 4 outputs (location, category, score, and number of detections).

I hope you can help me, sir! Thanks in advance.

glenn-jocher commented 1 year ago

Hi @mmmriyan,

Thank you for providing the additional information.

If your YOLOv8 model converted to TFLite format has a different shape in the output tensor compared to other models (location, category, score, and number of detections), it might indicate differences in the internal architecture of the model. YOLOv8 is not an official version of YOLO, so it could have variations or modifications that affect the output tensor shape.

To investigate this further and help you resolve the issue, could you please provide some additional details:

With this information, we can try to identify the cause of the issue and provide you with appropriate guidance.

Thank you for your understanding, and I'll be waiting for your response.

muhammadriyannnn commented 1 year ago

@glenn-jocher the output tensors of the YOLOv8 TFLite model look like this: [image]

and the error messages after running the app look like this: [image]

glenn-jocher commented 1 year ago

Hi @mmmriyan,

Thank you for sharing the details and the images illustrating the output tensor shape and the error messages.

The output tensor shape you provided appears to have a size of 16 instead of the expected 4. This difference indicates that the YOLOv8 TFLite model may have a different internal architecture from other versions of YOLO models, resulting in a modified output tensor shape.

Regarding the error messages, the image you shared shows an error related to "ObjectDetecionTfliteDelegate", which suggests a problem with the TFLite delegate used for object detection. It's possible that the YOLOv8 model requires a different delegate setup or additional modifications for correct inference.

To address this issue, I would suggest the following steps:

  1. Check the conversion process: Ensure that you have converted the YOLOv8 model to TFLite format correctly. Double-check the conversion script or process to confirm that no mistakes were made during the conversion.

  2. Verify the model architecture: Review the YOLOv8 model architecture and ensure that it matches the expected output tensor shape. Any differences in the architecture could result in the modified output tensor shape.

  3. Adapt the inference code: If the model architecture is indeed different, you will need to modify the inference code accordingly. Update the code responsible for interpreting the output tensor to handle the shape difference and appropriately extract the relevant information (see the sketch after this list).
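
For steps 1 to 3, a quick way to see what the converted model actually exposes is to query the TFLite interpreter directly (a minimal sketch; the model path is a placeholder):

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")  # placeholder path
interpreter.allocate_tensors()
for detail in interpreter.get_output_details():
    print(detail["name"], detail["shape"], detail["dtype"])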

If the provided steps do not resolve the issue, I recommend seeking assistance from the YOLOv5 community on GitHub. They have a wealth of knowledge and expertise and can provide specific guidance based on the YOLOv8 architecture and the TFLite conversion process.

Good luck with your implementation, and please let us know if there's anything else we can do to assist!

iammohit1311 commented 1 year ago

@iammohit1311 Hello sir, I have the same problem. I want to deploy the YOLOv5 TFLite model, but the TFLite model has no metadata. I tried training with YOLOv8, and the output has only one tensor, not four. Can you help me?

Hey! I can guide you on how to inject metadata into your YOLOv5 TFLite model. I have never used YOLOv8, so I have no idea about its shape, but I will look into the other comments to resolve your issue. Though even after injecting metadata into the YOLOv5 model, you will face the output tensor mismatch issue, since the YOLOv5 model's output tensor has a last dimension of 6.
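
For reference, a minimal metadata-injection sketch using the tflite_support metadata writers (paths, label file and normalization values are placeholders; note that the ObjectDetector writer expects the 4-tensor Task Library layout, so with the default single-tensor YOLOv5 export it may still report the mismatch described above):

from tflite_support.metadata_writers import object_detector
from tflite_support.metadata_writers import writer_utils

MODEL_PATH = "best-fp16.tflite"       # placeholder: your exported model
LABEL_FILE = "labels.txt"             # placeholder: one class name per line
SAVE_PATH = "best-fp16_meta.tflite"   # placeholder: output model with metadata

writer = object_detector.MetadataWriter.create_for_inference(
    writer_utils.load_file(MODEL_PATH),
    [0.0],                            # input normalization mean (adjust to your preprocessing)
    [255.0],                          # input normalization std
    [LABEL_FILE])
writer_utils.save_file(writer.populate(), SAVE_PATH)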

iammohit1311 commented 1 year ago

Maybe this is a silly question, but if you have a Tensor with shape 6, why not just select what you need from that Tensor?

tensor[:, :4]

To only select the bounding box coordinates?

Modifying the source code would probably be in the Detections class of yolov5/models/common.py

That isn't an option due to the way detection results are derived on Android. It seems we need to modify the Detections class.

glenn-jocher commented 1 year ago

@iammohit1311 the reason selecting only the first four values from a 6-value tensor is not a suitable solution is that the Android detection pipeline relies on the specific format and shape of the output tensor. Modifying the Detections class in the yolov5/models/common.py file seems like a logical step to address this issue.

By adapting the Detections class, you can customize the way detection results are derived in order to meet the requirements of the Android pipeline. This will help ensure compatibility and enable proper utilization of the bounding box coordinates.

If you need further assistance or have any other questions, please don't hesitate to ask.