Can't understand how YOLOv5 works.

Aflexg commented 8 months ago

Hello. I don't understand some basic principles of how YOLOv5 works and need help. Could you please answer my questions?

As far as I know, a convolutional neural network always has a fully connected layer at its output to predict the class of the object, but I don't see one in this picture. It seems the YOLOv5 output consists of only 3 conv layers, so I can't see how it predicts the class. Could you please explain how YOLOv5 predicts the class without a fully connected layer? Maybe this diagram is incomplete and doesn't show the fully connected layer?

What exactly does the BottleneckCSP layer do? Is it just a better version of the convolution layer? How do BottleneckCSP and conv layers interact with each other, and why does YOLOv5 need ordinary convolution layers if C3 is just better?

I copied this diagram from the article linked below, which is written in Chinese.

Copyright statement: This article is the original article of the blogger and follows the CC 4.0 BY-SA copyright agreement. Please attach the original source link and this statement for reprinting. Link to this article: https://blog.csdn.net/Q1u1NG/article/details/107511465

github-actions[bot] commented 8 months ago

👋 Hello @Aflexg, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.8.0 with all requirements.txt installed including PyTorch>=1.8. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Notebooks with free GPU:
Google Cloud Deep Learning VM. See GCP Quickstart Guide
Amazon Deep Learning AMI. See AWS Quickstart Guide
Docker Image. See Docker Quickstart Guide

Status

If the YOLOv5 CI badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 🚀

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 🚀!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

pip install ultralytics

glenn-jocher commented 8 months ago

@Aflexg hello! I'm glad you're interested in understanding how YOLOv5 works. Let's clarify your questions:

Class Prediction Without Fully Connected Layers: YOLOv5, like other YOLO models, uses a different approach from traditional classification CNNs. Instead of fully connected layers, it uses convolutional layers to predict bounding boxes and class probabilities directly from feature maps. At the end of the network, a 1x1 convolution on each of the three output scales produces a tensor containing the bounding box coordinates, objectness score, and class predictions for every anchor at every grid cell. A 1x1 convolution applies the same weights at each grid cell, so it acts like a small fully connected layer shared across all locations.
BottleneckCSP Layer: BottleneckCSP is a block that combines Cross Stage Partial (CSP) connections with bottleneck structures. It reduces computational cost while maintaining or improving learning capacity. It is not a replacement for the convolution layer: each BottleneckCSP block is itself built from regular conv layers. In the network, the standalone conv layers mainly downsample the feature map between stages, while the BottleneckCSP blocks extract features at a fixed resolution. YOLOv5 uses both to balance performance and efficiency.
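
As a rough illustration of the cost saving, the sketch below counts convolution weights for a stack of bottlenecks run at full width and compares it with a CSP-style block. The CSP-style block splits the features with two 1x1 convs, runs the bottlenecks in one branch at half width, and fuses both branches with another 1x1 conv. The channel width, the number of bottlenecks, and all function names are assumptions chosen for illustration, and BatchNorm and bias parameters are ignored. It is a simplified model, not the exact YOLOv5 module.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a k x k convolution (bias and BatchNorm ignored)."""
    return c_in * c_out * k * k


def bottleneck_params(c, hidden):
    """Bottleneck: 1x1 conv from c to hidden channels, then 3x3 conv back to c."""
    return conv_params(c, hidden, 1) + conv_params(hidden, c, 3)


def plain_stack_params(c, n):
    """n bottlenecks applied to all c channels (hidden width c // 2)."""
    return n * bottleneck_params(c, c // 2)


def csp_block_params(c, n):
    """CSP-style block: only half of the channels pass through the bottlenecks."""
    half = c // 2
    split = 2 * conv_params(c, half, 1)       # two 1x1 convs create the two branches
    deep = n * bottleneck_params(half, half)  # bottlenecks run in one branch only
    fuse = conv_params(2 * half, c, 1)        # 1x1 conv merges the concatenated branches
    return split + deep + fuse


channels, num_bottlenecks = 256, 3  # hypothetical stage configuration
plain = plain_stack_params(channels, num_bottlenecks)
csp = csp_block_params(channels, num_bottlenecks)
print(f"plain stack: {plain} weights")
print(f"CSP block:   {csp} weights ({csp / plain:.0%} of plain)")
```

Note that every term in the count is an ordinary convolution, which is why the network still needs plain conv layers: the CSP block is a way of arranging them.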

The diagram you've referenced may not fully represent the architecture or the final layers responsible for predictions. For a more detailed and accurate representation of YOLOv5's architecture, I recommend checking out the official documentation at our Ultralytics Docs page.
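
To make the role of those final layers concrete, here is a minimal NumPy sketch of a detection head for one output scale. It shows that a 1x1 convolution gives the same result as applying one fully connected layer independently at every grid cell, and how the output channels split into box, objectness, and class values per anchor. The class count, anchor count, feature depth, grid size, and helper names are assumptions for the demo, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

num_classes = 80                       # assumption: COCO-style class count
num_anchors = 3                        # assumption: 3 anchors per output scale
outputs_per_anchor = 5 + num_classes   # x, y, w, h, objectness + class scores
in_channels = 16                       # hypothetical feature depth, small for the demo
grid_h, grid_w = 4, 4                  # hypothetical grid size

# Feature map arriving at the head: (channels, height, width)
features = rng.standard_normal((in_channels, grid_h, grid_w))

# A 1x1 convolution is just a weight matrix (out_channels, in_channels) plus a bias
out_channels = num_anchors * outputs_per_anchor
weight = rng.standard_normal((out_channels, in_channels))
bias = rng.standard_normal(out_channels)


def conv1x1(x, w, b):
    """Apply a 1x1 convolution to x of shape (C_in, H, W)."""
    return np.einsum("oc,chw->ohw", w, x) + b[:, None, None]


def dense_per_cell(x, w, b):
    """Apply the same fully connected layer separately at every grid cell."""
    out = np.empty((w.shape[0], x.shape[1], x.shape[2]))
    for i in range(x.shape[1]):
        for j in range(x.shape[2]):
            out[:, i, j] = w @ x[:, i, j] + b
    return out


head_out = conv1x1(features, weight, bias)
dense_out = dense_per_cell(features, weight, bias)

# Split the channel axis into (anchors, outputs_per_anchor)
predictions = head_out.reshape(num_anchors, outputs_per_anchor, grid_h, grid_w)
class_scores = predictions[:, 5:, :, :]  # raw class scores per anchor and cell

print(head_out.shape)
print(np.allclose(head_out, dense_out))
print(class_scores.shape)
```

Because the convolution slides over the grid, the same classifier weights are reused at every location, so the head works for any input size without a flattening fully connected layer.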

I hope this helps clarify your questions! If you have more, feel free to ask. 😊

Aflexg commented 8 months ago

Thanks a lot for your reply. It seems I got it. Very good and simple explanation.

glenn-jocher commented 8 months ago

You're welcome, @Aflexg! I'm thrilled to hear that the explanation was helpful. If you have any more questions in the future or need further clarification, don't hesitate to reach out. Happy coding and best of luck with your YOLOv5 projects! 😄🚀
