Closed 229249829 closed 2 years ago
@229249829 you can update the architecture by modifying the model yaml file, i.e. see the models directory: https://github.com/ultralytics/yolov5/blob/3d897986c794cad8e00b17718a1be9bcb61ef76d/models/yolov5s.yaml#L3-L48
Do I need to add the new module to the yolo.py file first and then reference it in the model yaml file? After modifying it, do I need to pass the path of my modified model yaml file via the cfg parameter?
@229249829 yes you can create new modules in experimental.py and then import them in yolo.py, and do any special handling they need in the model parser: https://github.com/ultralytics/yolov5/blob/0eb37ad8afdc058472221e7405d6ca201678de97/models/yolo.py#L250-L301
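To make the YAML rows less abstract, here is a minimal sketch of how a row of the form [from, number, module, args] can be scaled by depth_multiple and width_multiple before the layers are built. It is a simplified illustration in the spirit of the parser, not the code from models/yolo.py; scale_row is a hypothetical helper and the two rows are only illustrative.

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scale_row(row, depth_multiple, width_multiple):
    """Apply depth/width scaling to one [from, number, module, args] row.

    Assumes args[0] is the output channel count, as for Conv-like modules.
    """
    f, n, module, args = row
    n = max(round(n * depth_multiple), 1) if n > 1 else n  # scale repeat count
    c2 = make_divisible(args[0] * width_multiple, 8)       # scale channel count
    return [f, n, module, [c2, *args[1:]]]


# yolov5s uses depth_multiple 0.33 and width_multiple 0.50
print(scale_row([-1, 9, "C3", [512]], 0.33, 0.50))          # [-1, 3, 'C3', [256]]
print(scale_row([-1, 1, "Conv", [128, 3, 2]], 0.33, 0.50))  # [-1, 1, 'Conv', [64, 3, 2]]
```

This is why the same YAML layout can describe models of different sizes: only the two multipliers change.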
Hi, where is the neck of the YOLOv5 model, and how can I modify it? There is no neck section in these yaml files.
Hi, I want to add some logic after the detection of objects on my custom dataset, because the model is getting confused between two similar-looking objects. Where should I add the code: which file, and which section?
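@Srishti-55 the "neck" in YOLOv5 refers to the feature-fusion layers (a PANet-style path of upsampling, concatenation and C3 blocks). The yaml files have no separate neck section: those layers are listed in the 'head' section, ahead of the final Detect layer, so that is where you would modify them. For post-detection logic on your custom dataset, the simplest place is usually your inference script (for example 'detect.py'), after non-maximum suppression has run, where the box, confidence and class of each detection are available. You can also work inside the 'forward' method in 'models/yolo.py', but line numbers there differ between versions.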
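As a rough illustration of such post-detection logic, the sketch below relabels low-confidence detections of one class as another. The rule, the 0.6 threshold and the disambiguate helper are hypothetical placeholders; the only assumption about YOLOv5 is that each detection after non-maximum suppression is a row of (x1, y1, x2, y2, conf, cls).

```python
def disambiguate(detections, class_a, class_b, min_conf_b=0.6):
    """Relabel low-confidence class_b detections as class_a.

    detections: list of (x1, y1, x2, y2, conf, cls) tuples.
    The rule and the default threshold are hypothetical placeholders.
    """
    fixed = []
    for x1, y1, x2, y2, conf, cls in detections:
        if cls == class_b and conf < min_conf_b:
            cls = class_a  # not confident enough to keep the class_b label
        fixed.append((x1, y1, x2, y2, conf, cls))
    return fixed


dets = [
    (10, 20, 50, 80, 0.91, 1),  # confident class 1: kept
    (15, 30, 60, 90, 0.42, 1),  # weak class 1: relabelled as class 0
    (5, 5, 40, 40, 0.77, 0),    # class 0: untouched
]
print(disambiguate(dets, class_a=0, class_b=1))
```

A rule based on box size, aspect ratio or a second classifier would slot into the same loop.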
For example, I want to modify the yolov5s backbone. What do I need to do?
Hi @Minhaz1710,
Great question! To modify the YOLOv5s backbone, you'll need to follow these steps:
1. Modify the YAML File:
   - In the models directory, locate the yolov5s.yaml file. This file defines the architecture of the YOLOv5s model, including the backbone.
   - In the yolov5s.yaml file, you'll see sections like backbone and head. You can modify the backbone section to include your custom layers or changes.

   Example:

   backbone:
     # Your custom backbone layers here

2. Add Custom Modules:
   - If you need custom modules, define them in the models/common.py or models/experimental.py files.
   - Import them in models/yolo.py.

3. Update the Model Parsing Logic:
   - In models/yolo.py, ensure that the model parsing logic can handle your custom backbone. This might involve updating the parse_model function to recognize and correctly initialize your custom layers.

4. Training with the Modified Model:
   - Specify the path to your modified YAML file with the --cfg parameter when running the training script.

   Example:

   python train.py --cfg models/custom_yolov5s.yaml --data data/coco.yaml --weights '' --batch-size 16
Here's a brief example of how you might add a custom convolutional layer to the backbone in yolov5s.yaml:

backbone:
  # Custom backbone layers
  [[-1, 1, CustomConv, [64, 3, 1]],  # Custom Conv layer
   [-1, 1, Conv, [128, 3, 2]],
   ...  # Continue with other layers
  ]

And in models/common.py:

class CustomConv(nn.Module):
    # Convolution + batch normalization + activation
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):
        super(CustomConv, self).__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.SiLU() if act else nn.Identity()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

Finally, import your custom module in models/yolo.py:

from models.common import CustomConv

I hope this helps! If you have any further questions, feel free to ask. Happy coding!
Hi glenn-jocher, thank you for your amazing reply. I am struggling to understand the yolov7-tiny.yaml, models/common.py, and models/experimental.py files. I am also not clear on how to add a Feature Pyramid Network (FPN) to the yolov7-tiny structure and, finally, how to import it in models/yolo.py. Can you please show me a step-by-step guideline that I can follow to understand the overall process: understanding the yolov7-tiny.yaml, models/common.py, and models/experimental.py files, adding an FPN to the yolov7-tiny structure, and finally training on images with the modified model? Thank you.
Hi @Minhaz1710,
Thank you for your kind words! I'd be happy to help you understand the process of modifying the YOLOv7-tiny model to include a Feature Pyramid Network (FPN). Here's a step-by-step guide:

The yolov7-tiny.yaml file defines the architecture of the model. It includes sections for the backbone and head of the network. Each layer is defined with its input, repeat count, module type, and parameters.

To add an FPN, you'll need to modify the yolov7-tiny.yaml file to include the FPN layers. Here's a simplified example of how you might add such layers in the YAML file. Note that it only chains convolutions; a full FPN also needs upsampling and concatenation rows that reference earlier layers:

backbone:
  # Your existing backbone layers
  ...
  # Add FPN layers
  [[-1, 1, Conv, [256, 1, 1]],  # FPN layer 1
   [-1, 1, Conv, [128, 3, 1]],  # FPN layer 2
   [-1, 1, Conv, [64, 3, 1]]]   # FPN layer 3
If your FPN requires custom layers, you can define them in models/common.py. Here's an example of a simplified module (a plain stack of convolutions, without the upsampling and lateral connections of a full FPN):

class FPN(nn.Module):
    def __init__(self, c1, c2):
        super(FPN, self).__init__()
        self.conv1 = nn.Conv2d(c1, c2, 1)             # 1x1 conv: change the channel count
        self.conv2 = nn.Conv2d(c2, c2, 3, padding=1)  # 3x3 conv: same spatial size
        self.conv3 = nn.Conv2d(c2, c2, 3, padding=1)  # 3x3 conv: same spatial size

    def forward(self, x):
        x1 = self.conv1(x)
        x2 = self.conv2(x1)
        x3 = self.conv3(x2)
        return x3
Next, import your custom FPN module in models/yolo.py and update the model parsing logic to include it:

from models.common import FPN

def parse_model(d, ch):  # model_dict, input_channels(3)
    ...
    if m is FPN:  # sketch: by this point the parser has resolved the YAML name to a class
        c1, c2 = args
        m_ = FPN(c1, c2)
    ...

Finally, train your model using the modified YAML file. Specify the path to your modified YAML file using the --cfg parameter when running the training script:

python train.py --cfg models/custom_yolov7-tiny.yaml --data data/coco.yaml --weights '' --batch-size 16

For more details on the YOLOv5 architecture and modifications, you can refer to the Ultralytics YOLOv5 Architecture Documentation.
I hope this helps! If you have any further questions, feel free to ask. Happy coding!
Hi, I am Minhaz. I have trained a model on my images using the YOLOv8 algorithm. After training and validation I can see the detailed results. Now I want to see the detection results in the same form as the training and validation results. I attached a picture of my training and validation results. Can anyone please help me see the detection results as charts, using the best.pt file? Thank you!
Hi Minhaz,
Thank you for reaching out! It's great to hear that you've successfully trained your model using YOLOv8. To visualize the detection results using your best.pt file, you can follow these steps:

Run Inference:
First, use your best.pt model to run inference on your test dataset. Because the weights come from YOLOv8, use the yolo command from the ultralytics package rather than the detect.py script in the YOLOv5 repository.

yolo detect predict model=best.pt source=path/to/your/test/images save_txt=True save_conf=True

This command will generate detection results and save them in a run directory under runs/detect.

Evaluate Detection Results:
To visualize the detection results as charts, you can run validation on the test split. This generates various metrics and plots, including precision-recall curves and confusion matrices.

yolo detect val model=best.pt data=path/to/your/data.yaml split=test

Make sure your data.yaml file is correctly configured to point to your test dataset.

Visualize Results:
After validation finishes, you can find the plots and metrics in the run directory printed at the end of the command. They summarize the performance of your model on the test dataset.

If you want to generate custom plots or further analyze the results, you can use the results.csv file that training writes to its run directory. Here's an example of how you can plot the results using Python:

import pandas as pd
import matplotlib.pyplot as plt

# Load the per-epoch training log (adjust the path to your own run directory)
results = pd.read_csv('runs/detect/train/results.csv')
results.columns = results.columns.str.strip()  # some versions pad the column names with spaces

# Plot precision and recall against the epoch number
# (column names as written by recent ultralytics versions; check results.columns if they differ)
plt.figure()
plt.plot(results['epoch'], results['metrics/precision(B)'], label='Precision')
plt.plot(results['epoch'], results['metrics/recall(B)'], label='Recall')
plt.xlabel('Epoch')
plt.ylabel('Value')
plt.title('Precision and Recall per Epoch')
plt.legend()
plt.show()

# Plot other metrics as needed

These steps should help you visualize the detection results in a similar manner to your training and validation results. If you encounter any issues or have further questions, feel free to ask. Happy coding!
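For readers who want to see what the precision and recall numbers in those charts mean, here is a minimal sketch that computes both from raw counts. The counts are hypothetical, and precision_recall is an illustrative helper rather than part of any YOLO package.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # share of detections that are correct
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # share of ground-truth objects that were found
    return precision, recall


# Hypothetical counts for one class at one confidence threshold
p, r = precision_recall(tp=80, fp=20, fn=40)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.80 recall=0.67
```

Sweeping the confidence threshold and recomputing the two values at each step is what traces out a precision-recall curve.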
Thank you @glenn-jocher. Now I want to add an FPN to the yolov8n structure and then use the model for small object detection (training, then later detection using the best.pt file). I will run training and detection on an NVIDIA GPU. Can you please show me a step-by-step guideline that I can follow to understand the overall process: adding an FPN layer to the YOLOv8n architecture, and what needs to change in the yolov8n.yaml file, the training Python file, or any other ultralytics file? Thank you.
Hi @Minhaz1710,
Thank you for your question! Adding a Feature Pyramid Network (FPN) to the YOLOv8n architecture for improved small object detection is a great idea. Here's a step-by-step guide to help you through the process:

First, you'll need to modify the yolov8n.yaml file to include the FPN layers. This file defines the architecture of the YOLOv8n model. Keep in mind that the stock head already fuses features from several scales through upsampling and concatenation, so the rows below are only a simplified illustration of where new layers go.

Example modification:

backbone:
  # Your existing backbone layers
  ...
  # Add FPN layers
  [[-1, 1, Conv, [256, 1, 1]],  # FPN layer 1
   [-1, 1, Conv, [128, 3, 1]],  # FPN layer 2
   [-1, 1, Conv, [64, 3, 1]]]   # FPN layer 3
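If your FPN requires custom layers, you can define them alongside the built-in modules of the ultralytics package (for example in ultralytics/nn/modules/block.py; YOLOv8 has no models/common.py). Here's an example of a simplified module (a plain stack of convolutions, without the upsampling and lateral connections of a full FPN):

class FPN(nn.Module):
    def __init__(self, c1, c2):
        super(FPN, self).__init__()
        self.conv1 = nn.Conv2d(c1, c2, 1)             # 1x1 conv: change the channel count
        self.conv2 = nn.Conv2d(c2, c2, 3, padding=1)  # 3x3 conv: same spatial size
        self.conv3 = nn.Conv2d(c2, c2, 3, padding=1)  # 3x3 conv: same spatial size

    def forward(self, x):
        x1 = self.conv1(x)
        x2 = self.conv2(x1)
        x3 = self.conv3(x2)
        return x3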
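Next, import your custom FPN module in ultralytics/nn/tasks.py, which contains parse_model for YOLOv8, and update the model parsing logic to include it:

from ultralytics.nn.modules.block import FPN  # assumes FPN was added to block.py

def parse_model(d, ch):  # model_dict, input_channels(3)
    ...
    if m is FPN:  # sketch: by this point the parser has resolved the YAML name to a class
        c1, c2 = args
        m_ = FPN(c1, c2)
    ...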
Once you've made your changes, you can train your model using the modified YAML file. With the ultralytics package, pass the path to your modified YAML file as the model argument of the yolo command:

yolo detect train model=custom_yolov8n.yaml data=coco.yaml batch=16

After training, you can run inference using your best.pt model to see the detection results:

yolo detect predict model=best.pt source=path/to/your/test/images save_txt=True save_conf=True

Feel free to ask if you have any further questions. Happy coding!
Can you please tell me, what are the differences between FPN layer and convolutional layer? In details.
Hi @Minhaz1710,
Thank you for your question! I'd be happy to explain the differences between Feature Pyramid Networks (FPN) and convolutional layers.
A convolutional layer is a fundamental building block of Convolutional Neural Networks (CNNs). It applies convolution operations to the input data, which involves sliding a filter (or kernel) over the input to produce feature maps.
Example of a convolutional layer in PyTorch:
import torch.nn as nn
conv_layer = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1)
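To see how kernel size, stride and padding determine the size of the resulting feature map, here is a small sketch of the standard output-size formula for one spatial dimension. conv_out_size is an illustrative helper name, and the 640-pixel input is just an example value.

```python
def conv_out_size(n, k, s=1, p=0):
    """Output length along one spatial dimension: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1


print(conv_out_size(640, k=3, s=1, p=1))  # 640: a 3x3 conv with padding 1 keeps the size
print(conv_out_size(640, k=3, s=2, p=1))  # 320: stride 2 halves the size
```

The layer above (kernel_size=3, stride=1, padding=1) therefore preserves the input resolution and only changes the number of channels.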
An FPN is a more complex architecture designed to enhance the feature extraction capabilities of CNNs, especially for object detection tasks. It builds a multi-scale feature pyramid from a single-resolution input, allowing the network to detect objects at various scales. It does this by combining convolutional layers with upsampling and lateral connections between feature maps of different resolutions.
Example of a simplified FPN-style module in PyTorch (only the convolutional part is shown; the upsampling and lateral connections are omitted):

class FPN(nn.Module):
    def __init__(self, c1, c2):
        super(FPN, self).__init__()
        self.conv1 = nn.Conv2d(c1, c2, 1)             # 1x1 conv: change the channel count
        self.conv2 = nn.Conv2d(c2, c2, 3, padding=1)  # 3x3 conv: same spatial size
        self.conv3 = nn.Conv2d(c2, c2, 3, padding=1)  # 3x3 conv: same spatial size

    def forward(self, x):
        x1 = self.conv1(x)
        x2 = self.conv2(x1)
        x3 = self.conv3(x2)
        return x3

I hope this clarifies the differences for you! If you have any further questions, feel free to ask.
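The part that distinguishes an FPN from a stack of convolutions is the top-down pathway, where a coarse map is upsampled and added to the next finer map. The sketch below shows only that merge step on tiny single-channel maps held in plain Python lists. It is an illustration under simplifying assumptions: the 1x1 lateral and 3x3 smoothing convolutions of a real FPN are left out, and all names and values are made up.

```python
def upsample2x(fmap):
    """Nearest-neighbour upsampling of a 2D list by a factor of 2."""
    return [[v for v in row for _ in range(2)] for row in fmap for _ in range(2)]


def add(a, b):
    """Element-wise sum of two 2D lists with the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down(c3, c4, c5):
    """Merge three backbone maps (fine to coarse) along the top-down pathway."""
    p5 = c5
    p4 = add(c4, upsample2x(p5))  # coarse semantics flow into the middle level
    p3 = add(c3, upsample2x(p4))  # and from there into the finest level
    return p3, p4, p5


c3 = [[100] * 4 for _ in range(4)]  # finest map, 4x4
c4 = [[10] * 2 for _ in range(2)]   # middle map, 2x2
c5 = [[1]]                          # coarsest map, 1x1
p3, p4, p5 = top_down(c3, c4, c5)
print(p4)     # [[11, 11], [11, 11]]
print(p3[0])  # [111, 111, 111, 111]
```

Every output level keeps its own resolution while carrying information from all coarser levels, which is what helps with objects of different sizes.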
Thank you @glenn-jocher. I have been studying YOLO structures for the last five months and have learned many things. Now I am studying YOLOv8. I have attached a modified yolov8.yaml file. Can you please add two layers in the backbone and two layers in the head to improve tiny object detection? If possible, please also mention some layer names that can improve medium-sized object detection accuracy, and their layer structure, for example ( - [-1, 1, Conv, [128, 3, 2]] ), which is a convolutional layer. I think I am now at an intermediate level in understanding and modifying the YOLO architecture. yolov8-modified.txt
Hi @Minhaz1710,
Thank you for your continued interest in YOLO architectures and for sharing your modified yolov8.yaml file! It's great to see your progress and enthusiasm. Let's enhance your YOLOv8 model to improve tiny and medium-sized object detection.

To improve the detection of tiny objects, you can add more convolutional layers to the backbone. These layers can help extract finer details from the input images. Here's an example of adding two convolutional layers:

backbone:
  # Existing layers
  ...
  # Add new layers
  - [-1, 1, Conv, [128, 3, 1]]  # New Conv layer
  - [-1, 1, Conv, [256, 3, 2]]  # New Conv layer

For the head, adding layers that can better aggregate features from different scales can be beneficial. You might consider adding a SPPF (Spatial Pyramid Pooling - Fast) layer and an additional convolutional layer:

head:
  # Existing layers
  ...
  # Add new layers
  - [-1, 1, SPPF, [256, 5]]  # New SPPF layer
  - [-1, 1, Conv, [128, 3, 1]]  # New Conv layer

For medium-sized object detection, you can add layers that help in capturing medium-scale features. A C3 layer (a Cross Stage Partial bottleneck with three convolutions) can be effective. Its args only need the output channel count, because the parser fills in the input channels and the repeat count:

backbone:
  # Existing layers
  ...
  # Add new layers
  - [-1, 1, C3, [256]]  # New C3 layer
  - [-1, 1, Conv, [512, 3, 2]]  # New Conv layer
Here's how your modified yolov8.yaml might look with the new layers added:

backbone:
  # Existing layers
  ...
  # Add new layers for tiny object detection
  - [-1, 1, Conv, [128, 3, 1]]  # New Conv layer
  - [-1, 1, Conv, [256, 3, 2]]  # New Conv layer
  # Add new layers for medium-sized object detection
  - [-1, 1, C3, [256]]  # New C3 layer
  - [-1, 1, Conv, [512, 3, 2]]  # New Conv layer

head:
  # Existing layers
  ...
  # Add new layers
  - [-1, 1, SPPF, [256, 5]]  # New SPPF layer
  - [-1, 1, Conv, [128, 3, 1]]  # New Conv layer

Feel free to adjust the parameters (like the number of filters and kernel sizes) based on your specific needs and dataset characteristics. Inserting rows shifts the index of every later layer, so update any absolute layer indices used further down (for example in Concat and Detect rows).
After modifying the YAML file, you can train your model as usual. For YOLOv8 this means passing the YAML file as the model argument of the yolo command:

yolo detect train model=custom_yolov8.yaml data=coco.yaml batch=16

I hope this helps! If you have any further questions or need additional assistance, feel free to ask. Happy coding!
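When adding Conv rows it helps to track how much each one shrinks the feature map, since every stride-2 layer halves the resolution available for small objects. The sketch below walks a list of rows and reports the cumulative downsampling factor. cumulative_strides is a hypothetical helper, the rows are illustrative, and it assumes that only Conv rows with args [c2, k, s] change the resolution.

```python
def cumulative_strides(rows):
    """Return the total downsampling factor after each [from, repeats, module, args] row."""
    stride = 1
    out = []
    for _from, _repeats, module, args in rows:
        if module == "Conv" and len(args) >= 3:
            stride *= args[2]  # third arg is the stride
        out.append(stride)
    return out


backbone = [
    [-1, 1, "Conv", [64, 3, 2]],
    [-1, 1, "Conv", [128, 3, 2]],
    [-1, 3, "C2f", [128, True]],
    [-1, 1, "Conv", [256, 3, 2]],
    [-1, 1, "Conv", [128, 3, 1]],  # extra stride-1 layer: resolution unchanged
    [-1, 1, "Conv", [256, 3, 2]],  # extra stride-2 layer: resolution halves again
]
strides = cumulative_strides(backbone)
print(strides)             # [2, 4, 4, 8, 8, 16]
print(640 // strides[-1])  # 40: feature-map side for a 640-pixel input
```

For tiny objects, stride-1 additions keep more spatial detail than stride-2 ones, which is worth weighing when choosing between the two.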
Thank you @glenn-jocher for the informative reply. I attached a picture which clearly depicts the convolutional layer structure. It consists of a 2D convolutional layer, a 2D batch normalization and a SiLU activation function.
Can you please tell me the structure of the C3 layer, how it works, and other details about this layer? Thank you again.
Hi @Minhaz1710,
Thank you for your kind words and for sharing the detailed image of the convolutional layer structure!
The C3 layer, or Cross Stage Partial Network (CSPNet) layer, is a more advanced module designed to enhance feature extraction and reduce computational cost. It is particularly effective in improving the learning capability of the network by integrating gradient changes into the feature map.
The C3 layer typically consists of three convolutional layers and a stack of bottleneck blocks.
Here's a more detailed look at the C3 layer structure:

class C3(nn.Module):
    def __init__(self, in_channels, out_channels, num_blocks, shortcut=True, groups=1, expansion=0.5):
        super(C3, self).__init__()
        hidden_channels = int(out_channels * expansion)
        self.cv1 = Conv(in_channels, hidden_channels, 1, 1)  # branch that feeds the bottlenecks
        self.cv2 = Conv(in_channels, hidden_channels, 1, 1)  # bypass branch
        self.cv3 = Conv(2 * hidden_channels, out_channels, 1)  # Concatenate and fuse
        self.m = nn.Sequential(*[Bottleneck(hidden_channels, hidden_channels, shortcut, groups) for _ in range(num_blocks)])

    def forward(self, x):
        return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))

- cv1, cv2, and cv3 are convolutional layers used for initial transformation, bypass, and final fusion.
- A sequence of Bottleneck blocks (self.m) processes part of the input feature map.
- The outputs of the two branches are concatenated and fused by the final convolution (cv3).

The C3 layer is designed to improve the learning capability of the network by integrating gradient changes into the feature map, making it particularly effective for tasks like object detection.
I hope this detailed explanation helps! If you have any further questions or need additional clarification, feel free to ask. Happy coding!
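The channel arithmetic inside a C3 block can be checked without building any layers. The sketch below lists the (input, output) channels of each part for a given configuration; c3_channels is an illustrative helper, and 128 and 256 are example values.

```python
def c3_channels(c1, c2, e=0.5):
    """Return (in, out) channel pairs for each part of a C3 block."""
    c_ = int(c2 * e)  # hidden channels used by both branches
    return {
        "cv1": (c1, c_),          # branch that feeds the bottlenecks
        "cv2": (c1, c_),          # bypass branch
        "bottlenecks": (c_, c_),  # each bottleneck keeps the channel count
        "cv3": (2 * c_, c2),      # fuses the concatenated branches
    }


for name, (cin, cout) in c3_channels(128, 256).items():
    print(f"{name}: {cin} -> {cout}")
```

Because both branches work on c_ channels rather than c2, the bottlenecks run on half the width (with e=0.5), which is where the savings in computation come from.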
I am really thankful to you @glenn-jocher.
Hi @Minhaz1710,
Thank you for your questions and for your continued interest in YOLOv5! I'm happy to help clarify and guide you further.
Yes, the C3 layer and the CSPNet (Cross Stage Partial Network) layer refer to the same concept. The C3 layer is an implementation of the CSPNet architecture, which is designed to improve the learning capability and efficiency of the network by integrating gradient changes into the feature map.
To study different layers used in YOLOv5, you can refer to the models/common.py file in the YOLOv5 repository. This file contains the definitions of various layers and modules used in the architecture.
Here are some example code snippets from models/common.py, covering a few of the layers you might encounter, to help you get started:

class Conv(nn.Module):
    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, act=True):
        super(Conv, self).__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p), groups=g, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.SiLU() if act else nn.Identity()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class C3(nn.Module):
    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        super(C3, self).__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(2 * c_, c2, 1)  # concatenate
        self.m = nn.Sequential(*[Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)])

    def forward(self, x):
        return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))

class SPPF(nn.Module):
    def __init__(self, c1, c2, k=5):
        super(SPPF, self).__init__()
        c_ = c1 // 2  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_ * 4, c2, 1, 1)
        self.m = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)

    def forward(self, x):
        x = self.cv1(x)
        return self.cv2(torch.cat((x, self.m(x), self.m(self.m(x)), self.m(self.m(self.m(x)))), 1))

For a deeper understanding, you can explore the YOLOv5 repository and read through the models/yolo.py and models/common.py files. Additionally, the YOLOv5 documentation provides valuable insights and explanations about the architecture and its components.
Feel free to reach out if you have any more questions or need further assistance. Happy studying!
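One detail of SPPF that is easy to miss is why it applies the same 5x5 max pool three times in a row: with stride 1, chaining pools grows the effective window to 9 and then 13. The sketch below checks this on a 1D signal in plain Python. It is an illustration rather than code from the repository, and maxpool1d is a hypothetical helper that pads with negative infinity so the output keeps the input length.

```python
NEG_INF = float("-inf")


def maxpool1d(xs, k):
    """Stride-1 max pooling with k // 2 padding of -inf on each side."""
    r = k // 2
    padded = [NEG_INF] * r + list(xs) + [NEG_INF] * r
    return [max(padded[i:i + k]) for i in range(len(xs))]


signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
twice = maxpool1d(maxpool1d(signal, 5), 5)
thrice = maxpool1d(twice, 5)
print(twice == maxpool1d(signal, 9))    # True: two 5-wide pools act like one 9-wide pool
print(thrice == maxpool1d(signal, 13))  # True: three act like one 13-wide pool
```

Reusing the intermediate results this way gives the multi-scale pooled features at lower cost than running three separate large pools.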
Hello, this is Minhaz again. Can you please write an FPN layer just before the SPPF layer, and describe what the values of the added layer mean? See below. This is my learning experiment. Thank you.
backbone:
Here, please add a FPN layer.
Hi Minhaz,
Sure, here's how you can add an FPN layer before the SPPF layer in your backbone:

backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, FPN, [1024, 256]]  # 9, new FPN layer
  - [-1, 1, SPPF, [1024, 5]]  # 10

In this context:
- FPN: The Feature Pyramid Network layer (a custom module that has to be defined and made known to the parser first).
- 1024: Input channels.
- 256: Output channels.

Because the new row moves SPPF from index 9 to index 10, any absolute layer index at or after that position that is used in the head (for example in Concat or Detect rows) needs to be increased by one. This setup should help in enhancing feature extraction before the SPPF layer. Let me know if you need further assistance.
Thank you @glenn-jocher.
We know that a convolutional block consists of a 2D convolutional layer, a 2D batch normalization layer and a SiLU activation function.
Can you please tell me which layers make up the FPN block, and what those layers do?
Thank you for your question. The FPN block typically consists of convolutional layers, upsampling layers, and lateral connections. These layers work together to combine high-level semantic features with lower-level features, enhancing multi-scale feature representation for better object detection. In the YOLOv5 repository this structure is not a single class in models/common.py; it is expressed in the head section of the model YAML through Upsample, Concat and C3 rows.
I tried to detect images using trained best.pt file. I used yolov7-tiny model for the training. Now, I am trying to detect objects using the best.pt file in google colab, but there is an error showing. the code is
!python detect.py --weights /content/gdrive/MyDrive/yolov7tinybest.pt/best.pt --conf-thres 0.5 --img-size 640 --source /content/gdrive/MyDrive/test_images
and the error is
Namespace(weights=['/content/gdrive/MyDrive/yolov7tinybest.pt/best.pt'], source='/content/gdrive/MyDrive/test_images', img_size=640, conf_thres=0.5, iou_thres=0.45, device='', view_img=False, save_txt=False, save_conf=False, nosave=False, classes=None, agnostic_nms=False, augment=False, update=False, project='runs/detect', name='exp', exist_ok=False, no_trace=False)
YOLOR 52f8176 torch 2.3.1+cu121 CUDA:0 (Tesla T4, 15102.0625MB)

Traceback (most recent call last):
  File "/content/gdrive/MyDrive/yolov7/utils/google_utils.py", line 26, in attempt_download
    assets = [x['name'] for x in response['assets']]  # release assets
KeyError: 'assets'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/content/gdrive/MyDrive/yolov7/detect.py", line 200, in
What is the problem, and how can I solve it?
I want to calculate the FPS of my trained YOLOv8 model. I successfully detected objects with it using the trained best.pt file on my NVIDIA GPU. To detect the objects I ran: yolo detect predict model=runs\detect\train81\weights\best.pt source="G:\yolov8-gpu-env-last\data\images200\images\test" save=True Can you please write the code I can use to calculate the FPS of the trained best.pt model? Thank you.
To calculate FPS for your YOLOv8 model, you can modify your detection script to include timing. Here's an example:
import time
from ultralytics import YOLO
model = YOLO('runs/detect/train81/weights/best.pt')
source = 'G:/yolov8-gpu-env-last/data/images200/images/test'
start_time = time.time()
results = model.predict(source=source, save=True)
end_time = time.time()
fps = len(results) / (end_time - start_time)
print(f"FPS: {fps}")
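This script will load your model, run predictions, and calculate the FPS based on the total time taken. Note that the measured time also includes reading the images, saving the annotated outputs (save=True) and the first-call warm-up, so the pure inference rate will be higher than the printed value.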
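If you want a number that excludes warm-up effects, one option is to time repeated calls after a few untimed ones. The sketch below shows that pattern with a stand-in function so it runs without a model. measure_fps and fake_infer are hypothetical names, and in practice you would pass a function that calls your model on one image.

```python
import time


def measure_fps(fn, inputs, warmup=3):
    """Call fn on each input and return calls per second, excluding warm-up calls."""
    for item in inputs[:warmup]:
        fn(item)  # warm-up: results and timing are discarded
    start = time.perf_counter()
    for item in inputs:
        fn(item)
    elapsed = time.perf_counter() - start
    return len(inputs) / elapsed


def fake_infer(_image):
    """Stand-in for a model call; sleeps about 5 ms per image."""
    time.sleep(0.005)


fps = measure_fps(fake_infer, list(range(20)))
print(f"FPS: {fps:.1f}")
```

time.perf_counter is used instead of time.time because it is a monotonic, high-resolution clock intended for measuring short intervals.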
Question
How can I modify the YOLOv5 model to create my own model? In which file do you make the changes?
Additional context
I want to add my own ideas to the YOLOv5 model, but I don't know in which folder to modify the model.