Open tjasmin111 opened 5 months ago
@tjasmin111 hello! To change the activation function in YOLOv8, you'll need to modify the model's architecture configuration file, which is typically a YAML file. Look for the activation
field within the layers you wish to change and update it to the desired activation function. As for pruning, YOLOv8 does not include a built-in pruning method, so you would need to implement this yourself or use a third-party tool.
For a detailed guide on how to customize the model architecture, please refer to our documentation on model configuration at https://docs.ultralytics.com. Keep in mind that any modifications to the model's architecture or training process should be done with care to ensure the integrity of your experiments. Good luck with your comparisons! ๐
Thanks. Is there a way to change activation function inside data.yaml instead?
@tjasmin111 no, the activation function cannot be changed within the data.yaml
file. It must be modified in the model's architecture configuration file (YAML). Please refer to the documentation for guidance on customizing the model architecture. ๐ ๏ธ
I looked at documentations, couldn't find it. Here is the yolov8.yaml
. Can you please just say how can I change the activation function to LeakyRelu
here?
# Ultralytics YOLO ๐, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect
# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
# [depth, width, max_channels]
n: [0.33, 0.25, 1024] # YOLOv8n summary: 225 layers, 3157200 parameters, 3157184 gradients, 8.9 GFLOPs
s: [0.33, 0.50, 1024] # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients, 28.8 GFLOPs
m: [0.67, 0.75, 768] # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients, 79.3 GFLOPs
l: [1.00, 1.00, 512] # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
x: [1.00, 1.25, 512] # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs
# YOLOv8.0n backbone
backbone:
# [from, repeats, module, args]
- [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
- [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
- [-1, 3, C2f, [128, True]]
- [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
- [-1, 6, C2f, [256, True]]
- [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
- [-1, 6, C2f, [512, True]]
- [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
- [-1, 3, C2f, [1024, True]]
- [-1, 1, SPPF, [1024, 5]] # 9
# YOLOv8.0n head
head:
- [-1, 1, nn.Upsample, [None, 2, 'nearest']]
- [[-1, 6], 1, Concat, [1]] # cat backbone P4
- [-1, 3, C2f, [512]] # 12
- [-1, 1, nn.Upsample, [None, 2, 'nearest']]
- [[-1, 4], 1, Concat, [1]] # cat backbone P3
- [-1, 3, C2f, [256]] # 15 (P3/8-small)
- [-1, 1, Conv, [256, 3, 2]]
- [[-1, 12], 1, Concat, [1]] # cat head P4
- [-1, 3, C2f, [512]] # 18 (P4/16-medium)
- [-1, 1, Conv, [512, 3, 2]]
- [[-1, 9], 1, Concat, [1]] # cat head P5
- [-1, 3, C2f, [1024]] # 21 (P5/32-large)
- [[15, 18, 21], 1, Detect, [nc]] # Detect(P3, P4, P5)
@tjasmin111 to change the activation function to LeakyReLU
in your YOLOv8 model configuration, you'll need to locate each Conv
and C2f
module in the backbone
and head
sections of your YAML file and add an activation
argument with the value LeakyReLU
. For example, change lines like - [-1, 1, Conv, [64, 3, 2]]
to - [-1, 1, Conv, [64, 3, 2, LeakyReLU]]
. Repeat this for all relevant layers. Remember to maintain the YAML file's structure and indentation. ๐ ๏ธ
Got it thanks! Is there a way to verify it when I run train? Does it print it?
@tjasmin111 Yes, when you start training, the model summary printed to the console will include the activation functions used in each layer. You can verify the changes there. Happy training! ๐
@tjasmin111 to change the activation function to
LeakyReLU
in your YOLOv8 model configuration, you'll need to locate eachConv
andC2f
module in thebackbone
andhead
sections of your YAML file and add anactivation
argument with the valueLeakyReLU
. For example, change lines like- [-1, 1, Conv, [64, 3, 2]]
to- [-1, 1, Conv, [64, 3, 2, LeakyReLU]]
. Repeat this for all relevant layers. Remember to maintain the YAML file's structure and indentation. ๐ ๏ธ
An error occur when I change - [-1, 1, Conv, [64, 3, 2]]
to - [-1, 1, Conv, [64, 3, 2, LeakyReLU]]
.
ValueError: Invalid padding string 'LeakyReLU', should be one of {'same',valid'}
@tjasmin111 to change the activation function to
LeakyReLU
in your YOLOv8 model configuration, you'll need to locate eachConv
andC2f
module in thebackbone
andhead
sections of your YAML file and add anactivation
argument with the valueLeakyReLU
. For example, change lines like- [-1, 1, Conv, [64, 3, 2]]
to- [-1, 1, Conv, [64, 3, 2, LeakyReLU]]
. Repeat this for all relevant layers. Remember to maintain the YAML file's structure and indentation. ๐ ๏ธ
I located anaconda3\Lib\site-packages\ultralytics\cfg\models\v8\yolov8.yaml
file and added the LeakyReLU arguments to those lines. But no effect. I also don't see them in the summary when training. Are you sure this approach works?
Using Netron, I still see sigmoid as the function in the .onnx model:
@tjasmin111 apologies for the confusion earlier. To correctly change the activation function to LeakyReLU
, you need to specify it as a string in the YAML configuration. For example, change - [-1, 1, Conv, [64, 3, 2]]
to - [-1, 1, Conv, [64, 3, 2, 'LeakyReLU']]
. Ensure that 'LeakyReLU' is enclosed in quotes to be recognized as a string. After making these changes, the model summary during training should reflect the updated activation functions. If you continue to encounter issues, please ensure you are editing the correct YAML file that is being used for training. ๐ ๏ธ
I see. Do I need to pass an argument? 'LeakyReLU (0.1)' ? Does it make a difference?
@tjasmin111 No additional arguments are needed; simply use 'LeakyReLU'
in the YAML file. The default negative slope of 0.1 is automatically applied for LeakyReLU in YOLOv8. ๐
I tried this fix but it didn't work, but I managed to change it by adding a parameter to yolov8.yaml
# Parameters
...
activation: nn.LeakyReLU(0.1) # <----- Conv() activation used throughout entire YOLOv8 model
@aleshem Great to hear you managed to change the activation function. Your approach of adding a global activation
parameter to the yolov8.yaml
is indeed a valid method to update the activation function across the entire model. Keep up the good work! ๐
Hello,
I am in the process of modifying the YOLOv8 to detect two classes, person and non-person.
Should YOLOv8 be adjusted regarding the activation function and set to sigmoid in the output layer because of the binary classification? If so, it is not clearly defined where this is located. Where can I find them and then adjust them accordingly?
The activation function is one thing that I have to consider when fine-tuning. I know that YOLO is very well fine-tuned, should I pay attention to other hyperparameters and adjust accordingly if I only have two classes in my dataset?
Can somebody help me please!
Thank you very much
@BilAlHomsi hello,
For binary classification with YOLOv8, the output layer already uses a sigmoid activation function by default, so no changes are needed for that. When fine-tuning for two classes, focus on updating the nc
(number of classes) in your YAML file to 2
. Other hyperparameters are generally well-optimized, but you may consider adjusting the anchor
sizes if your dataset has significantly different object scales compared to the pre-trained model. Fine-tuning should be straightforward; just ensure your dataset is correctly labeled for the 'person' and 'non-person' classes. Good luck with your project! ๐
@aleshem @glenn-jocher sorry for bothering but i want to change yolov8 activation function to something other than leaky relu (swich -mish) which they are not defined in yolov8 so where can i define this new activation function(file name contains activation function definition?) and how to train my model on it? thanks in advance
@ahmedfarouk99 to change the activation function and pruning settings in YOLOv8, you will need to modify the model's architecture configuration file (YAML file). The activation function is defined within the layers of the neural network, and pruning is a model optimization technique that is not directly related to the architecture but rather to the training process.
Here's how you can change the activation function and address pruning:
Change Activation Function:
yolov8n.yaml
, where n
might be a specific model size or variant.activation
keyword within the layers you wish to modify. By default, YOLOv8 uses the SiLU (Sigmoid Linear Unit) activation function, not the Sigmoid function. If you want to change it to LeakyReLU, for example, you would replace SiLU
with LeakyReLU
.Address Pruning:
Here's an example of how to change the activation function in a YAML file:
# This is a simplified example of a layer in the YOLOv8 YAML file
# Before change:
- filters: 256
size: 3
stride: 1
activation: silu # SiLU activation function
# After change:
- filters: 256
size: 3
stride: 1
activation: leakyrelu # Changed to LeakyReLU activation function
After making these changes, you can train your YOLOv8 model using the CLI with the updated YAML file:
yolo detect train data=coco128.yaml model=custom_yolov8n.yaml epochs=100 imgsz=640
Remember to replace custom_yolov8n.yaml
with the path to your modified YAML file.
For pruning, you would need to look for third-party libraries or tools that support pruning for PyTorch models, as YOLOv8 is built on PyTorch. Pruning is typically done after training, so you would train your model first and then apply pruning afterward.
Keep in mind that changing the activation function and applying pruning can significantly affect the performance of your model, so it's essential to evaluate the model thoroughly after making these changes.
@glenn-jocher thanks for your reply but in โyolov8.yamlโ file there is only the backbone and head . The activation function is not mentionedโฆ Also i need to know where the file name to add a definition for new activation function to use it in training my model.. Thanks in advance
@ahmedfarouk99 to change the activation function and pruning settings in YOLOv8, you will need to modify the model's architecture configuration file (YAML file). The activation function is defined within the layers of the neural network, and pruning is a model optimization technique that is not directly related to the architecture but rather to the training process.
Here's how you can change the activation function and address pruning:
Change Activation Function:
yolov8n.yaml
, where n
might be a specific model size or variant.activation
parameter within the layers and change it from sigmoid
to leaky_relu
or any other desired activation function.Address Pruning:
Here's an example of how to change the activation function in the YAML file:
# This is a simplified example of a layer in the YAML file
# Before
- from: [-1]
type: Conv
activation: sigmoid
# After
- from: [-1]
type: Conv
activation: leaky_relu
Please note that changing the activation function and applying pruning can significantly affect the performance of your model. It's important to retrain the model after making such changes to ensure that it learns the appropriate feature representations with the new settings.
For pruning, you might need to use additional tools or scripts specifically designed for pruning neural networks. Pruning is not a feature that is typically toggled on or off with a simple configuration change.
If you're using the CLI and want to apply these changes, you would first modify the YAML file as described above, and then use the CLI to train the model with the updated configuration. For example:
yolo detect train data=coco128.yaml model=custom-yolov8n.yaml
Remember to replace custom-yolov8n.yaml
with the path to your modified YAML file.
@glenn-jocher i donโt know you keep answering with the same answer which is not my question.. I ask about: In โyolov8.yamlโ file there is only the backbone and head . The activation function is not mentionedโฆ Also i need to know where the file name to add a definition for new activation function to use it in training my model.. Thanks in advance
Just use the optional "activation" key in your model YAML. See yolov6.yaml for an example of this in use.
@glenn-jocher thanks for your care and sorry for my too much questions but Iโm working about yolov8 not yolov6โฆ. What about yolov8
@ahmedfarouk99 to change the activation function and pruning settings in YOLOv8, you will need to modify the model's architecture configuration file (YAML file). The activation function and pruning are defined within the model's architecture, and these settings are not typically changed via command-line arguments or simple configuration settings.
Here's how you can modify the activation function and pruning settings:
Change Activation Function:
activation
parameter within the YAML file. By default, it might be set to sigmoid
or another function.activation
parameter to the desired function, such as LeakyReLU
or ReLU
.Change Pruning Settings:
Here's an example of what part of the YAML file might look like before and after changing the activation function:
Before:
# Example layer before changing activation function
- id: 1
type: Conv
activation: sigmoid
...
After:
# Example layer after changing activation function
- id: 1
type: Conv
activation: LeakyReLU
...
Please note that changing the activation function and implementing pruning can significantly affect the model's performance. It's important to retrain the model after making such changes to ensure that the weights are optimized for the new configuration.
To compare your trained YOLOv8 and YOLOv7 models, ensure that you retrain both models with the same dataset and under the same conditions after making the necessary changes to the YOLOv8 model.
Keep in mind that modifying the model architecture and retraining can be complex and may require a good understanding of neural networks and the specific implementation details of YOLOv8. If you're not comfortable making these changes, you might want to consider reaching out to the community or the maintainers of the YOLOv8 repository for further assistance.
@glenn-jocher i donโt know you keep answering with the same answer which is not my question.. the yolov8n.yaml does not contain any activation parameter as you mentioned above
so where can i find activation parameter?
@ahmedfarouk99 to change the activation function and pruning settings in YOLOv8, you will need to modify the model's architecture configuration file (YAML file). The activation function and pruning settings are defined within this file, and by altering them, you can customize the behavior of the model to match that of YOLOv7 for your comparison.
Here's how you can make these changes:
Change Activation Function:
yolov8n.yaml
.activation
parameter within the file. By default, it might be set to silu
(Sigmoid Linear Unit) for YOLOv8.activation
parameter to leaky
to use the Leaky ReLU activation function, which is used in YOLOv7.Change Pruning Settings:
For the activation function change, your YAML file's relevant section might look something like this after modification:
# Before modification (example)
# ...
backbone:
# ...
- from: -1
args: [256, 3, 2]
activation: silu
# ...
# After modification
# ...
backbone:
# ...
- from: -1
args: [256, 3, 2]
activation: leaky
# ...
Please note that changing the activation function and implementing pruning can significantly affect the model's performance, and these changes should be tested thoroughly.
Once you've made the changes, you can train your model using the CLI with the modified YAML file:
yolo detect train data=coco128.yaml model=modified_yolov8n.yaml epochs=100 imgsz=640
Remember that YOLOv8 is designed with specific architectural choices that may not directly correspond to YOLOv7, so even with these changes, the models may not be directly comparable. Additionally, the pruning feature is not something that can be toggled with a simple configuration change in YOLOv8 and would require custom implementation.
๐ Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.
For additional resources and information, please see the links below:
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLO ๐ and Vision AI โญ
Hi @glenn-jocher,
I have a question about the YOLOv8 activation function. I noticed here that you mentioned that for the classification model, it by default uses the sigmoid and then SiLU functions, but in issue #2417 you mentioned that the default activation function used by YOLOv8 is "Mish," so my question is: what is the default activation function for the YOLOv8 segmentation model.
@Mehreganbeiki hi there! ๐
For the YOLOv8 segmentation model, the default activation function is SiLU (Sigmoid Linear Unit). The Mish activation is primarily used in the backbone for enhancing feature representation. Activation functions can vary across different parts of the model to optimize performance for specific tasks, like detection, classification, or segmentation.
If you're looking into modifying or specifying activation functions, it's typically done within the model's YAML configuration file. Here's a quick example snippet on how you might see it for a layer:
- from: -1
args: [256, 3, 2]
activation: silu # This is where the activation function is specified
Hope this clears things up! If you have any more questions, feel free to ask. Happy modeling! ๐
Thank you for clarifying this, @glenn-jocher !
@Mehreganbeiki you're welcome! I'm glad I could help clarify things for you ๐. If you have any more questions or need further examples, just let me know. Happy to assist anytime!
Hi @glenn-jocher , If i change the activation function how can i identify the change is done or any issue is there, also where will be the results visible.
@Akshara211 hi there! ๐
After changing the activation function, a solid way to see the effect is by evaluating your model's performance on a validation set. Look for changes in metrics like accuracy, mAP, etc. Additionally, pay attention to the training logs for any issues or warnings.
For code changes, you'd typically modify the activation function in your model's YAML file. For example, changing from silu
to leaky
:
activation: leaky
Results and any potential issues will primarily be visible in your training/validation logs. Keep an eye on those! Happy coding! ๐
This is the contents of my .yaml file where should i add and change activation function.
train: /project_1/yolo_data_structure_4871count/train val: /project_1/yolo_data_structure_4871count/val
nc: 3 names: ['no_fire', 'fire', 'smoke']
@Akshara211 hi there! ๐ To include or modify an activation function in your YAML configuration, you typically need to adjust the model architecture part of the file. In the YAML snippet you've shared, that section isn't visible.
As an example, if you're integrating the activation function into a specific layer, it might look something like this:
backbone:
# Example layer
- from: -1
args: [256, 3, 2]
activation: leaky # Change 'leaky' to your desired activation function
However, model architecture details (like layers and activation functions) are usually defined in a separate section of a model's YAML file or in a different YAML file altogether, specifying the modelโs structure. If your current YAML doesn't include the model architecture, you'll likely need to modify the YAML file that does. This might be one of the provided templates if you're using a pre-existing architecture from the YOLOv8 repo.
Hope this helps! If you need more specific guidance, feel free to ask. Happy modeling! ๐
Hi @glenn-jocher , What is the default activation function of YOLOv8 Instance Segmentation? For segemtation what are the other activation functions that we can use?
@Akshara211 SILU-LRelu-Mish-HardSwish
What is the default activation function of YOLOv8 Instance Segmentation?
For YOLOv8 Instance Segmentation, the default activation function is SiLU (Sigmoid Linear Unit) ๐. SiLU offers a good balance between linearity and non-linearity, making it well-suited for the intricate tasks involved in instance segmentation.
What is the syntax for using different activation functions I found that for LeakyReLU i can use
activation: nn.LeakyReLU(0.1)
Like this what can i use for Mish and HardSwish will nn.Mish and nn.HardSwish work ?
@Akshara211 hey there! ๐
Absolutely, you're on the right track! For using Mish and HardSwish activation functions, you can indeed use nn.Mish()
and nn.HardSwish()
respectively, just like how youโve done with LeakyReLU. Here's a quick example:
activation: nn.Mish()
or
activation: nn.HardSwish()
Just ensure your environment supports these activations (e.g., recent PyTorch versions). Let me know if there's anything else you need help with! Happy coding! ๐
Thank you for the reply @glenn-jocher
You're welcome! If you have any more questions or need further assistance, feel free to ask. Happy to help ๐!
Sir i want to know by default activation function of yolov8 detection is silu or mesh?
Hi there! The default activation function for YOLOv8 detection models is SiLU (Sigmoid Linear Unit) ๐. Hope this helps! Let me know if you have any more questions.
Hi there! The default activation function for YOLOv8 detection models is SiLU (Sigmoid Linear Unit) ๐. Hope this helps! Let me know if you have any more questions.
Hi, I have changed the activation function in yolov8-seg.yaml. However, the training results for both nn.ReLU, nn.ReLU6, nn.ELU, nn.LeakyReLU, nn.PReLU are exactly same. May I know why this error occurred?
Hi there! It's intriguing that the training results are the same across different activation functions. This could possibly be due to several factors:
Ensure that the changes to activation functions are being correctly applied during the model initialization. Sometimes the simplest errors like syntax or overlooked configurations could cause this. A quick tip: use debug statements to confirm the activation functions are explicitly being set as expected. Here is a simple way to check in your model creation code:
import torch.nn as nn
print('Current Activation Function:', isinstance(model.activation_layer, nn.ReLU)) # Example for ReLU
If the problem persists, you might want to experiment with varying more settings or using different segments of your data to see how the models respond. Hope this helps! Let me know how it goes!
Hi there! It's intriguing that the training results are the same across different activation functions. This could possibly be due to several factors:
- Dataset Characteristics: The nature of your dataset may not significantly accentuate the differences between these activation functions.
- Model Configuration: Double-check if the activation functions are properly integrated and saved in your configuration file.
- Other Hyperparameters: Factors like learning rate, optimizer, and epochs might be dominating the influence of activation function changes.
Ensure that the changes to activation functions are being correctly applied during the model initialization. Sometimes the simplest errors like syntax or overlooked configurations could cause this. A quick tip: use debug statements to confirm the activation functions are explicitly being set as expected. Here is a simple way to check in your model creation code:
import torch.nn as nn print('Current Activation Function:', isinstance(model.activation_layer, nn.ReLU)) # Example for ReLU
If the problem persists, you might want to experiment with varying more settings or using different segments of your data to see how the models respond. Hope this helps! Let me know how it goes!
But there is error for this.
@Jane225 hi! It looks like there might be a syntax or implementation error in how the activation function is being set or checked within your model. To better assist, could you please confirm that the activation function is actually being used in your model layers? This can sometimes be overlooked if the configuration isnโt applied correctly. Also, make sure that the path in your model script correctly points to where you've defined your activation. If the right hooks aren't in place to apply these changes across all convolutional layers, you wonโt see different behavior across activations.
To debug this issue, you might want to directly print the activation layers from your model's defined architecture to see what's being used. Hereโs how you might do that:
for layer in model.modules():
if isinstance(layer, nn.ReLU) or isinstance(layer, nn.LeakyReLU):
print(layer)
This snippet will help confirm whether ReLU or LeakyReLU is being actively set in your network at runtime. Keep me posted! ๐
@Jane225 hi! It looks like there might be a syntax or implementation error in how the activation function is being set or checked within your model. To better assist, could you please confirm that the activation function is actually being used in your model layers? This can sometimes be overlooked if the configuration isnโt applied correctly. Also, make sure that the path in your model script correctly points to where you've defined your activation. If the right hooks aren't in place to apply these changes across all convolutional layers, you wonโt see different behavior across activations.
To debug this issue, you might want to directly print the activation layers from your model's defined architecture to see what's being used. Hereโs how you might do that:
for layer in model.modules(): if isinstance(layer, nn.ReLU) or isinstance(layer, nn.LeakyReLU): print(layer)
This snippet will help confirm whether ReLU or LeakyReLU is being actively set in your network at runtime. Keep me posted! ๐
Initially there is no activation declared in yolov8-seg.yaml, and I added it. But it seems like not recognized. Or is there any .yaml containing activation? As it is different with yolov6.yaml, which has activation declared. Also, there is no activation layer in the model using your suggested method:
Search before asking
Question
I'm using Yolov8 for object detection using CLI. I learned one of the differences of yolov7 and yolov8 is these (correct me if I'm wrong):
What is the simplest way to change these in yolov8 (in CLI or via the YAML file)? The reason is I need to compare my trained yolov8 and yolov7 models.
Additional
No response