Petros626 closed this issue 1 week ago
Hello @Petros626, thank you for your interest in Ultralytics! We recommend visiting the Docs for detailed explanations on key topics like optimization and hyperparameters, which might address similar questions. You can also find helpful Python and CLI usage examples there.
If your query is a Question, providing further details, such as relevant coding context or how you are observing this behavior during training, may help us offer better assistance.
If this turns out to be related to a Bug, we kindly request a minimum reproducible example. This will allow us to debug and understand the issue more effectively.
If you're using an older version of Ultralytics, try upgrading to the latest version to ensure you have the latest fixes and improvements:
pip install -U ultralytics
The optimizer details you've shared might be influenced by configurations from the latest release. Full requirements are available for review, and we recommend using Python>=3.8 in combination with PyTorch>=1.8.
For advanced workflows, explore options like Docker, or use the AWS and GCP guides for enterprise-scale projects.
This badge indicates the current build status of all Ultralytics CI tests. CI continuously ensures the correct operation of YOLO Modes and Tasks across diverse setups on macOS, Windows, and Ubuntu.
Please note, this is an automated response. An Ultralytics engineer will follow up with you soon to provide additional assistance.
@Petros626 the Ultralytics YOLO optimizer applies weight decay (L2 regularization) only to the weights of convolutional and linear layers. Biases and the weights of normalization layers (such as BatchNorm) are excluded so that their distributions stay stable, which is why the log reports two groups with decay=0.0 and one with decay=0.0005. This selective approach improves training performance. For implementation details, see the build_optimizer source code, which groups parameters into weight-decay and no-decay categories.
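The grouping described in this reply can be sketched in plain PyTorch. This is a minimal illustration under stated assumptions, not the actual Ultralytics build_optimizer code: the function name split_param_groups, the toy model, and the default decay value (taken from the log line in the question) are all hypothetical choices made for the example.

```python
import torch
from torch import nn


def split_param_groups(model: nn.Module, decay: float = 5e-4):
    """Hypothetical helper: split parameters into decay / no-decay groups."""
    # Common normalization layers; extend this tuple for other norm types.
    norm_types = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d,
                  nn.GroupNorm, nn.LayerNorm)

    decay_weights, norm_weights, biases = [], [], []
    for module in model.modules():
        # recurse=False visits each parameter exactly once, via its owning module.
        for name, param in module.named_parameters(recurse=False):
            if "bias" in name:
                biases.append(param)          # biases: no weight decay
            elif isinstance(module, norm_types):
                norm_weights.append(param)    # norm-layer weights: no weight decay
            else:
                decay_weights.append(param)   # conv/linear weights: weight decay

    return [
        {"params": decay_weights, "weight_decay": decay},
        {"params": norm_weights, "weight_decay": 0.0},
        {"params": biases, "weight_decay": 0.0},
    ]


# Toy model used only to exercise the grouping rule.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, bias=False),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * 30 * 30, 10),
)

groups = split_param_groups(model)
optimizer = torch.optim.Adam(groups, lr=0.01)

# Number of parameter tensors per group: decayed weights, norm weights, biases.
counts = [len(g["params"]) for g in groups]
print(counts)
```

Each dictionary becomes one optimizer parameter group with its own weight_decay. The three list lengths play the same role as the three counts in the logger output.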
Now it's clear, thank you.
@Petros626 you're welcome! If you have any more questions as you dive deeper into your project, feel free to ask. Happy training with Ultralytics YOLO!
Question
Hey,
This is the logger output before training starts:
optimizer: Adam(lr=0.01, momentum=0.937) with parameter groups 63 weight(decay=0.0), 73 weight(decay=0.0005), 72 bias(decay=0.0)
How is it decided which parameters receive weight decay? Are only the parameter groups that benefit training selected?
Thanks for the clarification!