ultralytics / ultralytics

Ultralytics YOLO11 πŸš€
https://docs.ultralytics.com
GNU Affero General Public License v3.0

YOLOv8 Augmentation for train (task=classify) #4768

Closed frxchii666 closed 1 year ago

frxchii666 commented 1 year ago

Search before asking

Question

Hi there! How can I enable augmentations for my training (classification)? I didn't find any info about augmentations for the classification task. Thanks for the answer!

Additional

No response

github-actions[bot] commented 1 year ago

πŸ‘‹ Hello @frxchii, thank you for your interest in YOLOv8 πŸš€! We recommend a visit to the YOLOv8 Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a πŸ› Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

Environments

YOLOv8 may be run in any of Ultralytics' up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled).

Status

If the Ultralytics CI badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

glenn-jocher commented 1 year ago

@frxchii hello and thank you for your question!

The classification task in YOLOv8 does indeed support various data augmentations. Augmentations can significantly improve the performance of the model by presenting more diverse data during training, including modifications in color, scale, perspective, and more.

To enable augmentations, you'll need to modify the YAML configuration file (*.yaml) used during training. This file determines various aspects of the training process, including data sources, model settings, and yes, augmentations.

The augmentations are located under the augment section in the YAML file. Here, you can enable or disable different types of augmentations. For example, if you'd like to apply flips to your images, you would set the flipud (up-down flip) or fliplr (left-right flip) probabilities.
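
For reference, a minimal sketch of what this can look like in practice for a classification run; it assumes a recent ultralytics release where these hyperparameter names (fliplr, flipud, hsv_h, hsv_s, hsv_v, scale) are also accepted as keyword overrides to model.train(), and the dataset path and values are placeholders:

from ultralytics import YOLO

model = YOLO('yolov8n-cls.pt')  # pretrained classification model

# Placeholder dataset path and example override values -- tune for your own data.
results = model.train(
    data='path/to/classification_dataset',  # classification dataset root (placeholder)
    epochs=50,
    imgsz=224,
    fliplr=0.5,   # image flip left-right (probability)
    flipud=0.0,   # image flip up-down (probability)
    hsv_h=0.015,  # image HSV-Hue augmentation (fraction)
    hsv_s=0.7,    # image HSV-Saturation augmentation (fraction)
    hsv_v=0.4,    # image HSV-Value augmentation (fraction)
    scale=0.5,    # image scale (+/- gain)
)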

Remember that the specific choices of augmentations will depend on the nature of your data. It's crucial to understand how these modifications can help or harm the learning process given your specific task and dataset.

Please don't hesitate to ask if you have any further questions. Have a great day and happy coding!

michaeldeyzel commented 12 months ago

@glenn-jocher Since you do not specify a YAML file for task=classify (you simply point to the dataset directory), how do you specify augmentations?

glenn-jocher commented 12 months ago

Hello @michaeldeyzel, thank you for reaching out!

In YOLOv8, the training process and its parameters, including dataset and augmentations, are indeed determined through a YAML configuration file. However, for simpler use-cases such as image classification, where only a dataset directory is specified, the framework applies a set of default augmentations.

These defaults are pre-configured to be generally effective across a wide range of tasks and are meant to help the model generalize by creating variations in the input data. As of now, there's no explicit mechanism to manually specify different augmentations when using the simplified setup for classification.

If you wish to use custom augmentations for classification, you'd have to set up a YAML configuration file. This would involve structuring your data in a specific way and adjusting the related settings in the YAML file.

Do note that augmentations are a key aspect of training robust models. Yet, picking the right set of augmentations may be dependent on your specific data and task. Experimenting with different combinations could give you better insights into what works best for your use-case.

I hope this helps! Feel free to follow up if you have any more questions.
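
To make "default augmentations" a little more concrete, they are typically of the random-resized-crop / flip / colour-jitter variety. The snippet below is only a rough torchvision illustration of that kind of pipeline, not the exact transform stack ultralytics builds internally:

import torchvision.transforms as T

# Rough illustration of typical classification train-time augmentation:
# random resized crop, horizontal flip and mild colour jitter.
train_transforms = T.Compose([
    T.RandomResizedCrop(224, scale=(0.5, 1.0)),
    T.RandomHorizontalFlip(p=0.5),
    T.ColorJitter(brightness=0.4, saturation=0.7, hue=0.015),
    T.ToTensor(),
])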

motidil commented 12 months ago

@glenn-jocher can you please include a reference to a valid classification config YAML that includes augmentation? I have tried this code with no luck:

from ultralytics import YOLO

TRAIN_FOLDER_PATH = "/home/ubuntu/classifier_train_data/all_data"

EPOCHS = 100
IMAGE_SIZE = 128
BATCH = 128
DEVICE = 0

TRAIN_CONFIG = dict(
    lr0=0.009,  # initial learning rate (i.e. SGD=1E-2, Adam=1E-3)
    lrf=0.01,  # final learning rate (lr0 * lrf)
    optimizer="SGD",
    dropout=0.3,
    momentum=0.937,  # SGD momentum/Adam beta1
    weight_decay=0.0005,  # optimizer weight decay 5e-4
    warmup_epochs=3.0,  # warmup epochs (fractions ok)
    warmup_bias_lr=0.1,  # warmup initial bias lr
    dfl=1.5,  # dfl loss gain
    label_smoothing=0.0,  # label smoothing (fraction)
    patience=25,
    augment=True,
    hsv_h=0.015,  # image HSV-Hue augmentation (fraction)
    hsv_s=0.7,  # image HSV-Saturation augmentation (fraction)
    hsv_v=0.4,  # image HSV-Value augmentation (fraction)
    degrees=0.2,  # image rotation (+/- deg)
    translate=0.2,  # image translation (+/- fraction)
    scale=0.4,  # image scale (+/- gain)
    shear=0.1,  # image shear (+/- deg)
    perspective=0.3,  # image perspective (+/- fraction), range 0-0.001
    flipud=0.0,  # image flip up-down (probability)
    fliplr=0.5,  # image flip left-right (probability)
    mosaic=0.0,  # image mosaic (probability)
    mixup=0.0,  # image mixup (probability)
    copy_paste=0.0,  # segment copy-paste (probability)
)
# load a pretrained model (recommended for training)
model = YOLO('yolov8n-cls.pt')

# Train the model
results = model.train(data=TRAIN_FOLDER_PATH,
                      epochs=EPOCHS,
                      imgsz=IMAGE_SIZE,
                      device=DEVICE,
                      batch=BATCH,
                      seed=42,
                      **TRAIN_CONFIG)

glenn-jocher commented 12 months ago

@motidil hello, thanks for reaching out!

In your YOLOv8 configuration, the TRAIN_CONFIG dictionary with its augmentation parameters looks reasonable. However, directly passing TRAIN_CONFIG to model.train() may not apply these augmentation settings, as YOLOv8 expects them in the YAML configuration file rather than as arguments to the train function.

As of now, YOLOv8 does not support passing augmentation parameters through model.train() directly, only via the YAML configuration file.

The idea behind a YAML configuration file is to have a consistent and reusable setup for your experiments, which can be easily shared and replicated. In this file, you can specify your model architecture, data locations, hyperparameters and augmentation strategies among other things.

To specify augmentations in the YAML configuration file, you would do this under the augment field, where the various augmentation parameters are nested. You would then refer to this file when calling model.train().

I hope this clarifies things a bit. If you have further questions, don't hesitate to ask!

motidil commented 12 months ago

@glenn-jocher, please refer me to an example of a YAML configuration file that includes augmentation... I can't find any reference in the docs. Also, do I need to pass this YAML to the YOLO initialization object or to the train method via the cfg parameter?

glenn-jocher commented 12 months ago

@motidil hello, and thank you for your question!

Unfortunately, as of now, there is no explicit example of a YAML configuration file with augmentations for YOLOv8 available in the documentation. However, I can provide a general explanation on how you can configure one.

The general structure of the YAML file should have specific sections dedicated to model, data, and training configuration. For augmentations, you typically would include these under a distinct section called 'augment'. Inside it, you specify different types of augmentations such as hsv_h, hsv_s, hsv_v, degrees, translate, scale and others.

Once the YAML file is set up, you use it in the training process by providing it to the train method via the cfg parameter. You do not need to pass this YAML file to the YOLO initialization object.

I understand configuring these files can be complex without a concrete example, and this is feedback we appreciate. We're constantly working on improving the documentation and we'll take your comments into account for future updates. Thanks for your understanding.

If you have any other questions, please don’t hesitate to ask!
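
As a minimal usage sketch of that last point (my_train_cfg.yaml is a hypothetical file name and the dataset path is a placeholder):

from ultralytics import YOLO

model = YOLO('yolov8n-cls.pt')               # no cfg needed when initializing the model
results = model.train(
    data='path/to/classification_dataset',   # classification dataset root (placeholder)
    cfg='my_train_cfg.yaml',                 # hypothetical training-settings YAML
)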

motidil commented 11 months ago

@glenn-jocher thanks for the answer! I have tried to do what you suggested but I'm still getting this error:

Traceback (most recent call last):
  File "./train_yolov8_classifier.py", line 22, in <module>
    results = model.train(data=TRAIN_FOLDER_PATH,
  File "/home/ubuntu/ultralytics/lib/python3.8/site-packages/ultralytics/engine/model.py", line 336, in train
    self.trainer = (trainer or self._smart_load('trainer'))(overrides=args, _callbacks=self.callbacks)
  File "/home/ubuntu/ultralytics/lib/python3.8/site-packages/ultralytics/models/yolo/classify/train.py", line 39, in __init__
    super().__init__(cfg, overrides, _callbacks)
  File "/home/ubuntu/ultralytics/lib/python3.8/site-packages/ultralytics/engine/trainer.py", line 83, in __init__
    self.args = get_cfg(cfg, overrides)
  File "/home/ubuntu/ultralytics/lib/python3.8/site-packages/ultralytics/cfg/__init__.py", line 141, in get_cfg
    raise TypeError(f"'{k}={v}' is of invalid type {type(v).__name__}. "
TypeError: 'augment={'hsv_h': 0.015, 'hsv_s': 0.7, 'hsv_v': 0.4, 'degrees': 0.2, 'translate': 0.2, 'scale': 0.5, 'shear': 0.015, 'perspective': 0.2, 'flipud': 0.0, 'fliplr': 0.5, 'mosaic': 0.0, 'mixup': 0.0, 'copy_paste': 0.0}' is of invalid type dict. 'augment' must be a bool (i.e. 'augment=True' or 'augment=False')

The YAML I used, as you suggested, is the following:

## yolov8-cls-train.yaml
# Ultralytics YOLO πŸš€, AGPL-3.0 license
# Default training settings and hyperparameters for medium-augmentation COCO training

# Train settings -------------------------------------------------------------------------------------------------------
epochs: 100  # (int) number of epochs to train for
patience: 25  # (int) epochs to wait for no observable improvement for early stopping of training
batch: 16  # (int) number of images per batch (-1 for AutoBatch)
imgsz: 128  # (int | list) input images size as int for train and val modes, or list[w,h] for predict and export modes
save: True  # (bool) save train checkpoints and predict results
save_period: -1 # (int) Save checkpoint every x epochs (disabled if < 1)
cache: False  # (bool) True/ram, disk or False. Use cache for data loading
device: 0  # (int | str | list, optional) device to run on, i.e. cuda device=0 or device=0,1,2,3 or device=cpu
workers: 8  # (int) number of worker threads for data loading (per RANK if DDP)
project:  # (str, optional) project name
name: yolov8n-cls-128X128_100epoch-augment-adam-lr00009  # (str, optional) experiment name, results saved to 'project/name' directory
exist_ok: False  # (bool) whether to overwrite existing experiment
pretrained: True  # (bool | str) whether to use a pretrained model (bool) or a model to load weights from (str)
optimizer: Adam  # (str) optimizer to use, choices=[SGD, Adam, Adamax, AdamW, NAdam, RAdam, RMSProp, auto]
verbose: True  # (bool) whether to print verbose output
seed: 42  # (int) random seed for reproducibility
deterministic: True  # (bool) whether to enable deterministic mode
single_cls: False  # (bool) train multi-class data as single-class
rect: False  # (bool) rectangular training if mode='train' or rectangular validation if mode='val'
cos_lr: False  # (bool) use cosine learning rate scheduler
close_mosaic: 0  # (int) disable mosaic augmentation for final epochs (0 to disable)
resume: False  # (bool) resume training from last checkpoint
amp: True  # (bool) Automatic Mixed Precision (AMP) training, choices=[True, False], True runs AMP check
fraction: 1.0  # (float) dataset fraction to train on (default is 1.0, all images in train set)
profile: False  # (bool) profile ONNX and TensorRT speeds during training for loggers
freeze: None  # (int | list, optional) freeze first n layers, or freeze list of layer indices during training
# Classification
dropout: 0.3  # (float) use dropout regularization (classify train only)

plots: True  # (bool) save plots during train/val

# Hyperparameters ------------------------------------------------------------------------------------------------------
lr0: 0.0009  # (float) initial learning rate (i.e. SGD=1E-2, Adam=1E-3)
lrf: 0.01  # (float) final learning rate (lr0 * lrf)
momentum: 0.937  # (float) SGD momentum/Adam beta1
weight_decay: 0.0005  # (float) optimizer weight decay 5e-4
warmup_epochs: 3.0  # (float) warmup epochs (fractions ok)
warmup_momentum: 0.8  # (float) warmup initial momentum
warmup_bias_lr: 0.1  # (float) warmup initial bias lr
augment:
  hsv_h: 0.015  # (float) image HSV-Hue augmentation (fraction)
  hsv_s: 0.7  # (float) image HSV-Saturation augmentation (fraction)
  hsv_v: 0.4  # (float) image HSV-Value augmentation (fraction)
  degrees: 0.2  # (float) image rotation (+/- deg)
  translate: 0.2  # (float) image translation (+/- fraction)
  scale: 0.5  # (float) image scale (+/- gain)
  shear: 0.015  # (float) image shear (+/- deg)
  perspective: 0.2  # (float) image perspective (+/- fraction), range 0-0.001
  flipud: 0.0  # (float) image flip up-down (probability)
  fliplr: 0.5  # (float) image flip left-right (probability)
  mosaic: 0.0  # (float) image mosaic (probability)
  mixup: 0.0  # (float) image mixup (probability)
  copy_paste: 0.0  # (float) segment copy-paste (probability)

and finally I run this code:

from ultralytics import YOLO

# load a pretrained model (recommended for training)
model = YOLO('yolov8n-cls.pt')
# Train the model
results = model.train(data=TRAIN_FOLDER_PATH,
                      cfg="/home/ubuntu/yolov8-cls-train.yaml")

glenn-jocher commented 11 months ago

@motidil hello and thank you for your message.

From the error you're encountering, the augment parameter expects a boolean (True or False) rather than a dictionary. The 'augment' flag is used to indicate whether to apply augmentation strategies or not.

In your YAML file, however, you have defined 'augment' as a section heading and nested the specific augmentations and their parameters as sub-items within it.

This mismatch between how the 'augment' flag and the individual augmentations are defined in your file and how the software interprets them is the root of the error.

Note that, while your YAML file lists the augmentations as part of the hyperparameters, in the Ultralytics setup these augmentation keys (hsv_h, fliplr, and so on) are top-level items of the configuration; 'augment' itself is a standalone boolean flag, not a section to nest them under.

I hope this helps! If there are more questions or if something isn't clear, feel free to ask.
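
To make the layout concrete, here is a sketch of the same set of augmentation keys with the nesting removed: each key sits at the top level of the training YAML, mirroring the package's bundled default.yaml, and 'augment' is not used as a section header at all. The snippet writes a hypothetical flattened file that can then be passed via the cfg parameter as in the call above:

from pathlib import Path

# Hypothetical flattened hyperparameter file: the augmentation keys sit at the
# top level instead of being nested under an 'augment:' section.
FLAT_CFG = """\
hsv_h: 0.015      # (float) image HSV-Hue augmentation (fraction)
hsv_s: 0.7        # (float) image HSV-Saturation augmentation (fraction)
hsv_v: 0.4        # (float) image HSV-Value augmentation (fraction)
degrees: 0.2      # (float) image rotation (+/- deg)
translate: 0.2    # (float) image translation (+/- fraction)
scale: 0.5        # (float) image scale (+/- gain)
shear: 0.015      # (float) image shear (+/- deg)
perspective: 0.0  # (float) image perspective (+/- fraction), range 0-0.001
flipud: 0.0       # (float) image flip up-down (probability)
fliplr: 0.5       # (float) image flip left-right (probability)
mosaic: 0.0       # (float) image mosaic (probability)
mixup: 0.0        # (float) image mixup (probability)
copy_paste: 0.0   # (float) segment copy-paste (probability)
"""
Path('yolov8-cls-train-flat.yaml').write_text(FLAT_CFG)

The 'augment' key itself can then be left out entirely, or set to a plain boolean if needed.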

Cheryl33990 commented 7 months ago

@glenn-jocher Sorry, I have two points to confirm about my understanding of the values inside the cfg:

Q1: Application of HSV augmentation. Is HSV augmentation applied as a proportional adjustment to the original image? For example, with hsv_h = 0.015, is the hue scaled by a factor of up to 1.015?

Q2: Is augmentation applied during testing, or only during training and validation?

Thank you!

shinbehavior commented 2 months ago

(Quoting @Cheryl33990's two questions above, on how HSV augmentation is applied and whether augmentation runs at test time.)

  1. Yes, it is an adjustment relative to the original image, and the number indicates how far the values can shift up or down.
  2. Augmentation is not applied during testing; you run inference on your original images/video.
glenn-jocher commented 2 months ago

Hello @hagonata,

Regarding your questions:

  1. HSV augmentation values are indeed applied as proportional adjustments to the original image. For instance, hsv_h=0.015 means the hue can vary by Β±1.5% of its original value (see the short sketch after this list).

  2. Augmentations are only applied during training and validation phases to help the model generalize better. They are not applied during testing or inference.
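
For intuition on point 1, a small NumPy sketch of that kind of gain sampling; it loosely follows the YOLO-style HSV augmentation, and hsv_gains is a hypothetical helper, not the library's exact implementation:

import numpy as np

def hsv_gains(hgain=0.015, sgain=0.7, vgain=0.4, rng=np.random):
    # One multiplicative gain per channel, sampled in [1 - gain, 1 + gain].
    # hgain=0.015 therefore scales the hue channel by a factor in [0.985, 1.015],
    # i.e. roughly +/- 1.5% of its original value.
    return rng.uniform(-1, 1, 3) * np.array([hgain, sgain, vgain]) + 1

# Apply the sampled gains to a dummy HSV image (uint8, OpenCV-style channel ranges).
hsv = np.random.randint(0, 180, size=(4, 4, 3), dtype=np.uint8)
r = hsv_gains()
hue = (hsv[..., 0].astype(np.float32) * r[0]) % 180            # hue wraps around
sat = np.clip(hsv[..., 1].astype(np.float32) * r[1], 0, 255)   # saturation is clipped
val = np.clip(hsv[..., 2].astype(np.float32) * r[2], 0, 255)   # value is clipped
augmented = np.stack([hue, sat, val], axis=-1).astype(np.uint8)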

If you have any further questions, feel free to ask!