ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

About image augmentation and Albumentations in YOLOv5 #12134

Closed Nangsaga closed 12 months ago

Nangsaga commented 1 year ago

Search before asking

Question

Hello Glenn Jocher,

I would like to ask about image augmentation and Albumentations in YOLOv5. As you mentioned in https://github.com/ultralytics/yolov5/discussions/10469, image augmentation adds three images to each original one, so I can say the number of training images in the augmented dataset is increased fourfold compared to the original dataset. What about Albumentations? How many additional images are created from each original one?

In my case, I prepared a dataset of 17 classes with about 2000 images per class. I used all of these images for the training dataset and randomly picked about 500 images per class to make the validation dataset. I did not use a test dataset. I tried running train.py with and without --hyp hyp.scratch-low.yaml (with default parameters) and compared the training results. However, I did not see any difference between them. Does this mean the augmentation did not affect the training result in my case, or was my preparation of the train and validation sets not good?

Thank you.

Additional

No response

glenn-jocher commented 1 year ago

@Nangsaga hi there,

Regarding image augmentation in YOLOv5, it is correct that the default augmentation strategy involves adding three additional augmented images for each original image, resulting in a fourfold increase in the number of training images.

As for Albumentations, it is a popular Python library for image augmentation that can be easily integrated with YOLOv5. However, the number of additional images created from the original one using Albumentations depends on the specific augmentations applied and their parameters. You can customize the augmentation pipeline in Albumentations to fit your needs and generate as many additional images as desired.
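For illustration, here is a minimal sketch of an Albumentations pipeline composed for YOLO-format (normalized xywh) bounding boxes. The transforms, probabilities, and dummy data below are example choices for this sketch, not YOLOv5's built-in settings:

```python
import numpy as np
import albumentations as A

# Illustrative pipeline; transforms and probabilities are example choices.
transform = A.Compose(
    [
        A.Blur(p=0.01),
        A.ToGray(p=0.01),
        A.RandomBrightnessContrast(p=0.2),
        A.HorizontalFlip(p=0.5),
    ],
    # 'yolo' format expects normalized [x_center, y_center, width, height]
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)

# Dummy 640x640 image with one box, purely for demonstration.
image = np.zeros((640, 640, 3), dtype=np.uint8)
bboxes = [[0.5, 0.5, 0.2, 0.3]]
class_labels = [0]

out = transform(image=image, bboxes=bboxes, class_labels=class_labels)
aug_image, aug_bboxes = out["image"], out["bboxes"]
```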

In regards to your specific scenario, where you prepared a dataset of 17 classes with about 2000 images per class, and used 500 randomly selected images per class for validation, it is possible that the augmentation did not have a noticeable effect on the training results. It is also worth considering other factors such as the quality of the annotations, class distribution, and the choice of hyperparameters.

Feel free to share more details about your dataset and training setup if you need further assistance.

Best regards, Glenn Jocher

Nangsaga commented 1 year ago

Thanks Glenn for your clear explanation. I compared the training results for each epoch and found no difference between the two training commands below, so I am afraid there was something wrong with 2), leading to no augmentation of the training images. Could you please check these for me? In any case, I obtained a high mAP@0.5 of 0.984 for all classes.

1) Without image augmentation:
`python train.py --data data.yaml --cfg yolov5m.yaml --weights yolov5m.pt --batch-size 4 --epochs 1000 --optimizer AdamW --cache --freeze 10`

2) With image augmentation:
`python train.py --data data.yaml --cfg yolov5m.yaml --weights yolov5m.pt --batch-size 4 --epochs 1000 --optimizer AdamW --cache --freeze 10 --hyp hyp.scratch-low.yaml`

By the way, when training with image augmentation, does the labels graph show the instances of both the original and augmented images, or only the original ones?

glenn-jocher commented 1 year ago

Hi there,

Glad to hear that you found my explanation helpful. Regarding your comparison of the training results, it's interesting that you didn't observe any difference between the two training commands. One possibility is that the augmentation may not have had a significant impact on the performance of your specific dataset and training setup. Achieving a high mAP@0.5 of 0.984 for all classes is impressive!

As for your question about the labels graph during training with image augmentation, the graph will display the instances of both the original and augmented images. This is because YOLOv5 applies the augmentation transformations to both the input image and its corresponding bounding boxes/labels.

If you have any further questions or need assistance with anything else, feel free to ask.

Best regards, Glenn Jocher

Nangsaga commented 1 year ago

Hi there, accordingly something must have been missed, so that image augmentation was not activated, because the labels graphs resulting from the above two commands are exactly the same. Do I have to modify the values of some parameters in hyp.scratch-low.yaml to activate image augmentation?

> the graph will display the instances of both the original and augmented images.

glenn-jocher commented 1 year ago

@Nangsaga hi,

Regarding your question about activating image augmentation in YOLOv5, it seems like there might be a configuration issue as the labels graph resulting from the commands you mentioned appears to be the same. To activate image augmentation, you might need to modify the values of certain parameters in the hyp.scratch-low.yaml file. I recommend checking the augmentation-related parameters in the YAML file and ensuring that they are properly set to enable the desired augmentation transformations.
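As a quick sanity check, here is a small sketch that loads a hyp file and prints the augmentation-related hyperparameters, where a value of 0 effectively disables that transform. The file path assumes a standard YOLOv5 checkout, and the exact key set and defaults may differ between versions:

```python
import yaml

# Augmentation-related keys typically found in YOLOv5 hyp files;
# the exact set and default values may differ between versions.
AUG_KEYS = [
    "hsv_h", "hsv_s", "hsv_v",                                # color jitter
    "degrees", "translate", "scale", "shear", "perspective",  # geometric
    "flipud", "fliplr",                                       # flips
    "mosaic", "mixup", "copy_paste",                          # multi-image
]

with open("data/hyps/hyp.scratch-low.yaml") as f:  # path assumes repo root
    hyp = yaml.safe_load(f)

for key in AUG_KEYS:
    print(f"{key}: {hyp.get(key)}")  # 0.0 disables that augmentation
```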

Concerning the labels graph, it will indeed display the instances of both the original and augmented images during training. This allows you to visualize the effects of augmentation on the bounding boxes/labels.

I hope this information helps! Let me know if you have any further questions.

Glenn Jocher

Nangsaga commented 1 year ago

Thank you for your confirmation. I will try and report later.

glenn-jocher commented 1 year ago

@Nangsaga great! Please feel free to reach out if you encounter any further issues or have any additional questions. We're here to help.

Nangsaga commented 1 year ago

Hi there, finally I know how to enable and disable image augmentation in YOLOv5 (via hyp.no-augmentation.yaml) and how to verify the data augmentation by viewing the sampled train_batch images instead of checking the labels graph. By the way, can I use the confidence score as a metric to evaluate object detection accuracy when using YOLOv5, or must I use accuracy calculated from the confusion matrix?

glenn-jocher commented 1 year ago

@Nangsaga yes, you can use the confidence score as a metric to evaluate the object detection accuracy when using YOLOv5. The confidence score represents the model's confidence in the presence of an object in a particular bounding box. By setting a minimum threshold for the confidence score, you can control the detection precision and minimize false positives. However, it is important to note that the confidence score alone may not provide a comprehensive evaluation of the model's performance.

For a more detailed assessment, it is recommended to use metrics calculated from the confusion matrix. The confusion matrix accounts for true positives, true negatives, false positives, and false negatives, providing a comprehensive view of the model's performance. This allows you to evaluate metrics such as precision, recall, F1 score, and mAP (mean Average Precision), which are commonly used in object detection tasks.
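As a rough illustration (not YOLOv5's own val.py code), per-class precision, recall, and F1 can be derived from a confusion matrix like this; the layout assumed here is rows = predicted class, columns = true class:

```python
import numpy as np

def per_class_prf1(cm: np.ndarray):
    """Precision, recall, F1 per class from a confusion matrix
    (rows = predicted class, columns = true class)."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=1) - tp  # predicted as this class but actually another
    fn = cm.sum(axis=0) - tp  # actually this class but predicted as another
    precision = tp / (tp + fp + 1e-9)
    recall = tp / (tp + fn + 1e-9)
    f1 = 2 * precision * recall / (precision + recall + 1e-9)
    return precision, recall, f1

# Toy 3-class example
cm = np.array([[50, 2, 3],
               [4, 45, 1],
               [1, 0, 48]])
p, r, f1 = per_class_prf1(cm)
print(p.round(3), r.round(3), f1.round(3))
```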

I hope this information helps! Let me know if you have any further questions.

Glenn Jocher

Nangsaga commented 1 year ago

Thank you for your kind explanation. I saved the confidence scores when running detect.py. Using the saved data, can I calculate the accuracy? In the sample data below, I only know the first and last columns: class number and confidence score.

```
1 0.849609 0.507639 0.222656 0.181944 0.954632
3 0.246875 0.469444 0.229687 0.294444 0.960332
2 0.553906 0.463889 0.314063 0.497222 0.971136
```

glenn-jocher commented 1 year ago

@Nangsaga yes, you can use the saved confidence scores to calculate metrics like accuracy, but it's important to note that accuracy alone may not provide a comprehensive evaluation of object detection performance.

To calculate accuracy, you would typically compare the predicted class (the class with the highest confidence score) against the ground truth class. If the predicted class matches the ground truth class, it is considered a correct prediction. You can calculate the overall accuracy by dividing the number of correct predictions by the total number of predictions.
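As a minimal sketch of that calculation (assuming each prediction has already been matched to a ground-truth box, e.g. by IoU, which YOLOv5's validation handles internally):

```python
def class_accuracy(pred_classes, gt_classes):
    """Fraction of matched predictions whose class equals the ground truth."""
    assert len(pred_classes) == len(gt_classes), "inputs must be matched pairs"
    correct = sum(int(p == g) for p, g in zip(pred_classes, gt_classes))
    return correct / max(len(gt_classes), 1)

# Toy example: 3 of 4 matched predictions have the correct class.
print(class_accuracy([1, 3, 2, 2], [1, 3, 2, 0]))  # 0.75
```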

However, object detection tasks often require evaluating metrics like precision, recall, F1 score, and mAP (mean Average Precision) to get a better understanding of model performance. These metrics take into account factors like true positives, true negatives, false positives, and false negatives, providing a more comprehensive evaluation.

If you have access to the ground truth labels, you can evaluate these metrics using the confidence scores and the corresponding ground truth classes for each prediction.

I hope this answers your question. Let me know if you need further assistance.

Glenn Jocher

Nangsaga commented 1 year ago

Thank you very much. Could you advise me on the names of columns 2-5 of the saved data provided in my previous comment (col. 1: class No., col. 6: conf. score)? I want to calculate the accuracy when deploying the model in the real world. As you explained above, can the model only be evaluated overall when training with a validation dataset, or can I calculate evaluation metrics like precision, recall, F1 score, and mAP during a real-world deployment?

glenn-jocher commented 1 year ago

@Nangsaga you can use the following column names for the saved data (values are normalized to the image width and height; see the parsing sketch below):

- Column 1: class index
- Column 2: x_center (normalized)
- Column 3: y_center (normalized)
- Column 4: box width (normalized)
- Column 5: box height (normalized)
- Column 6: confidence score
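A small parsing sketch, assuming the files were produced by detect.py with --save-txt --save-conf; the file path and image size in the usage comment are placeholders:

```python
def load_detections(txt_path, img_w, img_h):
    """Read one saved label file and convert normalized xywh to pixel xyxy."""
    dets = []
    with open(txt_path) as f:
        for line in f:
            cls, xc, yc, w, h, conf = map(float, line.split())
            x1, y1 = (xc - w / 2) * img_w, (yc - h / 2) * img_h
            x2, y2 = (xc + w / 2) * img_w, (yc + h / 2) * img_h
            dets.append((int(cls), x1, y1, x2, y2, conf))
    return dets

# Example usage with placeholder path and image size:
# dets = load_detections("runs/detect/exp/labels/image0.txt", 1280, 720)
```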

Regarding evaluating metrics during real-world deployment, it is possible to calculate metrics like precision, recall, F1 score, and mAP using the confidence scores and ground truth labels for each prediction. These metrics provide a comprehensive evaluation of the model's performance in object detection tasks.

However, it's important to note that real-world deployment may present additional challenges compared to model training and validation. Factors such as different environments, lighting conditions, and potential variations in the target objects may affect the performance of the model. Therefore, it is recommended to carefully evaluate and fine-tune the model for the specific real-world deployment scenario.

If you have any further questions or need additional assistance, feel free to ask.

Glenn Jocher

github-actions[bot] commented 1 year ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see https://docs.ultralytics.com.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐