Closed pfeatherstone closed 4 years ago
Hello @pfeatherstone, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook, Docker Image, and Google Cloud Quickstart Guide for example environments.
If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.
If this is a custom model or data training question, please note that Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:
For more information please visit https://www.ultralytics.com.
@pfeatherstone ah interesting observation. Can you upload a few examples (i.e. train_batch0.jpg) of the augmentation and resizing techniques you found to work best?
We are careful to maintain aspect ratio when resizing, and to ensure the proper resizing algorithms to avoid aliasing etc.
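As a sketch of what letterbox resizing does (the helper name, the gray pad value of 114, and the dependency-free nearest-neighbour resize are illustrative assumptions, not the actual YOLOv5 implementation):

```python
import numpy as np

def letterbox(im, new_shape=(640, 640), pad_value=114):
    """Resize to fit inside new_shape while preserving aspect ratio,
    then pad the short side with a constant gray border."""
    h, w = im.shape[:2]
    r = min(new_shape[0] / h, new_shape[1] / w)  # scale ratio (same for both axes)
    nh, nw = round(h * r), round(w * r)          # unpadded size after resize
    # nearest-neighbour resize, kept dependency-free; use cv2.resize in practice
    ys = (np.arange(nh) / r).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / r).astype(int).clip(0, w - 1)
    resized = im[ys[:, None], xs]
    # center the resized image on a constant-gray canvas (symmetric padding)
    out = np.full((new_shape[0], new_shape[1], im.shape[2]), pad_value, dtype=im.dtype)
    top = (new_shape[0] - nh) // 2
    left = (new_shape[1] - nw) // 2
    out[top:top + nh, left:left + nw] = resized
    return out, r, (left, top)
```

Because the same ratio `r` is applied to both axes, object shapes are unchanged and only gray bands are added, which is why boxes stay tight compared to plain stretching.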
We were unfortunately not able to gain mAP using rotation and shearing operations in our COCO experiments, but did find translation and scaling to help.
Here is yolov5s inference using letterbox resizing and input dimension 640x640 (the default)
Here is yolov5s inference using normal (stretched) resizing and input dimension 640x640
So you can see that without letterbox resizing, accuracy goes down and the boxes are not as tight.
Is there any reason why you might want to preserve aspect ratio? I haven't trained yolov5 models yet, but when I used your yolov3-spp model I added a bunch of albumentations augmentation transformations (15+ possible transformations) to make it as resilient as possible, which worked pretty well.
Here is the albumentations composition I used:
```python
import albumentations as albu

transform = albu.Compose([
    albu.OneOf([
        albu.RandomBrightnessContrast(brightness_limit=0.3, contrast_limit=0.3, p=0.7),
        albu.RandomGamma(gamma_limit=(50, 150), p=0.7),
        albu.RGBShift(p=0.7),
        albu.HueSaturationValue(hue_shift_limit=10, sat_shift_limit=15, val_shift_limit=10, p=0.7),
        albu.CLAHE(p=0.7),
        albu.ImageCompression(quality_lower=30),
        albu.GaussNoise(p=0.7),
        albu.GaussianBlur(p=0.7),
        albu.MedianBlur(p=0.7),
        albu.ChannelShuffle(p=0.7),
        albu.CoarseDropout(p=0.7),
        albu.Equalize(p=0.7),
        albu.FancyPCA(p=0.7),
        albu.IAAEmboss(p=0.7),
        albu.IAASharpen(p=0.7),
        albu.ISONoise(p=0.7),
        albu.Posterize(p=0.7),
        albu.InvertImg(),
        albu.MotionBlur(always_apply=True),
        albu.RandomRain(),
        albu.RandomShadow(),
        albu.RandomSnow(),
        albu.Solarize()]),
    albu.OneOf([
        albu.VerticalFlip(p=0.2),
        albu.HorizontalFlip(),
        albu.Transpose(p=0.2),
        albu.ShiftScaleRotate()])],
    bbox_params=albu.BboxParams(format='coco', label_fields=['category_id']))
```
I thought `albu.ShiftScaleRotate` scaled the two spatial dimensions independently, but I don't think it actually does. So there wasn't aggressive enough diversity in aspect ratios. Oh well.
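The kind of per-axis stretch being described could be sketched like this (a hypothetical augmentation, not an Albumentations transform; function name, factor range, and the nearest-neighbour resize are illustrative only, and bounding boxes would need to be rescaled by the same factors):

```python
import random
import numpy as np

def random_stretch(im, lo=0.75, hi=1.33):
    """Stretch height and width by independent random factors, changing
    the aspect ratio. By contrast, ShiftScaleRotate applies one shared
    scale factor to both axes, leaving the aspect ratio fixed."""
    h, w = im.shape[:2]
    ry, rx = random.uniform(lo, hi), random.uniform(lo, hi)  # independent per-axis factors
    nh, nw = max(1, round(h * ry)), max(1, round(w * rx))
    # nearest-neighbour index maps for the stretched output grid
    ys = (np.arange(nh) * h // nh).clip(0, h - 1)
    xs = (np.arange(nw) * w // nw).clip(0, w - 1)
    return im[ys[:, None], xs]
```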
@pfeatherstone that's definitely a substantial amount of augmentation. You should be careful though, some methods like CLAHE (contrast limited adaptive histogram equalization) are used to enhance image contrast, and since this will not be applied during testing, introducing this during training may harm your test results.
If you do find a combination of augmentation parameters that outperform the default on COCO training please let us know though, this would be very useful to update our defaults with.
The augmentation was helpful on my custom dataset, where there wasn't a huge amount of diversity. On COCO, I doubt that much augmentation is necessary. However, the degraded performance of yolov5 when not using letterbox resizing suggests that some scaling augmentation on top of mosaic would be beneficial.
Unfortunately I don't have a lot of time to train coco models at the moment. But in case you were looking for some ideas, scaling augmentation might be a good one.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
@glenn-jocher Just noticed that this was closed. Don't you think that non-letterbox resizing issues impacting accuracy should be looked at? This isn't a problem with the original yolov3 and yolov3-spp models. Stretching could maybe be a form of augmentation to avoid overfitting to specific aspect ratios...
@pfeatherstone stretching produced worse results in our experiments, which is why we do not use it.
Hi @glenn-jocher, I wonder why you use letterbox padding instead of just padding to the bottom right?
@hiyyg symmetric padding allows for reduced edge effects vs unilateral padding.
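The difference in pad placement can be sketched with a small helper (the function name is illustrative; the arithmetic is just the two padding schemes being compared):

```python
def pad_offsets(size, target, symmetric=True):
    """Return (before, after) padding for one axis.
    Symmetric (letterbox) splits the pad across both sides, keeping the
    image centered; unilateral puts it all after (bottom/right), so one
    image edge sits flush against the padded border."""
    pad = target - size
    return (pad // 2, pad - pad // 2) if symmetric else (0, pad)
```

For a 480-pixel side padded to 640, symmetric padding gives 80 pixels on each side, while unilateral padding gives 160 pixels on one side, doubling the distance the border extends past real content on that edge.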
@pfeatherstone see PR #3882 for a proposed automatic Albumentations integration.
@hiyyg @pfeatherstone good news 😃! Your original issue may now be fixed ✅ in PR #3882. This PR implements a YOLOv5 🚀 + Albumentations integration. The integration will automatically apply Albumentations transforms during YOLOv5 training if `albumentations>=1.0.0` is installed in your environment.
To use albumentations, simply `pip install -U albumentations` and then update the augmentation pipeline as you see fit in the `Albumentations` class in `yolov5/utils/augmentations.py`. Note these Albumentations operations run in addition to the YOLOv5 hyperparameter augmentations, i.e. those defined in `hyp.scratch.yaml`.
```python
class Albumentations:
    # YOLOv5 Albumentations class (optional, used if package is installed)
    def __init__(self):
        self.transform = None
        try:
            import albumentations as A
            check_version(A.__version__, '1.0.0')  # version requirement
            self.transform = A.Compose([
                A.Blur(p=0.1),
                A.MedianBlur(p=0.1),
                A.ToGray(p=0.01)],
                bbox_params=A.BboxParams(format='yolo', label_fields=['class_labels']))
            logging.info(colorstr('albumentations: ') + ', '.join(f'{x}' for x in self.transform.transforms))
        except ImportError:  # package not installed, skip
            pass
        except Exception as e:
            logging.info(colorstr('albumentations: ') + f'{e}')

    def __call__(self, im, labels, p=1.0):
        if self.transform and random.random() < p:
            new = self.transform(image=im, bboxes=labels[:, 1:], class_labels=labels[:, 0])  # transformed
            im, labels = new['image'], np.array([[c, *b] for c, b in zip(new['class_labels'], new['bboxes'])])
        return im, labels
```
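The optional-import pattern above can be exercised with a minimal standalone sketch (the class name, transform choice, and shapes here are hypothetical; like the YOLOv5 class, it degrades to a no-op when albumentations is not installed):

```python
import random
import numpy as np

class OptionalAug:
    # Build the transform only if albumentations imports cleanly;
    # otherwise fall back to returning the inputs unchanged.
    def __init__(self):
        self.transform = None
        try:
            import albumentations as A  # optional dependency
            self.transform = A.Compose([A.Blur(p=0.1)],
                bbox_params=A.BboxParams(format='yolo', label_fields=['class_labels']))
        except ImportError:
            pass

    def __call__(self, im, labels, p=1.0):
        # labels: (n, 5) array of [class, x, y, w, h] in normalized yolo format
        if self.transform and random.random() < p:
            new = self.transform(image=im, bboxes=labels[:, 1:], class_labels=labels[:, 0])
            im = new['image']
            labels = np.array([[c, *b] for c, b in zip(new['class_labels'], new['bboxes'])])
        return im, labels

aug = OptionalAug()
im = np.zeros((64, 64, 3), dtype=np.uint8)
labels = np.array([[0, 0.5, 0.5, 0.2, 0.2]], dtype=np.float32)
im2, labels2 = aug(im, labels)
```

Whether or not the package is present, the call returns an image and a label array of the same shapes, so downstream training code needs no branching.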
Example `train_batch0.jpg` on the COCO128 dataset with Blur, MedianBlur and ToGray. See the YOLOv5 Notebooks to reproduce:
To receive this YOLOv5 update:
- **Git** – `git pull` from within your `yolov5/` directory, or `git clone https://github.com/ultralytics/yolov5` again
- **PyTorch Hub** – `model = torch.hub.load('ultralytics/yolov5', 'yolov5s', force_reload=True)`
- **Docker** – `sudo docker pull ultralytics/yolov5:latest` to update your image

Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!
I've realised playing with these models that performance is degraded when using regular resizing vs letterbox resizing. I would suggest not training with letterbox resizing, and instead add some augmentation whereby images are stretched, rotated, cropped, etc during training.