ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0
50.22k stars 16.21k forks

Mosaic augmentation and normalization #3358

Closed yannclaes closed 3 years ago

yannclaes commented 3 years ago

❔Question

Hi, I've read that YOLOv5 implements mosaic augmentation, but I cannot find where it is performed. I have looked into yolo.py but only saw scales and flips.

Also, do you perform any image normalization before feeding images into the backbone? If so, what are the mean and std vectors being used?

glenn-jocher commented 3 years ago

@yannclaes see the YOLOv5 trainloader here in datasets.py: https://github.com/ultralytics/yolov5/blob/407dc5008e47b1aad5ce69f0c91b4f1ec321dd7f/utils/datasets.py#L347

Mosaics are created here: https://github.com/ultralytics/yolov5/blob/407dc5008e47b1aad5ce69f0c91b4f1ec321dd7f/utils/datasets.py#L674-L727
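For readers who want a rough picture of what a 4-image mosaic does before reading the linked code, here is a minimal sketch. It is not the implementation in datasets.py: the function name `simple_mosaic`, the gray fill value of 114, the top-left cropping rule, and the range of the random center are all assumptions made for illustration, and label (bounding box) remapping is omitted entirely.

```python
import numpy as np


def simple_mosaic(images, size=640, seed=0):
    """Paste four HxWx3 uint8 images around a random center on a (2*size)x(2*size) canvas.

    Hypothetical helper for illustration only; not the YOLOv5 function.
    """
    assert len(images) == 4, "a mosaic here is built from exactly four images"
    rng = np.random.default_rng(seed)

    # Gray canvas; the fill value 114 is an assumption of this sketch.
    canvas = np.full((2 * size, 2 * size, 3), 114, dtype=np.uint8)

    # Random mosaic center, kept away from the canvas border.
    xc = int(rng.integers(size // 2, 3 * size // 2))
    yc = int(rng.integers(size // 2, 3 * size // 2))

    # Destination regions as (x1, y1, x2, y2):
    # top-left, top-right, bottom-left, bottom-right of the center.
    regions = [
        (0, 0, xc, yc),
        (xc, 0, 2 * size, yc),
        (0, yc, xc, 2 * size),
        (xc, yc, 2 * size, 2 * size),
    ]

    for img, (x1, y1, x2, y2) in zip(images, regions):
        # Crop each image from its own top-left corner so it fits the region.
        h = min(y2 - y1, img.shape[0])
        w = min(x2 - x1, img.shape[1])
        canvas[y1:y1 + h, x1:x1 + w] = img[:h, :w]

    return canvas
```

In a real training pipeline the boxes of each source image would also have to be shifted and clipped to the new canvas, and the result is typically cropped or resized back to the training resolution; the linked lines are the place to check how YOLOv5 handles both.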

yannclaes commented 3 years ago

Thanks! So it seems you are not performing image normalization before passing the images to the model, is that right?

glenn-jocher commented 3 years ago

@yannclaes images are rescaled from 0-255 to 0-1, but no, they are not normalized (zero mean, unit variance).
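The difference between the two operations can be sketched in a few lines of NumPy. The helper names `rescale` and `standardize` are hypothetical and are not taken from the YOLOv5 code base. The mean and std constants are the commonly used ImageNet statistics, included only to show what the skipped normalization step would look like.

```python
import numpy as np

# Commonly used ImageNet channel statistics (RGB order), for illustration only.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def rescale(img_uint8):
    """Map a uint8 HxWx3 image from [0, 255] to float32 in [0, 1]."""
    return img_uint8.astype(np.float32) / 255.0


def standardize(img_uint8, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Rescale to [0, 1], then subtract a per-channel mean and divide by a per-channel std."""
    return (rescale(img_uint8) - mean) / std


# One pixel with channel values 0, 128 and 255.
pixel = np.array([[[0, 128, 255]]], dtype=np.uint8)
scaled = rescale(pixel)            # stays within [0, 1]
standardized = standardize(pixel)  # roughly zero-centered, can be negative
```

According to the reply above, only the first step (the division by 255) is applied to the images.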

github-actions[bot] commented 3 years ago

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

doo5643 commented 2 years ago

@yannclaes images are rescaled from 0-255 to 0-1, but no, they are not normalized (zero mean, unit variance).

@glenn-jocher Why didn't you add zero-mean, unit-variance normalization, which is commonly used in other training scripts? For example, mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225] in the ImageNet case.

glenn-jocher commented 2 years ago

@doo5643 it doesn't help. If you find otherwise, though, please let us know.