ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

With mosaic augmentation, are only 3 of the 4 images shuffled? #1016

Closed bsugerman closed 3 years ago

bsugerman commented 3 years ago

Looking at the create_dataloader and load_mosaic functions, it looks like the order of the first image in each mosaic never changes, while the 2nd–4th images are random choices from the full dataset. In other words, the first mosaic uses image[0] from the image data list plus 3 random ones, the 2nd mosaic uses image[1] plus 3 random ones, etc. However, the image list itself is never shuffled. Is that correct?
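A minimal sketch of the behavior described above (mosaic_indices and its signature are hypothetical, not the actual load_mosaic code):

```python
import random

def mosaic_indices(i, n):
    # Hypothetical sketch of the described behavior: the first tile is
    # always the i-th image in dataset order; the remaining three tiles
    # are uniform random draws from the whole dataset of n images.
    return [i] + [random.randrange(n) for _ in range(3)]

# Mosaic 0 always starts with image 0, mosaic 1 with image 1, and so on,
# because the underlying index list is never shuffled.
print(mosaic_indices(0, 100)[0])  # 0
```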

glenn-jocher commented 3 years ago

@bsugerman I looked into the code and yes, your interpretation is correct.

If you wanted to shuffle image 0 in the mosaic, you would pass an additional shuffle=True argument to the dataloader here: https://github.com/ultralytics/yolov5/blob/702c4fa53eeaecfa5563c1441cb4a0c4aa8e908e/utils/datasets.py#L66-L71
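For reference, PyTorch's torch.utils.data.DataLoader exposes this option directly; a minimal sketch with a toy dataset standing in for YOLOv5's image list:

```python
import torch
from torch.utils.data import DataLoader

data = list(range(8))  # toy dataset standing in for the image list
loader = DataLoader(data, batch_size=4, shuffle=True)  # shuffle=True reshuffles indices every epoch

# The same elements come back each epoch, just in a random order.
seen = torch.cat([batch for batch in loader]).tolist()
print(sorted(seen))  # [0, 1, 2, 3, 4, 5, 6, 7]
```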

glenn-jocher commented 3 years ago

@bsugerman BTW, if you notice any improved results with this change please let us know!

NanoCode012 commented 3 years ago

Btw, I don't think you should set shuffle in DDP mode, since the current code already uses a sampler there.

sampler (Sampler or Iterable, optional) – defines the strategy to draw samples from the dataset. Can be any Iterable with len implemented. If specified, shuffle must not be specified.

From https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader

An option would be to implement RandomSampler or similar for Single GPU.

https://pytorch.org/docs/stable/data.html
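A quick sketch of that single-GPU alternative with toy data: RandomSampler in place of shuffle=True, and a demonstration that passing both raises an error, as the docs state:

```python
from torch.utils.data import DataLoader, RandomSampler, SequentialSampler

data = list(range(16))  # toy dataset

# Single-GPU: RandomSampler has the same effect as shuffle=True.
# (In DDP, a DistributedSampler would go here instead, with shuffle unset.)
loader = DataLoader(data, batch_size=4, sampler=RandomSampler(data))

# Passing both shuffle=True and a sampler is rejected by DataLoader.
try:
    DataLoader(data, batch_size=4, shuffle=True, sampler=SequentialSampler(data))
    conflict_ok = False
except ValueError:
    conflict_ok = True
print(conflict_ok)  # True
```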

glenn-jocher commented 3 years ago

@NanoCode012 ah good point. In the end I doubt shuffling would have a significant impact, as @bsugerman mentioned the train set is already 75% shuffled even without any shuffling.

bsugerman commented 3 years ago

This also means, however, that training uses each image an average of 4 times per epoch.
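A quick check of that arithmetic (a toy simulation, not YOLOv5 code): the loader yields one mosaic per dataset index, and each mosaic consumes 4 images, so an epoch makes 4n draws over n images, i.e. each image is used 4 times on average:

```python
import random
from collections import Counter

random.seed(0)
n = 1000
counts = Counter()
for i in range(n):  # one mosaic per dataset index
    counts.update([i] + [random.randrange(n) for _ in range(3)])

avg_uses = sum(counts.values()) / n
print(avg_uses)  # 4.0 (n mosaics x 4 images / n images)
```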


github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

glenn-jocher commented 2 years ago

@bsugerman @NanoCode012 good news 😃! Your original issue may now be fixed ✅ in PR #5623 by @werner-duvaud. This PR turns on shuffling in the YOLOv5 training DataLoader by default, which was missing until now. This works for all training formats: CPU, Single-GPU, Multi-GPU DDP.

train_loader, dataset = create_dataloader(train_path, imgsz, batch_size // WORLD_SIZE, gs, single_cls,
                                          hyp=hyp, augment=True, cache=opt.cache, rect=opt.rect, rank=LOCAL_RANK,
                                          workers=workers, image_weights=opt.image_weights, quad=opt.quad,
                                          prefix=colorstr('train: '), shuffle=True)  # <--- NEW

I evaluated this PR against master on VOC finetuning for 50 epochs, and the results show a slight improvement in most metrics and losses, particularly in objectness loss and mAP@0.5, suggesting that the shuffle addition may help delay overtraining.

https://wandb.ai/glenn-jocher/VOC

[Screenshot, 2021-11-13: W&B VOC training curves comparing this PR against master]

To receive this update, update your local copy of the repository to the latest master.

Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!