WoojuLee24 / OA-DG

Object-Aware Domain Generalization for Object Detection
GNU General Public License v3.0

OAMIX #13

Open leiouultraman opened 3 weeks ago

leiouultraman commented 3 weeks ago

Hello, I would like to ask a question about data augmentation. If I only want to apply OAMix data augmentation on the Cityscapes dataset and generate a new augmented dataset, how should I proceed? I really appreciate your help.

dazory commented 1 day ago

If you'd like to apply OAMix data augmentation to the Cityscapes dataset and generate a new augmented dataset while keeping the original images, you can follow this approach:

  1. Modifying the training pipeline: Add the OAMix transform with the keep_orig argument set to True, so that the original images are retained alongside the OA-Mixed images. The pipeline would look like this:

    train_pipeline = [
       dict(type='LoadImageFromFile'),
       dict(type='LoadAnnotations', with_bbox=True),
       dict(type='Resize', img_scale=[(2048, 800), (2048, 1024)], keep_ratio=True),
       dict(type='RandomFlip', flip_ratio=0.5),
       dict(type='OAMix', version='augmix', num_views=2, keep_orig=True, ...), # <--
       dict(type='Normalize', **img_norm_cfg),
       dict(type='Pad', size_divisor=32),
       dict(type='DefaultFormatBundle'),
       dict(type='Collect', keys=['img', 'img2', 'gt_bboxes', 'gt_bboxes2', 'gt_labels', 'multilevel_boxes', 'oamix_boxes']), # <--
    ]

    In this setup, the OAMix transformation is applied to generate a new augmented image (img2), while the original image (img) is kept. The data loader will output both the original and the OA-Mixed images.

  2. Including a third image: To augment the dataset further with a third view (e.g., from another augmentation), add an extra transformation (e.g., NewAugment) to the pipeline. It generates an additional image (img3) with its own bounding boxes (gt_bboxes3):

    train_pipeline = [
       dict(type='LoadImageFromFile'),
       dict(type='LoadAnnotations', with_bbox=True),
       dict(type='Resize', img_scale=[(2048, 800), (2048, 1024)], keep_ratio=True),
       dict(type='RandomFlip', flip_ratio=0.5),
       dict(type='OAMix', version='augmix', num_views=2, keep_orig=True, ...),
       dict(type='NewAugment', ...), # Add your custom augmentation here
       dict(type='Normalize', **img_norm_cfg),
       dict(type='Pad', size_divisor=32),
       dict(type='DefaultFormatBundle'),
       dict(type='Collect', keys=['img', 'img2', 'img3', 'gt_bboxes', 'gt_bboxes2', 'gt_bboxes3', 'gt_labels', 'multilevel_boxes', 'oamix_boxes']),
    ]

    Additionally, you would define and register the NewAugment class to handle the third image's augmentation (another_augment here is a placeholder for your own augmentation logic):

    from mmdet.datasets.builder import PIPELINES

    @PIPELINES.register_module()  # so dict(type='NewAugment') resolves in the config
    class NewAugment:
       def __init__(self, ...):
           pass
       def __call__(self, results, *args, **kwargs):
           # Augment a copy of the original image into a third view.
           results['img3'] = self.another_augment(results['img'].copy(), ...)
           # Provide matching boxes so Collect can gather 'gt_bboxes3'.
           results['gt_bboxes3'] = results['gt_bboxes'].copy()
           return results
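The keep_orig semantics from step 1 can be sketched with a toy transform. Note that ToyMix and its pixel-offset "augmentation" below are invented for illustration only; they are not this repo's OAMix, which does object-aware mixing:

```python
class ToyMix:
    """Toy stand-in for OAMix: writes an augmented view to 'img2'
    while leaving the original 'img' untouched when keep_orig=True."""

    def __init__(self, keep_orig=True):
        self.keep_orig = keep_orig

    def __call__(self, results):
        img = results['img']
        # Fake "augmentation": brighten every pixel value by a constant.
        augmented = [min(v + 10, 255) for v in img]
        if self.keep_orig:
            results['img2'] = augmented  # new view alongside the original
            results['gt_bboxes2'] = list(results['gt_bboxes'])  # boxes unchanged here
        else:
            results['img'] = augmented  # overwrite the original in place
        return results

results = {'img': [0, 100, 250], 'gt_bboxes': [(0, 0, 10, 10)]}
out = ToyMix(keep_orig=True)(results)
print(out['img'])   # original kept: [0, 100, 250]
print(out['img2'])  # augmented view: [10, 110, 255]
```

This is why the Collect keys must list both img and img2 with their respective boxes: after the transform runs, both views live side by side in the same results dict.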

With this setup, the data loader will output the original image, the OA-Mixed image, and the third augmented image, each with its respective bounding boxes and labels.

Make sure that the keys in Collect include all images (img, img2, img3) and their associated annotations.
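Since the goal is a saved augmented dataset rather than on-the-fly training data, the same pipeline idea can also be run offline. A minimal sketch follows; the Compose class and the upper_img placeholder transform are generic Python written for illustration, not this repo's or mmdetection's API:

```python
class Compose:
    """Minimal stand-in for a pipeline composer: apply transforms in order."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, results):
        for t in self.transforms:
            results = t(results)
        return results


def upper_img(results):
    # Placeholder augmentation producing a second view from the original.
    results['img2'] = results['img'].upper()
    return results


pipeline = Compose([upper_img])
dataset = [{'img': 'abc'}, {'img': 'xyz'}]
# Each output sample holds both the original 'img' and the new 'img2';
# in a real offline run you would write 'img2' to disk at this point.
augmented = [pipeline(dict(sample)) for sample in dataset]
```

In practice you would replace upper_img with the OAMix transform from this repo and save each img2 to disk inside the loop to produce the new dataset.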