leiouultraman opened 2 months ago
If you'd like to apply OAMix data augmentation to the Cityscapes dataset and generate a new augmented dataset while keeping the original images, you can follow this approach:
Modify the training pipeline: add the `OAMix` transformation with the `keep_orig` argument set to `True`. This ensures that the original images are retained alongside the OA-Mixed images. Your pipeline would look like this:
```python
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=[(2048, 800), (2048, 1024)], keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='OAMix', version='augmix', num_views=2, keep_orig=True, ...),  # <--
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'img2', 'gt_bboxes', 'gt_bboxes2', 'gt_labels', 'multilevel_boxes', 'oamix_boxes']),  # <--
]
```
In this setup, the `OAMix` transformation generates a new augmented image (`img2`) while the original image (`img`) is kept. The data loader will output both the original and the OA-Mixed images.
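As a framework-free illustration of the key layout the `Collect` step above expects (the actual arrays come from `OAMix`; the shapes and values here are placeholders):

```python
import numpy as np

# Placeholder original image and one ground-truth box in (x1, y1, x2, y2).
img = np.zeros((1024, 2048, 3), dtype=np.uint8)
gt_bboxes = np.array([[10, 20, 110, 220]], dtype=np.float32)

# With keep_orig=True, both views travel through the pipeline under
# separate keys, each paired with its own boxes.
results = {
    'img': img,                      # original view, kept by keep_orig=True
    'img2': img.copy(),              # stand-in for the OA-Mixed view
    'gt_bboxes': gt_bboxes,
    'gt_bboxes2': gt_bboxes.copy(),  # boxes for the augmented view
    'gt_labels': np.array([0]),
}

# Collect simply picks these keys, so both views must be present.
assert results['img'].shape == results['img2'].shape
assert results['gt_bboxes'].shape == results['gt_bboxes2'].shape
```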
If you want to include a third image: if you wish to augment the dataset further by introducing a third image (e.g., from another augmentation), you can modify the pipeline to include an additional transformation (e.g., `NewAugment`). This will generate an additional image (`img3`) with its own bounding boxes (`gt_bboxes3`):
```python
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=[(2048, 800), (2048, 1024)], keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='OAMix', version='augmix', num_views=2, keep_orig=True, ...),
    dict(type='NewAugment', ...),  # Add your custom augmentation here
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'img2', 'img3', 'gt_bboxes', 'gt_bboxes2', 'gt_bboxes3', 'gt_labels', 'multilevel_boxes', 'oamix_boxes']),
]
```
Additionally, you would define the `NewAugment` class to handle the third image's augmentation, registering it so the pipeline can find it by its `type` string:

```python
from mmdet.datasets.builder import PIPELINES


@PIPELINES.register_module()
class NewAugment:
    def __init__(self, ...):
        pass

    def __call__(self, results, *args, **kwargs):
        # Generate the third view from a copy of the original image.
        results['img3'] = self.another_augment(results['img'].copy(), ...)
        return results
```
With this setup, the data loader will output the original image, the OA-Mixed image, and a third augmented image, each with its respective bounding boxes and labels. Make sure that the keys in `Collect` include all images (`img`, `img2`, `img3`) and their associated annotations.
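As a runnable, framework-free sketch of such a transform (the class name, key names, and the brightness jitter are illustrative stand-ins, not part of OAMix):

```python
import numpy as np


class BrightnessView:
    """Hypothetical stand-in for NewAugment: adds a brightness-jittered
    third view to an mmdetection-style results dict."""

    def __init__(self, delta=30):
        self.delta = delta

    def __call__(self, results):
        # Work in a wider dtype so the addition cannot wrap around.
        img = results['img'].astype(np.int16)
        results['img3'] = np.clip(img + self.delta, 0, 255).astype(np.uint8)
        # Brightness does not move objects, so the boxes are simply copied.
        results['gt_bboxes3'] = results['gt_bboxes'].copy()
        return results


results = {
    'img': np.zeros((64, 64, 3), dtype=np.uint8),
    'gt_bboxes': np.array([[4, 4, 32, 32]], dtype=np.float32),
}
out = BrightnessView(delta=30)(results)
assert out['img3'].shape == out['img'].shape
assert (out['img3'] == 30).all()
```

A real transform would also need the `@PIPELINES.register_module()` decorator and a `dict(type=...)` entry in the config, as shown earlier in the thread.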
Thank you very much for your reply! Maybe I didn't express it clearly. What I mean is: write a Python file that augments the images of the source dataset and saves the augmented images. I wrote a piece of code myself, but there may be a problem. Looking forward to your reply. Following is my code:

```python
import os

import cv2
import numpy as np
from tqdm import tqdm

from oa_mix import OAMix


def load_yolo_bboxes(txt_file, img_width, img_height):
    bboxes = []
    with open(txt_file, 'r') as f:
        for line in f:
            class_id, x_center, y_center, width, height = map(float, line.strip().split())
            # Convert normalized YOLO (cx, cy, w, h) to absolute (x1, y1, x2, y2).
            x1 = (x_center - width / 2) * img_width
            y1 = (y_center - height / 2) * img_height
            x2 = (x_center + width / 2) * img_width
            y2 = (y_center + height / 2) * img_height
            bboxes.append([x1, y1, x2, y2])
    return np.array(bboxes, dtype=np.float32)


def load_image(image_path):
    img = cv2.imread(image_path)
    if img is None:
        raise ValueError(f"Image {image_path} could not be loaded.")
    return img


def save_image(image_path, img):
    cv2.imwrite(image_path, img)


def main(image_path, label_path, output_image_path):
    img = load_image(image_path)
    h_img, w_img = img.shape[:2]
    bboxes = load_yolo_bboxes(label_path, w_img, h_img)
    oamix = OAMix(version='augmix', num_views=1, keep_orig=False,
                  severity=10, mixture_depth=2)
    results = {'img': img, 'gt_bboxes': bboxes}
    augmented_results = oamix(results)
    augmented_img = augmented_results['img']
    save_image(output_image_path, augmented_img)


if __name__ == "__main__":
    image_folder = 'images/'
    label_folder = 'labels/'
    output_folder = 'augmented_images/'

    if not os.path.exists(output_folder):
        os.makedirs(output_folder)

    image_files = [f for f in os.listdir(image_folder)
                   if f.endswith('.jpg') or f.endswith('.png')]
    for image_filename in tqdm(image_files, desc="Processing images"):
        image_path = os.path.join(image_folder, image_filename)
        label_path = os.path.join(
            label_folder,
            image_filename.replace('.jpg', '.txt').replace('.png', '.txt'))
        output_image_path = os.path.join(output_folder, image_filename)
        main(image_path, label_path, output_image_path)
```
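One quick sanity check on the label parsing above: YOLO labels store normalized center coordinates, so the conversion can be verified by hand on a single made-up line (the helper below just restates the arithmetic from `load_yolo_bboxes`):

```python
import numpy as np


def yolo_to_xyxy(line, img_width, img_height):
    """Convert one YOLO label line 'cls cx cy w h' (normalized)
    to absolute (x1, y1, x2, y2) pixel coordinates."""
    _cls, cx, cy, w, h = map(float, line.split())
    return np.array([(cx - w / 2) * img_width,
                     (cy - h / 2) * img_height,
                     (cx + w / 2) * img_width,
                     (cy + h / 2) * img_height], dtype=np.float32)


# A box centered at (0.5, 0.5) covering half the image in each dimension
# of a 2048x1024 frame:
box = yolo_to_xyxy("0 0.5 0.5 0.5 0.5", img_width=2048, img_height=1024)
assert box.tolist() == [512.0, 256.0, 1536.0, 768.0]
```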
The code you shared looks fine at first glance. Could you please provide the error logs, if any? That would help in identifying the issue more accurately and assisting you better.
Thanks for your reply! I have successfully run the code and generated a new augmented dataset for experimentation, but found that the final result is not even as good as the AugMix data augmentation method, so I have some questions.
Thank you for sharing your feedback!
It's interesting to hear that the generated augmented dataset does not perform as well as the AugMix augmentation method; several factors could be influencing the results.
If you'd like, feel free to share additional details or suggestions for improving the augmentation process. Your insights can help us refine and enhance the method further.
Looking forward to your response!
Hello, I would like to ask a question about data augmentation. If I only want to apply OAMix data augmentation to the Cityscapes dataset and generate a new augmented dataset, how should I proceed? I really appreciate your help.