albumentations-team / albumentations

Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
https://albumentations.ai
MIT License

[TensorFlow] CoarseDropout produces black stripes in random positions on the image #911

Closed roma-glushko closed 3 years ago

roma-glushko commented 3 years ago

🐛 Bug

I'm using albumentations with TF-GPU 2.5, and in this setup the CoarseDropout augmentation produces black stripes on my training images for some reason:

Screenshot 2021-05-27 at 13 03 37

Here is how my data loading happens:

def augment_image(inputs, labels, augmentation_pipeline: a.Compose, seed: int = 42):
    def apply_augmentation(images):
        random.seed(seed)
        np.random.seed(seed)

        aug_data = augmentation_pipeline(image=images.astype('uint8'))

        return aug_data['image']

    inputs = tf.numpy_function(func=apply_augmentation, inp=[inputs], Tout=tf.uint8)

    return inputs, labels

def get_dataset(
        dataset_path: str,
        subset_type: str,
        augmentation_pipeline: a.Compose,
        validation_fraction: float = 0.2,
        batch_size: int = 32,
        image_size: Tuple[int, int] = (300, 300),
        seed: int = 42
) -> tf.data.Dataset:
    augmentation_func = partial(
        augment_image,
        augmentation_pipeline=augmentation_pipeline,
        seed=seed,
    )

    dataset = image_dataset_from_directory(
        dataset_path,
        subset=subset_type,
        class_names=class_names,
        validation_split=validation_fraction,
        image_size=image_size,
        batch_size=batch_size,
        seed=seed,
    )

    return dataset \
        .map(augmentation_func, num_parallel_calls=AUTOTUNE) \
        .prefetch(AUTOTUNE)

Then I run the following snippet and get the stripes on my examples:

train_dataset = get_dataset(
    config.train_dataset_path,
    'training',
    config.train_augmentation,
    validation_fraction=0.2,
    batch_size=config.batch_size,
    image_size=config.image_size,
    seed=config.seed,
)

plt.figure(figsize=(10, 10))

for image_batch, _ in train_dataset.take(1):
    for idx in range(9):
        image = image_batch[idx].numpy().astype('uint8')

        ax = plt.subplot(3, 3, idx + 1)
        plt.imshow(image)
        plt.axis('off')

The stripes go away when I comment out the CoarseDropout augmentation in my augmentation pipeline, which looks like this during training:

args['train_augmentation'] = a.Compose([
    a.VerticalFlip(),
    a.HorizontalFlip(),
    a.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.1, brightness_by_max=False),
    a.CoarseDropout(p=0.0, max_holes=20, max_height=8, max_width=8, min_holes=10, min_height=8, min_width=8),
    a.GaussNoise(p=1.0, var_limit=(10.0, 50.0)),
])

Screenshot 2021-05-27 at 13 05 29

To Reproduce

Steps to reproduce the behavior:

  1. Clone the project state at the 0.1.0-bugrep tag:

    git clone --depth 1 --branch 0.1.0-bugrep https://github.com/roma-glushko/rock-paper-scissor

  2. Pull the dataset:

    cd data
    kaggle datasets download --unzip frtgnn/rock-paper-scissor

  3. Install the project dependencies:

    poetry install

  4. Make sure the CoarseDropout augmentation is always enabled in the config file: https://github.com/roma-glushko/rock-paper-scissor/blob/master/configs/basic_config.py

  5. Run the notebook: https://github.com/roma-glushko/rock-paper-scissor/blob/master/notebooks/data_augmentation.ipynb

Expected behavior

I expect CoarseDropout to always produce rectangles of the defined size:

Screenshot 2021-05-27 at 13 02 53

Environment

Dipet commented 3 years ago

I cannot reproduce the problem.

import albumentations as a
import cv2 as cv
import matplotlib.pyplot as plt

augs = a.Compose([
    a.VerticalFlip(p=1),
    a.HorizontalFlip(p=1),
    a.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.1, brightness_by_max=False, p=1),
    a.CoarseDropout(p=1.0, max_holes=20, max_height=8, max_width=8, min_holes=10, min_height=8, min_width=8),
    a.GaussNoise(p=1.0, var_limit=(10.0, 50.0)),
])

img = cv.imread("/home/dipet/Pictures/paper01-000.png")
img = cv.cvtColor(img, cv.COLOR_BGR2RGB)

plt.subplot(211)
plt.imshow(img, vmin=0, vmax=255)
plt.subplot(212)
plt.imshow(augs(image=img)["image"], vmin=0, vmax=255)
plt.show()

Could you provide a random seed that reproduces the problem? Or a dump of the applied arguments from ReplayCompose, together with the images they were applied to?

roma-glushko commented 3 years ago

@Dipet thank you for the reply!

It makes sense to me that the snippet above does not reproduce the issue. I suspect image_dataset_from_directory() has something to do with it (possibly related to https://github.com/albumentations-team/albumentations/issues/905), so you are more likely to reproduce it if you load the images the same way I do.

In any case, here is an archive with the albumentations state after running:

for image_batch, _ in train_dataset.take(1):
    for idx in range(9):
        image = image_batch[idx].numpy().astype('uint8')

        ax = plt.subplot(3, 3, idx + 1)
        plt.imshow(image)
        plt.axis('off')

Screenshot 2021-05-27 at 13 45 47

https://drive.google.com/file/d/13Tf-iFM7hjBqH7jntys3SqFHQdE0BDPG/view?usp=sharing

Dipet commented 3 years ago

It looks like you are applying the augmentation to a batch of images rather than to a single image. Try changing the line

aug_data = augmentation_pipeline(image=images.astype('uint8'))

to:

res_images = []
for img in images:
    aug_data = augmentation_pipeline(image=img.astype('uint8'))
    res_images.append(aug_data["image"])
return np.stack(res_images)
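
To illustrate why a batch yields stripes instead of an error, here is a NumPy-only sketch. `cut_hole` is a hypothetical, simplified stand-in for a dropout hole and not the library's implementation. It assumes the hole is cut along the first two axes of the array, which are (height, width) for a single image but (batch, height) for a batch.

```python
import numpy as np


def cut_hole(img, y1, x1, size=8):
    """Zero a size x size patch along the first two axes (hypothetical helper)."""
    out = img.copy()
    out[y1:y1 + size, x1:x1 + size] = 0
    return out


# A single image: (height, width, channels)
image = np.full((300, 300, 3), 255, dtype=np.uint8)
# A batch such as image_dataset_from_directory yields: (batch, height, width, channels)
batch = np.full((32, 300, 300, 3), 255, dtype=np.uint8)

holed_image = cut_hole(image, y1=100, x1=50)
holed_batch = cut_hole(batch, y1=4, x1=50)

# Single image: an 8x8 square of black pixels.
print((holed_image == 0).all(axis=-1).sum())     # 64
# Batch: image 4 gets 8 full-width black rows, i.e. a stripe.
print((holed_batch[4] == 0).all(axis=-1).sum())  # 2400
# Images outside the "hole" along the batch axis are untouched.
print((holed_batch[3] == 0).any())               # False
```

With the batch input, the "hole" spans 8 consecutive images and 8 rows, and covers the full image width because the width axis is never sliced. That matches the stripes in the screenshots, and it is why looping over the images one at a time restores the expected squares.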

roma-glushko commented 3 years ago

@Dipet yes, that seems to be the cause of the issue. Thank you for the help 👍