albumentations-team / albumentations

Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
https://albumentations.ai
MIT License

input keypoint num == input label num, output keypoint num != output label num #1312

Open yunshangyue71 opened 2 years ago

yunshangyue71 commented 2 years ago

🐛 Bug

I have recently been working on human pose estimation. On input, the number of labels equals the number of keypoints. After image augmentation, the number of keypoints is reduced, but the number of corresponding labels is not. Is this a bug?

To Reproduce

Steps to reproduce the behavior:

1. remove_invisible = True

"""Data processing and data augmentation; affects both the image and the ground-truth boxes."""

import albumentations as A
import wandb
remove_invisible = True  # after augmentation, keypoints that are no longer visible are dropped
config = wandb.config
trans1 = A.Compose(
    [
    # pixel-level, commonly used
    A.Blur(p = config.blur),
    A.GaussNoise(p = config.gaussion_noise),
    A.HueSaturationValue(p = config.hue_saturation),
    A.RandomBrightness(p = config.brightness),
    A.RandomContrast(p = config.constract),
    A.Sharpen(p = config.sharpen),

    A.PixelDropout(p=config.pixel_dropout),
    A.ShiftScaleRotate(shift_limit_x=config.shift_limit_x,
                       shift_limit_y= config.shift_limit_y,
                       scale_limit= config.scale_limit,
                       rotate_limit= config.rotate_limit,
                       p = config.shift_scale_rotate),

    # pixel-level, rarely used
    A.ZoomBlur(p = 0.05),
    A.ColorJitter(p = 0.05),
    A.RandomFog(p = 0.05),
    A.RandomRain(p=0.05),
    A.RandomSunFlare(p=0.05),

    # spatial, commonly used
    A.HorizontalFlip(p = config.horizontal_flip),
    A.VerticalFlip(p = config.vertical_flip),

    A.Perspective(p = config.perspective),

],
keypoint_params=A.KeypointParams(format="xy",  remove_invisible=remove_invisible)
)

trans3 = A.Compose(
    [
    A.Resize(config.input_h, config.input_w, p=1),
    A.Normalize(mean=config.mean,
                std=config.std,
                p=1),
    ],
    keypoint_params=A.KeypointParams(format="xy", remove_invisible=remove_invisible)
)

def aug_keypoint2d_train(img, kpt, labels):
    # apply image augmentation
    trans1_result = trans1(image = img, keypoints = kpt, class_labels = labels)
    img = trans1_result["image"]
    kpt = trans1_result["keypoints"]
    labels = trans1_result["class_labels"]

    img_h, img_w, c = img.shape
    pad = max(img_h, img_w)
    trans2 = A.Compose([
    A.PadIfNeeded(pad, pad) # pad to a square of side max(img_h, img_w)
        ],
    keypoint_params = A.KeypointParams(format="xy",  remove_invisible=remove_invisible)
    )
    trans2_result = trans2(image = img, keypoints = kpt, class_labels = labels)
    img = trans2_result["image"]
    kpt = trans2_result["keypoints"]
    labels = trans2_result["class_labels"]

    trans3_result = trans3(image = img, keypoints = kpt, class_labels = labels)
    img = trans3_result["image"]
    kpt = trans3_result["keypoints"]
    labels = trans3_result["class_labels"]
    return img, kpt, labels

def aug_keypoint2d_val(img, kpt, labels):
    # apply image augmentation
    # trans1_result = trans1(img, kpt)
    # img = trans1_result["image"]
    # kpt = trans1_result["keypoint"]

    img_h, img_w, c = img.shape
    pad = max(img_h, img_w)
    trans2 = A.Compose([
        A.PadIfNeeded(pad, pad)  # pad to a square of side max(img_h, img_w)
    ],
        keypoint_params=A.KeypointParams(format="xy", remove_invisible=remove_invisible, class_sides=labels)
    )
    trans2_result = trans2(image=img, keypoints=kpt)
    img = trans2_result["image"]
    kpt = trans2_result["keypoints"]
    labels = trans2_result["class_labels"]

    trans3_result = trans3(image=img, keypoints=kpt, class_sides=labels)
    img = trans3_result["image"]
    kpt = trans3_result["keypoints"]
    labels = trans3_result["class_labels"]
    return img, kpt, labels

Expected behavior

Environment

Additional context

yunshangyue71 commented 2 years ago

Sorry, this was a usage error on my side: I forgot to add label_fields=["class_labels"] to the keypoint params. I suggest raising an error in this case; otherwise it is really hard to debug. Thank you for your open-source project.
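For readers hitting the same symptom: label_fields=["class_labels"] is what ties the label list to the keypoints, so that a dropped keypoint takes its label with it. The sketch below is not the library's implementation. It is a plain-Python illustration, using the hypothetical helper names `filter_visible` and `check_keypoint_labels`, of what synchronized filtering looks like and of the kind of length check suggested in this comment.

```python
def filter_visible(keypoints, labels, width, height):
    """Keep only (x, y) keypoints inside the image, together with their labels.

    Illustrative only: a keypoint counts as visible here when
    0 <= x < width and 0 <= y < height.
    """
    if len(keypoints) != len(labels):
        raise ValueError("keypoints and labels must have the same length")
    kept = [
        (kp, label)
        for kp, label in zip(keypoints, labels)
        if 0 <= kp[0] < width and 0 <= kp[1] < height
    ]
    kept_keypoints = [kp for kp, _ in kept]
    kept_labels = [label for _, label in kept]
    return kept_keypoints, kept_labels


def check_keypoint_labels(result, label_field="class_labels"):
    """Raise early if a transform result has mismatched keypoints and labels."""
    n_keypoints = len(result["keypoints"])
    n_labels = len(result[label_field])
    if n_keypoints != n_labels:
        raise ValueError(
            f"{n_keypoints} keypoints but {n_labels} labels in '{label_field}'; "
            "was the label field registered with the keypoint params?"
        )
    return result


# Example: the second keypoint lies outside a 100x100 image and is dropped
# along with its label, so the check passes.
kpts, labels = filter_visible(
    [(10, 10), (150, 20)], ["nose", "left_hand"], width=100, height=100
)
check_keypoint_labels({"keypoints": kpts, "class_labels": labels})
```

Calling `check_keypoint_labels` on the output of each pipeline stage would have surfaced the mismatch from this issue at the first stage instead of somewhere downstream.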

yunshangyue71 commented 2 years ago

After the image is mirrored, the left and right keypoints of the human body (for example, the hands) end up mislabeled. Is there anything in the returned result that indicates the picture has been mirrored? Thank you.

Dipet commented 2 years ago

Take a look at FlipSymmetricKeypoints.
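The idea behind swapping symmetric keypoints can be sketched without the library. The snippet below does not reproduce the API of FlipSymmetricKeypoints; `hflip_keypoints` and the `swap_pairs` argument are hypothetical names, and the `width - 1 - x` mapping assumes integer pixel-index coordinates.

```python
def hflip_keypoints(keypoints, labels, width, swap_pairs):
    """Mirror (x, y) keypoints horizontally and swap left/right labels.

    swap_pairs is a list of (left_label, right_label) tuples; labels that
    appear in no pair (for example "nose") are left unchanged.
    """
    swap = {}
    for left, right in swap_pairs:
        swap[left] = right
        swap[right] = left
    # Assumption: x is a pixel index, so column 0 maps to column width - 1.
    flipped = [(width - 1 - x, y) for x, y in keypoints]
    new_labels = [swap.get(label, label) for label in labels]
    return flipped, new_labels


flipped, new_labels = hflip_keypoints(
    keypoints=[(10, 20), (90, 20), (50, 5)],
    labels=["left_hand", "right_hand", "nose"],
    width=100,
    swap_pairs=[("left_hand", "right_hand")],
)
```

After the flip, the point that was the person's left hand sits on the opposite side of the image and carries the label "right_hand", which keeps the labels anatomically consistent with the mirrored picture.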