albumentations-team / albumentations

Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
https://albumentations.ai
MIT License

Unexpected Negative Coordinates in YOLOv8 Annotations after Transformations #2174

Open qweqweq1222 opened 13 hours ago

qweqweq1222 commented 13 hours ago

Description: I am encountering an issue while applying transformations to YOLOv8 annotations (object detection format) using albumentations. Below is a minimal reproducible example:

import cv2
import albumentations as A

image = cv2.imread("....")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
bboxes = [[0.231627, 0.014903, 0.047738, 0.029807]]  
category_ids = [0]
category_id_to_name = {0: 'part'}

transform = A.Compose(
    [A.HorizontalFlip(p=0.5)],  # Horizontal flip transformation
    bbox_params=A.BboxParams(format='yolo', label_fields=['category_ids'])
)

transformed = transform(image=image, bboxes=bboxes, category_ids=category_ids)

Error: When applying different transformations (even an empty one), I consistently receive the following error:

ValueError: Expected y_min for bbox [ 2.0775801e-01 -5.0012022e-07  2.5549600e-01  2.9806498e-02  0.0000000e+00] to be in the range [0.0, 1.0], got -5.00120222568512e-07.

Observations: The input image shape is (672, 504, 3). The original bounding box coordinates are well within the valid range [0.0, 1.0]. Drawing the bounding boxes on the original image using a custom function shows correct placement and dimensions. The error consistently produces the same invalid numbers:

[2.0775801e-01, -5.0012022e-07, 2.5549600e-01, 2.9806498e-02, 0.0000000e+00]

Custom Function for Verification: Below is the custom function I used to verify the correctness of the bounding boxes by visualizing them on the image:

import cv2

def draw_bboxes(image_path, annotations_path):
    image = cv2.imread(image_path)
    height, width, _ = image.shape
    with open(annotations_path, 'r') as f:
        annotations = f.readlines()
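    # Draw one rectangle per annotation line: "class x_center y_center width height"
    for annotation in annotations:
        _, x_center, y_center, bbox_width, bbox_height = map(float, annotation.strip().split())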
        x_center = int(x_center * width)
        y_center = int(y_center * height)
        bbox_width = int(bbox_width * width)
        bbox_height = int(bbox_height * height)
        x1 = int(x_center - bbox_width / 2)
        y1 = int(y_center - bbox_height / 2)
        x2 = int(x_center + bbox_width / 2)
        y2 = int(y_center + bbox_height / 2)

        cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)  # Green, line thickness = 2

    # Save the result
    cv2.imwrite('output_image.jpg', image)

Notes: The bounding boxes drawn using this function align correctly with the objects in the image. The issue seems to be related to how albumentations processes the YOLO format during transformations.

Questions: Where do the negative y_min values originate, given that the original annotations are valid? Is this a bug in the library, or is there a mistake in the way I define or pass the annotations? Any help in resolving this would be greatly appreciated!

ternaus commented 9 hours ago

Passing clip=True to BboxParams should resolve the issue:

https://albumentations.ai/docs/api_reference/full_reference/?h=clip%3D#albumentations.core.bbox_utils.BboxParams
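
As for where the negative value comes from, a minimal sketch in plain Python (no albumentations involved) suggests it is already present in the annotation once the YOLO box is converted to corner coordinates: half of the height, 0.029807 / 2 = 0.0149035, is slightly larger than y_center = 0.014903. The helper names yolo_to_corners and clip_corners below are hypothetical and only illustrate the arithmetic; they are not the library's implementation of the conversion or of clip=True.

```python
def yolo_to_corners(x_center, y_center, width, height):
    """Convert a normalized YOLO box to (x_min, y_min, x_max, y_max)."""
    half_w, half_h = width / 2, height / 2
    return (x_center - half_w, y_center - half_h,
            x_center + half_w, y_center + half_h)


def clip_corners(box, lo=0.0, hi=1.0):
    """Clamp every coordinate into [lo, hi].

    Hypothetical helper that mimics the effect of clipping;
    not taken from the albumentations source.
    """
    return tuple(min(max(v, lo), hi) for v in box)


bbox = [0.231627, 0.014903, 0.047738, 0.029807]  # box from the report above

corners = yolo_to_corners(*bbox)
print(corners)   # y_min (second value) comes out slightly below zero

clipped = clip_corners(corners)
print(clipped)   # every coordinate now lies inside [0.0, 1.0]
```

The unclipped corners agree with the numbers in the error message up to float rounding. In the reproduction above, the suggested fix would amount to A.BboxParams(format='yolo', label_fields=['category_ids'], clip=True).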