Data augmentation issue

guillermovc commented 1 year ago

Hello, I don't know if the mosaic data augmentation is affecting my training. I see weird mosaics in the training folder, like the ones attached, where the boxes are obviously not in the right place. I have already checked the labels several times and they look normal to me (I'm using Roboflow). I'm also getting low accuracy scores, so I think this is where the problem is. I appreciate any help, thanks.

(Attached images: train_batch9, train_batch3)

yulin010101 commented 1 year ago

How did you get the example images? Do you get better training results if you turn off mosaic?

lin-fangzhou commented 1 year ago

Hello, I have the same problem. How can I deal with it?

yeldarby commented 1 year ago

Could you link to the dataset on Roboflow? I can have a look and see if it looks like it's on the data export end or the model ingestion/dataloader end.

simonlee6969 commented 1 year ago

Hi, have you solved this problem? I have been facing the same issue

guillermovc commented 1 year ago

> How did you get the example images? Do you get better training results if you turn off mosaic?

I got them from the training results folder, which shows validation images vs. ground-truth images.

guillermovc commented 1 year ago

> Hello, I have the same problem. How can I deal with it?

I was thinking that it might be a Roboflow problem. I have had some issues with it (not saving, for example). Are you using Roboflow for labeling as well?

guillermovc commented 1 year ago

> Hi, have you solved this problem? I have been facing the same issue

Are you using Roboflow to label your images?

simonlee6969 commented 1 year ago

Yes I am using roboflow

guillermovc commented 1 year ago

Would you please try to run this script? It draws all the bounding boxes that are in the txt files, so you can check whether the boxes are correct. It writes the annotated images to a folder called "show" inside the dataset folder (change OUTPUT_PATH to write somewhere else). Set DATASET_PATH to your data folder (the one that contains the .yaml file and the train, valid, and test folders).

import os
import cv2

# Folder that contains the .yaml file and the train, valid, and test folders.
DATASET_PATH = r"data/dataset_enero30_noaug/"
SPLITS = ["train", "valid", "test"]

# Annotated copies are written here; the folder is created if it is missing.
OUTPUT_PATH = os.path.join(DATASET_PATH, "show")
os.makedirs(OUTPUT_PATH, exist_ok=True)

for split in SPLITS:
    imgs_path = os.path.join(DATASET_PATH, split, "images")
    labels_path = os.path.join(DATASET_PATH, split, "labels")

    for img_name in sorted(os.listdir(imgs_path)):
        # Pair each image with the label file that shares its base name;
        # os.listdir does not guarantee that both folders come back in the same order.
        label_file = os.path.join(labels_path, os.path.splitext(img_name)[0] + ".txt")
        img = cv2.imread(os.path.join(imgs_path, img_name))
        if img is None or not os.path.exists(label_file):
            continue  # skip unreadable images and images without a label file

        img_height, img_width = img.shape[:2]

        with open(label_file, "r") as f:
            lines = f.readlines()

        for line in lines:
            if not line.strip():
                continue  # skip empty lines
            # YOLO bbox line: class x_center y_center width height (normalized to 0..1)
            cls, c1, c2, w, h = line.split()
            x1 = int(float(c1)*img_width - (float(w)*img_width)/2)
            x2 = int(float(c1)*img_width + (float(w)*img_width)/2)
            y1 = int(float(c2)*img_height - (float(h)*img_height)/2)
            y2 = int(float(c2)*img_height + (float(h)*img_height)/2)
            # Mark the box center and draw the box outline in green.
            cv2.circle(img, (int(img_width * float(c1)), int(img_height * float(c2))), 3, (0,255,0), -1)
            cv2.rectangle(img, (x1, y1), (x2, y2), (0,255,0), 3)

        cv2.imwrite(os.path.join(OUTPUT_PATH, img_name), img)

simonlee6969 commented 1 year ago

I'm afraid I cannot run this code, because some of my labels are polygon labels, and those lines contain more than the four bbox values.

guillermovc commented 1 year ago

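I think this modification will work: it reads only the first five fields of each line (the class ID plus the four bbox values) and ignores the rest. Note that this assumes the bbox values come first on a polygon line; if the line lists polygon vertices instead, the drawn box will not match the object.

import os
import cv2

# Folder that contains the .yaml file and the train, valid, and test folders.
DATASET_PATH = r"data/dataset_enero30_noaug/"
SPLITS = ["train", "valid", "test"]

# Annotated copies are written here; the folder is created if it is missing.
OUTPUT_PATH = os.path.join(DATASET_PATH, "show")
os.makedirs(OUTPUT_PATH, exist_ok=True)

for split in SPLITS:
    imgs_path = os.path.join(DATASET_PATH, split, "images")
    labels_path = os.path.join(DATASET_PATH, split, "labels")

    for img_name in sorted(os.listdir(imgs_path)):
        # Pair each image with the label file that shares its base name;
        # os.listdir does not guarantee that both folders come back in the same order.
        label_file = os.path.join(labels_path, os.path.splitext(img_name)[0] + ".txt")
        img = cv2.imread(os.path.join(imgs_path, img_name))
        if img is None or not os.path.exists(label_file):
            continue  # skip unreadable images and images without a label file

        img_height, img_width = img.shape[:2]

        with open(label_file, "r") as f:
            lines = f.readlines()

        for line in lines:
            if not line.strip():
                continue  # skip empty lines
            # Keep only the first five fields; [:4] would leave too few values to unpack.
            cls, c1, c2, w, h = line.split()[:5]
            x1 = int(float(c1)*img_width - (float(w)*img_width)/2)
            x2 = int(float(c1)*img_width + (float(w)*img_width)/2)
            y1 = int(float(c2)*img_height - (float(h)*img_height)/2)
            y2 = int(float(c2)*img_height + (float(h)*img_height)/2)
            # Mark the box center and draw the box outline in green.
            cv2.circle(img, (int(img_width * float(c1)), int(img_height * float(c2))), 3, (0,255,0), -1)
            cv2.rectangle(img, (x1, y1), (x2, y2), (0,255,0), 3)

        cv2.imwrite(os.path.join(OUTPUT_PATH, img_name), img)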
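
Taking the leading fields of a line only gives a correct box when the bbox values really come first. If the polygon lines instead follow the YOLO segmentation layout (class ID followed by normalized x y vertex pairs), the box can be computed from the smallest and largest vertex coordinates. That layout is an assumption about the export, not something confirmed in this thread, and `label_line_to_box` is a hypothetical helper name. A minimal sketch:

```python
def label_line_to_box(line, img_width, img_height):
    """Return (cls, x1, y1, x2, y2) in pixels for one label line.

    Assumed layouts (not confirmed by the thread):
      bbox:    class x_center y_center width height
      polygon: class x1 y1 x2 y2 x3 y3 ...   (three or more vertices)
    All coordinates are assumed to be normalized to the 0..1 range.
    """
    fields = line.split()
    cls = fields[0]
    values = [float(v) for v in fields[1:]]

    if len(values) == 4:
        # Plain bbox: convert center/size to the two corner coordinates.
        xc, yc, w, h = values
        xs = [xc - w / 2, xc + w / 2]
        ys = [yc - h / 2, yc + h / 2]
    elif len(values) >= 6 and len(values) % 2 == 0:
        # Polygon: even positions are x values, odd positions are y values.
        xs = values[0::2]
        ys = values[1::2]
    else:
        raise ValueError(f"unexpected number of values: {len(values)}")

    # Scale to pixels; round() is used here instead of int() truncation.
    x1, x2 = round(min(xs) * img_width), round(max(xs) * img_width)
    y1, y2 = round(min(ys) * img_height), round(max(ys) * img_height)
    return cls, x1, y1, x2, y2


if __name__ == "__main__":
    # A bbox line and a triangle polygon line on an 80x160 image.
    print(label_line_to_box("0 0.5 0.5 0.25 0.5", 80, 160))
    print(label_line_to_box("1 0.25 0.25 0.75 0.25 0.5 0.75", 80, 160))
```

The returned corners can be passed to cv2.rectangle in the script above in place of the x1, y1, x2, y2 values computed from the first fields.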

HamidrezaZarrabi commented 9 months ago

Hi, have you solved this problem? I labeled my images with another tool from here. It was OK and easy.

WongKinYiu / yolov7

Data augmentation issue #1442