ultralytics / yolov3

YOLOv3 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0
10.1k stars 3.44k forks source link

mistake happen when combine mosaic and rotation #629

Closed mozpp closed 4 years ago

mozpp commented 4 years ago

Mistake happen when combine mosaic and rotation 图片

mozpp commented 4 years ago

@glenn-jocher, I add code below into function load_mosaic, this may fix my situation, not sure for all cases.

# # reject warped points outside of image
        labels4_tmp=labels4.copy()
        area0 = (labels4[:, 3] - labels4[:, 1]) * (labels4[:, 4] - labels4[:, 2])
        labels4_tmp[:, [2, 4]] = labels4_tmp[:, [2, 4]].clip(0, s)
        labels4_tmp[:, [1, 3]] = labels4_tmp[:, [1, 3]].clip(0, s)
        w = labels4_tmp[:, 3] - labels4_tmp[:, 1]
        h = labels4_tmp[:, 4] - labels4_tmp[:, 2]
        area = w * h

        ar = np.maximum(w / (h + 1e-16), h / (w + 1e-16))
        i = (w > 2) & (h > 2) & (area / (area0 + 1e-16) > 0.1) & (ar < 10)
        # i = (w > 2) & (h > 2) & (ar < 10)

        labels4 = labels4_tmp[i]

https://github.com/mozpp/yolov3/blob/b1f08dc2b4da3802669de3f8e129b7fe9d2b8c2e/utils/datasets.py#L612 图片

glenn-jocher commented 4 years ago

@mozpp thanks for the info! Where are you augmenting your mosaics? We have two augment locations, both of which we've disabled mostly. The first one is responsible for clipping boxes on the base grey image, the second one that is commented would be responsible for clipping on the mosaic itself.

We were not successful in using rotation to increase mAP on COCO. Does it improve results on your own custom dataset?

https://github.com/ultralytics/yolov3/blob/9c716a39c369cafd7cbe50ab3198c1df59749149/utils/datasets.py#L450-L458

https://github.com/ultralytics/yolov3/blob/9c716a39c369cafd7cbe50ab3198c1df59749149/utils/datasets.py#L596-L602

mozpp commented 4 years ago

@mozpp thanks for the info! Where are you augmenting your mosaics? We have two augment locations, both of which we've disabled mostly. The first one is responsible for clipping boxes on the base grey image, the second one that is commented would be responsible for clipping on the mosaic itself.

We were not successful in using rotation to increase mAP on COCO. Does it improve results on your own custom dataset?

https://github.com/ultralytics/yolov3/blob/9c716a39c369cafd7cbe50ab3198c1df59749149/utils/datasets.py#L450-L458

https://github.com/ultralytics/yolov3/blob/9c716a39c369cafd7cbe50ab3198c1df59749149/utils/datasets.py#L596-L602

Thanks for reply, my code is below: https://github.com/mozpp/yolov3/blob/b1f08dc2b4da3802669de3f8e129b7fe9d2b8c2e/utils/datasets.py#L615

def load_mosaic(self, index):
    # loads images in a mosaic

    labels4 = []
    s = self.img_size
    xc, yc = [int(random.uniform(s * 0.5, s * 1.5)) for _ in range(2)]  # mosaic center x, y
    img4 = np.zeros((s * 2, s * 2, 3), dtype=np.uint8) + 128  # base image with 4 tiles
    indices = [index] + [random.randint(0, len(self.labels) - 1) for _ in range(3)]  # 3 additional image indices
    for i, index in enumerate(indices):
        # Load image
        img = load_image(self, index)
        h, w, _ = img.shape

        # place img in img4
        if i == 0:  # top left
            x1a, y1a, x2a, y2a = max(xc - w, 0), max(yc - h, 0), xc, yc  # xmin, ymin, xmax, ymax (large image)
            x1b, y1b, x2b, y2b = w - (x2a - x1a), h - (y2a - y1a), w, h  # xmin, ymin, xmax, ymax (small image)
        elif i == 1:  # top right
            x1a, y1a, x2a, y2a = xc, max(yc - h, 0), min(xc + w, s * 2), yc
            x1b, y1b, x2b, y2b = 0, h - (y2a - y1a), min(w, x2a - x1a), h
        elif i == 2:  # bottom left
            x1a, y1a, x2a, y2a = max(xc - w, 0), yc, xc, min(s * 2, yc + h)
            x1b, y1b, x2b, y2b = w - (x2a - x1a), 0, max(xc, w), min(y2a - y1a, h)
        elif i == 3:  # bottom right
            x1a, y1a, x2a, y2a = xc, yc, min(xc + w, s * 2), min(s * 2, yc + h)
            x1b, y1b, x2b, y2b = 0, 0, min(w, x2a - x1a), min(y2a - y1a, h)

        img4[y1a:y2a, x1a:x2a] = img[y1b:y2b, x1b:x2b]  # img4[ymin:ymax, xmin:xmax]
        padw = x1a - x1b
        padh = y1a - y1b

        # Load labels
        label_path = self.label_files[index]
        if os.path.isfile(label_path):
            x = self.labels[index]
            if x is None:  # labels not preloaded
                with open(label_path, 'r') as f:
                    x = np.array([x.split() for x in f.read().splitlines()], dtype=np.float32)

            # labels = []  # fix issue #548
            if x.size > 0:
                # Normalized xywh to pixel xyxy format
                labels = x.copy()
                labels[:, 1] = w * (x[:, 1] - x[:, 3] / 2) + padw
                labels[:, 2] = h * (x[:, 2] - x[:, 4] / 2) + padh
                labels[:, 3] = w * (x[:, 1] + x[:, 3] / 2) + padw
                labels[:, 4] = h * (x[:, 2] + x[:, 4] / 2) + padh

                labels4.append(labels) # fix issue #548
    if len(labels4):
        labels4 = np.concatenate(labels4, 0)

    # hyp = self.hyp
    # img4, labels4 = random_affine(img4, labels4,
    #                               degrees=hyp['degrees'],
    #                               translate=hyp['translate'],
    #                               scale=hyp['scale'],
    #                               shear=hyp['shear'])

    # Center crop
    a = s // 2
    img4 = img4[a:a + s, a:a + s]
    if len(labels4):
        labels4[:, 1:] -= a

        # # reject warped points outside of image
        labels4_tmp=labels4.copy()
        area0 = (labels4[:, 3] - labels4[:, 1]) * (labels4[:, 4] - labels4[:, 2])
        labels4_tmp[:, [2, 4]] = labels4_tmp[:, [2, 4]].clip(0, s)
        labels4_tmp[:, [1, 3]] = labels4_tmp[:, [1, 3]].clip(0, s)
        w = labels4_tmp[:, 3] - labels4_tmp[:, 1]
        h = labels4_tmp[:, 4] - labels4_tmp[:, 2]
        area = w * h

        ar = np.maximum(w / (h + 1e-16), h / (w + 1e-16))
        i = (w > 2) & (h > 2) & (area / (area0 + 1e-16) > 0.1) & (ar < 10)
        # i = (w > 2) & (h > 2) & (ar < 10)

        labels4 = labels4_tmp[i]

    return img4, labels4

I will report the ablation study after training.

glenn-jocher commented 4 years ago

Great, thanks! Yes please do report study results.

glenn-jocher commented 4 years ago

@mozpp I've fixed this officially now in https://github.com/ultralytics/yolov3/commit/cba3120ca6171c9e470cbb4ca857a12aafb31b7e

Mosaic images (with increased rotation and translation hyps) look like this now: train_batch0