rykov8 / ssd_keras

Port of Single Shot MultiBox Detector to Keras
MIT License
1.1k stars 553 forks source link

Bug in random cropping? #39

Closed juntingzh closed 7 years ago

juntingzh commented 7 years ago

In the ssd_training.ipynb, I found the following code in the function of random_sized_crop seems problematic:

if (x_rel < cx < x_rel + w_rel and y_rel < cy < y_rel + h_rel):
                    xmin = (box[0] - x) / w_rel
                    ymin = (box[1] - y) / h_rel
                    xmax = (box[2] - x) / w_rel
                    ymax = (box[3] - y) / h_rel

Since the coordinates are box[:] are relative coordinates, I think these lines should be

if (x_rel < cx < x_rel + w_rel and y_rel < cy < y_rel + h_rel):
                    xmin = (box[0] - x_rel) / w_rel
                    ymin = (box[1] - y_rel) / h_rel
                    xmax = (box[2] - x_rel) / w_rel
                    ymax = (box[3] - y_rel) / h_rel
rykov8 commented 7 years ago

@juntingzh yes, probably, you're right. I know that methodrandom_sized_crop has bugs. I hope to fix it as soon as possible. I have the working version of this method on a server, but don't have access to it now.

chenwgen commented 7 years ago

In the class Generator, what does self.crop_attempts do?

rykov8 commented 7 years ago

@juntingzh @chenwgen here is the correct implementation, I believe:

def random_sized_crop(self, img, targets):
        img_w = img.shape[1]
        img_h = img.shape[0]
        img_area = img_w * img_h
        random_scale = np.random.random()
        random_scale *= (self.crop_area_range[1] -
                         self.crop_area_range[0])
        random_scale += self.crop_area_range[0]
        target_area = random_scale * img_area
        random_ratio = np.random.random()
        random_ratio *= (self.aspect_ratio_range[1] -
                         self.aspect_ratio_range[0])
        random_ratio += self.aspect_ratio_range[0]
        w = np.round(np.sqrt(target_area * random_ratio))     
        h = np.round(np.sqrt(target_area / random_ratio))
        if np.random.random() < 0.5:
            w, h = h, w
        w = min(w, img_w)
        w_rel = w / img_w
        w = int(w)
        h = min(h, img_h)
        h_rel = h / img_h
        h = int(h)
        x = np.random.random() * (img_w - w)
        x_rel = x / img_w
        x = int(x)
        y = np.random.random() * (img_h - h)
        y_rel = y / img_h
        y = int(y)
        img = img[y:y+h, x:x+w]
        new_targets = []
        for box in targets:
            cx = 0.5 * (box[0] + box[2])
            cy = 0.5 * (box[1] + box[3])
            if (x_rel < cx < x_rel + w_rel and
                y_rel < cy < y_rel + h_rel):
                xmin = (box[0] - x_rel) / w_rel
                ymin = (box[1] - y_rel) / h_rel
                xmax = (box[2] - x_rel) / w_rel
                ymax = (box[3] - y_rel) / h_rel
                xmin = max(0, xmin)
                ymin = max(0, ymin)
                xmax = min(1, xmax)
                ymax = min(1, ymax)
                box[:4] = [xmin, ymin, xmax, ymax]
                new_targets.append(box)
        new_targets = np.asarray(new_targets).reshape(-1, targets.shape[1])
        return img, new_targets