parap1uie-s / Keras-RFCN

RFCN implementation based on Keras and TensorFlow
MIT License
72 stars, 23 forks

Training on a simple (shapes) dataset #7

Open gaborvecsei opened 6 years ago

gaborvecsei commented 6 years ago


I tried to train on the included dataset.

(The code is from the MaskRCNN repo; I only added the load_bbox function.)

The problem is that during training my losses decrease steadily (on both the training and validation sets), but when I test the model, the output is disappointing. On a dataset this simple, it should overfit pretty fast.

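import math
import random

import cv2
import numpy as np

# Dataset and Utils come from the repository's own modules
# (their import lines were not part of the original post).
class ShapesDataset(Dataset):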
    def load_shapes(self, count, height, width):
        self.add_class("shapes", 1, "square")
        self.add_class("shapes", 2, "circle")
        self.add_class("shapes", 3, "triangle")

        for i in range(count):
            bg_color, shapes = self.random_image(height, width)
            self.add_image("shapes", image_id=i, path=None,
                           width=width, height=height,
                           bg_color=bg_color, shapes=shapes)

    def load_image(self, image_id):
        info = self.image_info[image_id]
        bg_color = np.array(info['bg_color']).reshape([1, 1, 3])
        image = np.ones([info['height'], info['width'], 3], dtype=np.uint8)
        image = image * bg_color.astype(np.uint8)
        for shape, color, dims in info['shapes']:
            image = self.draw_shape(image, shape, dims, color)
        return image

    def image_reference(self, image_id):
        info = self.image_info[image_id]
        if info["source"] == "shapes":
            return info["shapes"]
        # Fall back to the base class for images from other sources.
        return super().image_reference(image_id)

    def get_keys(self, d, value):
        return [k for k, v in d.items() if v == value]

    def load_bbox(self, image_id):
        info = self.image_info[image_id]
        shapes = info['shapes']
        count = len(shapes)
        mask = np.zeros([info['height'], info['width'], count], dtype=np.uint8)
        for i, (shape, _, dims) in enumerate(info['shapes']):
            mask[:, :, i:i + 1] = self.draw_shape(mask[:, :, i:i + 1].copy(),
                                                  shape, dims, 1)

        bboxes = []
        for i in range(mask.shape[2]):
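            # findContours returns 3 values in OpenCV 3.x and 2 in 4.x;
            # the contour list is the second-to-last element in both.
            cnts = cv2.findContours(mask[:, :, i] * 255, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2]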
            cnt = sorted(cnts, key=cv2.contourArea, reverse=True)[0]

            x, y, w, h = cv2.boundingRect(cnt)
            # The right format is:
            # (y1, x1, y2, x2)
            bboxes.append([y, x, y + h, x + w])

        class_ids = np.array([self.class_names.index(s[0]) for s in shapes])

        if len(class_ids) != len(bboxes):
            raise ValueError("Class ids are not equal with num of bboxes")

        return np.array(bboxes), np.array(class_ids)

    def draw_shape(self, image, shape, dims, color):
        x, y, s = dims
        if shape == 'square':
            image = cv2.rectangle(image, (x - s, y - s),
                                  (x + s, y + s), color, -1)
        elif shape == "circle":
            image = cv2.circle(image, (x, y), s, color, -1)
        elif shape == "triangle":
            points = np.array([[(x, y - s),
                                (x - s / math.sin(math.radians(60)), y + s),
                                (x + s / math.sin(math.radians(60)), y + s),
                                ]], dtype=np.int32)
            image = cv2.fillPoly(image, points, color)
        return image

    def random_shape(self, height, width):
        shape = random.choice(["square", "circle", "triangle"])
        color = tuple([random.randint(0, 255) for _ in range(3)])
        buffer = 20
        y = random.randint(buffer, height - buffer - 1)
        x = random.randint(buffer, width - buffer - 1)
        s = random.randint(buffer, height // 4)
        return shape, color, (x, y, s)

    def random_image(self, height, width):
        # bg_color = np.array([random.randint(0, 255) for _ in range(3)])
        bg_color = np.array([0, 0, 0], dtype=np.uint8)
        shapes = []
        boxes = []
        N = random.randint(1, 2)
        for _ in range(N):
            shape, color, dims = self.random_shape(height, width)
            shapes.append((shape, color, dims))
            x, y, s = dims
            boxes.append([y - s, x - s, y + s, x + s])
        keep_ixs = Utils.non_max_suppression(
            np.array(boxes), np.arange(N), 0.3)
        shapes = [s for i, s in enumerate(shapes) if i in keep_ixs]
        return bg_color, shapes
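
In random_image, the call to Utils.non_max_suppression drops shapes whose boxes overlap too much, and because the scores passed in are np.arange(N), later shapes win over earlier ones. The following is a minimal pure-Python sketch of greedy non-maximum suppression that illustrates that behavior. It is not the repository's implementation; the names iou and greedy_nms are hypothetical, and boxes are assumed to be in (y1, x1, y2, x2) format.

```python
def iou(a, b):
    """Intersection over union of two boxes in (y1, x1, y2, x2) format."""
    inter_h = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, threshold):
    """Return indices of kept boxes, highest score first.

    A box is kept only if its IoU with every already-kept
    (higher-scoring) box is at most `threshold`.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= threshold for k in keep):
            keep.append(i)
    return keep


# Three boxes; the first two overlap heavily, the third is far away.
boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
scores = [0, 1, 2]  # mirrors np.arange(N): later shapes score higher
kept = greedy_nms(boxes, scores, 0.3)
print(kept)
```

With a threshold of 0.3, the first box is dropped because it overlaps the second, higher-scoring box, so each generated image ends up with at most lightly overlapping shapes.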


All the boxes here had a confidence of 0.9 or above.

Do you have any idea what causes this? Could you try it out, so that we would have a simple "tutorial", unlike the one with the fashion dataset?

parap1uie-s commented 6 years ago

Hi @gaborvecsei, it seems that your implementation has no obvious errors, so I will check it out in the next few days. Right now I'm tied up with other projects, so I can't give you any useful advice yet. I'll leave this issue open until then. Sorry.

gaborvecsei commented 6 years ago

Hi @parap1uie-s, thank you for your response! I am looking forward to hearing your thoughts on this issue. I hope we can find a solution. Btw, thank you very much for your effort on this implementation!

gaborvecsei commented 6 years ago

I had a little time and I tried to find out what went wrong:

Btw, I have not made these modifications here. Instead, I took the original MaskRCNN implementation, removed the mask layer, and inserted the score maps and vote layers.
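
For readers unfamiliar with the score maps and vote layers mentioned above, here is a rough pure-Python sketch of the general R-FCN idea: each class has k x k position-sensitive score maps, each RoI bin is average-pooled from its own map, and the bins then vote by averaging. This is an illustration under stated assumptions, not the code from this repository or from the modified MaskRCNN. The function name ps_roi_vote, the nested-list layout, and the bin-rounding scheme are all hypothetical.

```python
import math


def ps_roi_vote(score_maps, roi, k):
    """Position-sensitive RoI pooling followed by average voting.

    score_maps[c][i * k + j] is a 2-D list (rows x cols) for class c and
    spatial bin (i, j). roi is (y1, x1, y2, x2) in feature-map pixels,
    with y2 and x2 exclusive and the RoI fully inside the map.
    Returns one voted score per class.
    """
    y1, x1, y2, x2 = roi
    bin_h = (y2 - y1) / k
    bin_w = (x2 - x1) / k
    class_scores = []
    for maps_for_class in score_maps:
        bin_means = []
        for i in range(k):
            for j in range(k):
                # Pixel range of bin (i, j); every bin covers at least one pixel.
                r0 = y1 + math.floor(i * bin_h)
                r1 = max(y1 + math.ceil((i + 1) * bin_h), r0 + 1)
                c0 = x1 + math.floor(j * bin_w)
                c1 = max(x1 + math.ceil((j + 1) * bin_w), c0 + 1)
                # Bin (i, j) reads only from its own dedicated score map.
                fmap = maps_for_class[i * k + j]
                values = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
                bin_means.append(sum(values) / len(values))
        # Voting: average the k * k bin responses.
        class_scores.append(sum(bin_means) / len(bin_means))
    return class_scores


def constant_map(value, size=4):
    """Helper for the demo: a size x size map filled with one value."""
    return [[value] * size for _ in range(size)]


k = 2
score_maps = [
    [constant_map(0.0) for _ in range(k * k)],  # class 0
    [constant_map(1.0) for _ in range(k * k)],  # class 1
]
scores = ps_roi_vote(score_maps, (0, 0, 4, 4), k)
print(scores)
```

In a full detector the voted scores would typically go through a softmax over classes, and a parallel set of score maps would handle box regression. A mismatch between the bin index used at pooling time and the channel layout of the score maps is one thing worth checking when losses fall but detections look wrong.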

Unfortunately the results are still not the best.


parap1uie-s commented 5 years ago

Hi, @gaborvecsei

Sorry for the late reply, again.

I will check the models in a few days to fix bugs and improve performance.