yjh0410 / CenterNet-plus

A Simple Baseline for Object Detection

Error in create_gt script #8

Closed YashRunwal closed 3 years ago

YashRunwal commented 3 years ago

Hello,

My input image (grayscale) has size (512, 1536). I created my own dataset class. The targets it returns are:

targets = array([[ 686,  171, 1190,  369,    4], [ 156,  176,  639,  349,    4], [1031,  174, 1232,  279,    4]])
boxes = targets[:, :4]
labels = targets[:, 4]

As in the train.py script, I selected a stride of 4 for the gt_creator function: gt_creator((512, 1536), stride=4, num_classes=8, dataset[100])

But it gives me the following error:

gt_tensor[grid_y, grid_x, cls_id] = 1.0
IndexError: index 34560 is out of bounds for axis 0 with size 128

Therefore I tried to debug this by printing the results of the generate_txtytwth function and they are: (360192, 34560, 0.0, 0.0, 12.173218820659095, 10.140297294614152, -99790.0, 12902, 12902, 686, 171, 1190, 369)

However, the gt_tensor variable inside the gt_creator function has shape: (128, 384, 17)

Any idea as to why this is happening? What needs to be changed? I would prefer not to use any augmentation techniques.

yjh0410 commented 3 years ago

In my project, an input image is resized to a square image with equal width and height, so the "img_size" parameter of the gt_creator function is an int, not a list. I notice you have already fixed this, though.

As for the error, I think you forgot to normalize your bboxes by img_size. It is common practice to normalize [x1, y1, x2, y2] or [cx, cy, w, h] to the range [0, 1]. You probably did not do this, so you got grid_x and grid_y values far larger than the grid size of [128, 384].
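The values printed earlier in this thread (grid_x = 360192, grid_y = 34560) are consistent with the box center being multiplied by the image size and then divided by the stride. The sketch below reproduces that arithmetic. Note that `grid_index` is a hypothetical helper written for illustration, and the formula is an assumption inferred from the printed numbers, not a copy of the repository's code.

```python
def grid_index(box, img_w, img_h, stride):
    """Hypothetical helper (assumed formula): scale a box that is expected
    to be normalized back to pixels, then map its center to a grid cell."""
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2 * img_w   # center x in pixels if the box was in [0, 1]
    cy = (y1 + y2) / 2 * img_h   # center y in pixels if the box was in [0, 1]
    return int(cx / stride), int(cy / stride)


img_h, img_w, stride = 512, 1536, 4
grid_h, grid_w = img_h // stride, img_w // stride   # 128 rows, 384 columns

# Pixel coordinates passed in directly (the case reported in this issue).
pixel_box = (686, 171, 1190, 369)
bad_x, bad_y = grid_index(pixel_box, img_w, img_h, stride)
print(bad_x, bad_y)      # 360192 34560, far outside the 384 x 128 grid

# The same box normalized to [0, 1] first.
norm_box = (686 / img_w, 171 / img_h, 1190 / img_w, 369 / img_h)
good_x, good_y = grid_index(norm_box, img_w, img_h, stride)
print(good_x, good_y)    # 234 67, inside the grid
```

With normalized input the cell indices stay below (384, 128), so an assignment like gt_tensor[grid_y, grid_x, cls_id] no longer goes out of bounds.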

YashRunwal commented 3 years ago

So I just have to divide the x values by the width and the y values by the height of the image to normalize them, right? Also, if I normalize the annotations, I don't need to use any augmentations, right?

yjh0410 commented 3 years ago

Yes. In my project, you must normalize bboxes.

You should still use augmentations during the training stage; otherwise you will get poor performance.

YashRunwal commented 3 years ago

The problem with my training images is that they are not normal RGB or Grayscale images. They are raw images and hence I still haven't figured out the augmentation techniques for those.

YashRunwal commented 3 years ago

@yjh0410 Can you perhaps explain what gt_creator does? I went through the code, but its functionality is a little difficult to understand because there aren't any comments. I would happily add comments to the code.

Also, for anyone who had doubts about normalizing bounding boxes: You can use the following function. You need to change a few things if it is a stand-alone function, i.e. not written as a class method.

    import numpy as np

    def _normalize_bbox(self, image, tgts):
        """
        Normalize bounding boxes to the range [0, 1]
        :param image: np_image: shape: (512, 1536)
        :param tgts: annotations include [[xmin, ymin, xmax, ymax, class_id]]
        :return: normalized bounding boxes, shape (num_boxes, 5)
        """
        height, width = image.shape[0], image.shape[1]  # (512, 1536)
        targets = []
        for annot in tgts:
            # x coordinates are divided by the width, y coordinates by the height
            xmin = annot[0] / width
            ymin = annot[1] / height
            xmax = annot[2] / width
            ymax = annot[3] / height
            class_id = annot[4]  # the class id is left unchanged
            targets.append([xmin, ymin, xmax, ymax, class_id])

        # No np.squeeze here: it would collapse a single box to shape (5,)
        # and break indexing such as targets[:, :4].
        return np.array(targets)
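For anyone who wants the stand-alone form mentioned above, here is a minimal vectorized sketch. The name `normalize_bbox` and the `img_shape` argument are assumptions made for this example; they are not part of the repository.

```python
import numpy as np


def normalize_bbox(img_shape, tgts):
    """Stand-alone sketch (hypothetical name and signature).

    img_shape: (height, width) of the image, e.g. (512, 1536)
    tgts: rows of [xmin, ymin, xmax, ymax, class_id] in pixel units
    Returns a float array of shape (num_boxes, 5) with coordinates in [0, 1].
    """
    height, width = img_shape[:2]
    # reshape(-1, 5) keeps a single box two-dimensional, so tgts[:, :4] still works
    tgts = np.asarray(tgts, dtype=np.float64).reshape(-1, 5)
    # Divide x values by the width and y values by the height; leave class_id as is
    scale = np.array([width, height, width, height, 1.0])
    return tgts / scale


targets = np.array([[686, 171, 1190, 369, 4],
                    [156, 176, 639, 349, 4],
                    [1031, 174, 1232, 279, 4]])
normalized = normalize_bbox((512, 1536), targets)
boxes, labels = normalized[:, :4], normalized[:, 4]
```

Dividing by a single scale row relies on NumPy broadcasting, so no explicit loop over the boxes is needed.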