google-research / scenic

Scenic: A Jax Library for Computer Vision Research and Beyond
Apache License 2.0

How to convert co-ordinates given in minimal_colab to coco bounding boxes? #1060

Closed DishantMewada closed 2 months ago

DishantMewada commented 2 months ago

The minimal colab example contains the following:

cx, cy, w, h = target_boxes[top_ind]

ax.plot(
    [cx - w / 2, cx + w / 2, cx + w / 2, cx - w / 2, cx - w / 2],
    [cy - h / 2, cy - h / 2, cy + h / 2, cy + h / 2, cy - h / 2],
    color='lime',
)

When I print cx, cy, w, h, I get [0.45544463 0.5008438 0.34489876 0.36101776].

I was wondering how the reshaping of the coordinates works. If I want the COCO bounding box coordinates, how can I convert these values to the original image size?

Thank you.

DishantMewada commented 2 months ago

Alright, I think I have figured it out.

If the images in your dataset are square:

X_MIN = (cx - w / 2) * width_of_your_image
Y_MIN = (cy - h / 2) * height_of_your_image
WIDTH = ((cx + w / 2) - (cx - w / 2)) * width_of_your_image
HEIGHT = ((cy + h / 2) - (cy - h / 2)) * height_of_your_image
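For reference, here is a minimal Python sketch of the square-image conversion above. It assumes the box is a normalized (cx, cy, w, h) tuple in the range 0 to 1 and that the target is the COCO [x_min, y_min, width, height] layout in pixels. The function name `cxcywh_to_coco` and the 640x640 image size are made up for illustration and are not part of scenic or the colab.

```python
def cxcywh_to_coco(box, image_width, image_height):
    """Convert a normalized (cx, cy, w, h) box to COCO [x_min, y_min, width, height] in pixels."""
    cx, cy, w, h = box
    x_min = (cx - w / 2) * image_width
    y_min = (cy - h / 2) * image_height
    # (cx + w / 2) - (cx - w / 2) simplifies to w, and likewise for h.
    width = w * image_width
    height = h * image_height
    return [x_min, y_min, width, height]


# The box printed in the question, applied to a hypothetical 640x640 image.
box = [0.45544463, 0.5008438, 0.34489876, 0.36101776]
print(cxcywh_to_coco(box, 640, 640))
```

Note that WIDTH and HEIGHT reduce to w and h times the image size, so the longer expressions above are not strictly needed.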

The problem arises when the image is not square: the plot then shows a grey region at the bottom of the image.

By playing around with the code

ax.plot(
    [cx - w / 2, cx + w / 2, cx + w / 2, cx - w / 2, cx - w / 2],
    [cy - h / 2, cy - h / 2, cy + h / 2, cy + h / 2, cy - h / 2],
    color='lime',
)

Mainly by changing the (cy + h / 2) value, I found out that y_max is at around 0.68.

The resolution of my images was 704x480, so:

equivalent height = 480 / 0.68 = 705.88235294117

X_MIN = (cx - w / 2) * 704
Y_MIN = (cy - h / 2) * 705.88
WIDTH = ((cx + w / 2) - (cx - w / 2)) * 704
HEIGHT = ((cy + h / 2) - (cy - h / 2)) * 705.88
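The grey region suggests the image was padded before inference. Below is a hedged sketch of the non-square case, assuming (this is not confirmed in this thread) that the image is padded on the bottom and right to a square whose side is max(width, height), and that the boxes are normalized to that padded square. Under that assumption the bottom of a 704x480 image sits at 480 / 704 ≈ 0.68, which matches the estimate above, and the scale factor for both axes would be 704. The function name `padded_cxcywh_to_coco` is hypothetical.

```python
def padded_cxcywh_to_coco(box, image_width, image_height):
    """Map a box normalized to a padded square back to COCO pixels on the original image.

    Assumption: the image was padded on the bottom/right to a square of side
    max(image_width, image_height) before the boxes were predicted.
    """
    side = max(image_width, image_height)
    cx, cy, w, h = box
    x_min = (cx - w / 2) * side
    y_min = (cy - h / 2) * side
    x_max = (cx + w / 2) * side
    y_max = (cy + h / 2) * side
    # Clip to the original image so the box never extends into the padding.
    x_min, y_min = max(x_min, 0.0), max(y_min, 0.0)
    x_max, y_max = min(x_max, image_width), min(y_max, image_height)
    return [x_min, y_min, x_max - x_min, y_max - y_min]


# The box from the question on a 704x480 image.
box = [0.45544463, 0.5008438, 0.34489876, 0.36101776]
print(padded_cxcywh_to_coco(box, 704, 480))
```

If the padding in your pipeline works differently (for example, centered padding or a fixed input size with resizing), the scale factor and offsets would need to change accordingly.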

I am closing the issue. If I am doing something wrong, please let me know and reopen the issue.

Thanks.