Can the YOLO model be modified to predict 4 coordinates of corners of the bounding box around the object?

pjreddie / darknet

Convolutional Neural Networks

http://pjreddie.com/darknet/

Can the YOLO model be modified to predict 4 coordinates of corners of the bounding box around the object? #1864

Open kavyajeetbora opened 4 years ago

kavyajeetbora commented 4 years ago

I want to build a model that predicts the coordinates of all 4 corners of a bounding box, because in my training dataset the objects are bounded by polygons at different orientations instead of horizontal rectangular boxes. This is what a box looks like:

(x1,y1)_____(x2,y2)

(x4,y4)_____(x3,y3)

So, to do that, can the model be modified to predict the 8 coordinates (x1,x2,x3,x4,y1,y2,y3,y4)?

I have tried training one using the Tiny YOLO CNN architecture, but I am finding it difficult to train. PS: I have normalised the coordinates to the range [0,1].
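For readers following along, the normalisation mentioned above could look roughly like the sketch below. The function name `normalize_corners` and the pixel-coordinate input are assumptions made for illustration; the target ordering (x1,x2,x3,x4,y1,y2,y3,y4) is the one given in the post.

```python
def normalize_corners(corners, img_w, img_h):
    """Scale four (x, y) pixel corners into [0, 1].

    `corners` is assumed to be ordered (x1,y1), (x2,y2), (x3,y3), (x4,y4).
    Returns the target vector in the order used in the post:
    [x1, x2, x3, x4, y1, y2, y3, y4].
    """
    xs = [x / img_w for x, _ in corners]  # divide x by image width
    ys = [y / img_h for _, y in corners]  # divide y by image height
    return xs + ys


# Hypothetical 200x100 image with an axis-aligned box, just as an example.
target = normalize_corners([(0, 0), (100, 0), (100, 50), (0, 50)], 200, 100)
print(target)
```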

jman278 commented 4 years ago

Have you been able to solve this?

kavyajeetbora commented 4 years ago

YOLO is not the right model for predicting polygons. If you want to train a model to predict irregular polygons, you can explore scene text detection models like EAST, CRAFT, PSENet, etc.

jman278 commented 4 years ago

Thanks for the response. However, none of the ones you mentioned can be applied to objects in general; they are specific to text. Are you aware of any methods that can be trained to predict, say, a quadrilateral around the object?

kavyajeetbora commented 4 years ago

You can train them with your own annotation files containing 8 coordinates. You can also go for image segmentation models.

Sgsouham commented 2 years ago

You can use the dota_to-yolo converter to convert your 8-point coordinates to a YOLO-feedable format. I did this with my dataset.
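As a rough illustration only (this is not the converter's actual code, and its output format may differ), one simple way to turn an 8-point annotation into a standard Darknet label line is to take the axis-aligned box that encloses the four corners. The function name `quad_to_yolo` is hypothetical. Note that this conversion discards the orientation of the original quadrilateral.

```python
def quad_to_yolo(corners, img_w, img_h, class_id=0):
    """Collapse four (x, y) pixel corners into a Darknet label line.

    The label format is "<class> <x_center> <y_center> <width> <height>",
    with all four box values relative to the image size.
    """
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    cx = (x_min + x_max) / 2 / img_w  # box centre, relative to width
    cy = (y_min + y_max) / 2 / img_h  # box centre, relative to height
    w = (x_max - x_min) / img_w       # box width, relative to width
    h = (y_max - y_min) / img_h       # box height, relative to height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


# Hypothetical 200x100 image with one annotated quadrilateral.
label = quad_to_yolo([(10, 20), (110, 20), (110, 70), (10, 70)], 200, 100)
print(label)  # 0 0.300000 0.450000 0.500000 0.500000
```

If the orientation matters, an oriented-box variant such as the one linked later in this thread is the more appropriate route.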

utkarsh1711 commented 1 year ago

Try this - https://github.com/feifeiwei/OBB-YOLOv3