AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Rotated Bounding Box output #2148

Open BGA3 opened 5 years ago

BGA3 commented 5 years ago

Hi. This is not really an issue or bug, but more of a request. Would it be possible to output rotated (non-axis-aligned) bounding boxes? Would this be a huge change? I have started looking into the code to implement this myself, but since I don't know the codebase, the learning curve is quite steep.

Thanks in advance.

AlexeyAB commented 5 years ago

@BGA3 Hi,

I think many changes will be needed, at least in these places:

  1. Add one more coordinate for the angle, so use filters=(classes + 6)x3 instead of filters=(classes + 5)x3 in the cfg-file.

  2. https://github.com/AlexeyAB/darknet/blob/48d461f9bda2d01dc2a83c7c2a520f61b3ea5b79/src/box.h#L18-L20

  3. In about 27 lines where the number 4 appears, change it to 5: https://github.com/AlexeyAB/darknet/blob/48d461f9bda2d01dc2a83c7c2a520f61b3ea5b79/src/yolo_layer.c

    3.1 https://github.com/AlexeyAB/darknet/blob/48d461f9bda2d01dc2a83c7c2a520f61b3ea5b79/src/yolo_layer.c#L84-L109

    3.2 https://github.com/AlexeyAB/darknet/blob/48d461f9bda2d01dc2a83c7c2a520f61b3ea5b79/src/yolo_layer.c#L156-L163

  4. https://github.com/AlexeyAB/darknet/blob/48d461f9bda2d01dc2a83c7c2a520f61b3ea5b79/src/data.c#L328-L332

    4.1 https://github.com/AlexeyAB/darknet/blob/48d461f9bda2d01dc2a83c7c2a520f61b3ea5b79/src/data.c#L385-L389

  5. https://github.com/AlexeyAB/darknet/blob/48d461f9bda2d01dc2a83c7c2a520f61b3ea5b79/src/image.c#L315-L417

  6. Use the C function void cvFillConvexPoly(CvArr* img, const CvPoint* pts, int npts, CvScalar color, int line_type=8, int shift=0 ) in the function draw_detections_cv_v3() to draw a rotated rectangle, as described here for C++: https://stackoverflow.com/questions/43342199/draw-rotated-rectangle-in-opencv-c

    https://github.com/AlexeyAB/darknet/blob/48d461f9bda2d01dc2a83c7c2a520f61b3ea5b79/src/image.c#L482-L602

  7. https://github.com/AlexeyAB/darknet/blob/48d461f9bda2d01dc2a83c7c2a520f61b3ea5b79/cfg/yolov3.cfg#L776
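The cfg change and the drawing step from the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not darknet code: the function names `yolo_filters` and `rotated_box_corners` are hypothetical, the angle is assumed to be in radians, and the box is assumed to rotate about its centre. The four corner points are what a polygon-drawing call such as cvFillConvexPoly would take as `pts`.

```python
import math


def yolo_filters(classes: int, coords: int = 5, anchors_per_layer: int = 3) -> int:
    """Filter count of the conv layer feeding a [yolo] layer.

    coords=4 (x, y, w, h) gives the usual (classes + 5) * 3;
    coords=5 (x, y, w, h, angle) gives (classes + 6) * 3.
    The extra +1 is the objectness score.
    """
    return (classes + coords + 1) * anchors_per_layer


def rotated_box_corners(cx, cy, w, h, angle):
    """Return the 4 corners of a box rotated by `angle` radians about (cx, cy).

    Assumption: a standard 2D rotation matrix is used. In image coordinates
    (y pointing down) a positive angle therefore appears clockwise.
    """
    c, s = math.cos(angle), math.sin(angle)
    # Corner offsets from the centre of the unrotated box.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in offsets]


if __name__ == "__main__":
    print(yolo_filters(classes=80, coords=4))  # axis-aligned boxes
    print(yolo_filters(classes=80, coords=5))  # boxes with an angle
    print(rotated_box_corners(100.0, 50.0, 40.0, 20.0, math.radians(30)))
```

With `angle = 0` the corners reduce to the ordinary axis-aligned box, which gives a quick sanity check when wiring this into a drawing routine.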

BGA3 commented 5 years ago

Yeah okay, I see that it is a lot of work for a person that is not (yet) familiar with the code. Damn.

Is this something that would be of interest for the repo, or do you think it is only interesting for my project?

AlexeyAB commented 5 years ago

@BGA3 It would be interesting, but so far not a priority.

BGA3 commented 5 years ago

So, I have made good progress in adding the alpha/rotation parameter. I can now train and test the network, and x, y, w, h are found perfectly (as before), but the angle is not. I feel that I am close to resolving this, but I have been stuck for days now. @AlexeyAB would you be interested in looking at my modifications (perhaps you can quickly spot the remaining issue)?

AlexeyAB commented 5 years ago

@BGA3 Hi, Yes, I can spend a little time and look at your solution.

BGA3 commented 5 years ago

Ah that sounds great, @AlexeyAB. Thank you very much. I will give you all the likes and stars I can :)

Give me a couple of days to make my modifications readable to others and to create a new branch. I will get back to you.

BGA3 commented 5 years ago

@AlexeyAB I'm ready with a commit now that I think is readable to others :) Do you need to give me special access to the repo? I am getting an "error 403" when trying to push to a new branch.

AlexeyAB commented 5 years ago

@BGA3 You can fork this repo, commit your changes there and press Pull request.

BGA3 commented 5 years ago

Ah, of course, like that. I tried branching and pushing directly to your repo using Sourcetree, but of course I don't have the rights to do that. I believe I just sent you a pull request (Y)

BGA3 commented 5 years ago

@AlexeyAB a small update: I got the rotation output working, both when I run darknet.exe directly and when I use Darknet.py from Python with the CPU build.

However, I have one issue: when I use the GPU dll from Python (Spyder/Anaconda) I get the following error:

    compute_capability = 750, cudnn_half = 1
    layer filters size input output
    0 conv 16 3 x 3 / 1 640 x 352 x 1 -> 640 x 352 x 16 0.065 BF
    1 max 2 x 2 / 2 640 x 352 x 16 -> 320 x 176 x 16 0.004 BF
    2 conv 32 3 x 3 / 1 320 x 176 x 16 -> 320 x 176 x 32 0.519 BF
    3 max 2 x 2 / 2 320 x 176 x 32 -> 160 x 88 x 32 0.002 BF
    4 conv 64 3 x 3 / 1 160 x 88 x 32 -> 160 x 88 x 64 0.519 BF
    5 max 2 x 2 / 2 160 x 88 x 64 -> 80 x 44 x 64 0.001 BF
    6 conv 128 3 x 3 / 1 80 x 44 x 64 -> 80 x 44 x 128 0.519 BF
    7 max 2 x 2 / 2 80 x 44 x 128 -> 40 x 22 x 128 0.000 BF
    8 conv 256 3 x 3 / 1 40 x 22 x 128 -> 40 x 22 x 256 0.519 BF
    9 max 2 x 2 / 2 40 x 22 x 256 -> 20 x 11 x 256 0.000 BF
    10 conv 512 3 x 3 / 1 20 x 11 x 256 -> 20 x 11 x 512 0.519 BF
    11 max 2 x 2 / 1 20 x 11 x 512 -> 20 x 11 x 512 0.000 BF
    12 conv 1024 3 x 3 / 1 20 x 11 x 512 -> 20 x 11 x1024 2.076 BF
    13 conv 256 1 x 1 / 1 20 x 11 x1024 -> 20 x 11 x 256 0.115 BF
    14 conv 512 3 x 3 / 1 20 x 11 x 256 -> 20 x 11 x 512 0.519 BF
    15 conv 21 1 x 1 / 1 20 x 11 x 512 -> 20 x 11 x 21 0.005 BF
    16 yolo
    17 route 13
    18 conv 128 1 x 1 / 1 20 x 11 x 256 -> 20 x 11 x 128 0.014 BF
    19 upsample 2x 20 x 11 x 128 -> 40 x 22 x 128
    20 route 19 8
    21 conv 256 3 x 3 / 1 40 x 22 x 384 -> 40 x 22 x 256 1.557 BF
    22 conv 21 1 x 1 / 1 40 x 22 x 256 -> 40 x 22 x 21 0.009 BF
    23 yolo
    Total BFLOPS 6.964
    Allocate additional workspace_size = 52.43 MB
    Loading weights from C:/GIT/darknet/build/darknet/backup/yolov3-tiny_rotated_last.weights...Done!
    cuDNN Error: CUDNN_STATUS_EXECUTION_FAILED: No error

What is even weirder is that I can use the GPU without problems (on the same PC) when I run darknet.exe. Would it be possible for you to have a quick look? I have just made a new commit to my pull request.

dselivanov commented 5 years ago

@BGA3 I've checked your fork and it seems you've removed the angle parameter in the last commit. I'm interested in this functionality as well. So if you can push the latest working version, I can take a look and then we can draft a PR here.

Thanks.

BGA3 commented 5 years ago

Hi @dselivanov. The thing is that I have unfortunately more or less abandoned this topic, as my project went in another direction. So at the moment I don't have time to develop this further.

dselivanov commented 5 years ago

I've tried to "hack" the darknet codebase. It is not easy: parameters are hardcoded in many files, and it will take quite a lot of time to make it work. For now I've switched to PyTorch, as it seems it will be faster to make it work there.

dexception commented 5 years ago

@AlexeyAB Can you please reply... if you have any plans to add this functionality.

BGA3 commented 5 years ago

For information, I ended up implementing rotated bounding box detection in Python/Keras using the yolov2 framework (easier because there is only one yolo-layer). And I realized that the learning curve for the darknet solution was steep, at least for my skillset.

AlexeyAB commented 5 years ago

@dexception not yet

zcrnudt commented 4 years ago

I revised the code for rotated ROI detection, but the angle is not correct. Can someone help me correct it? @AlexeyAB

dselivanov commented 4 years ago

@zcrnudt which code did you revise? What is incorrect? It is completely unclear what you are asking.

zcrnudt commented 4 years ago

I added an angle parameter to the model, extending (x,y,w,h) to (x,y,w,h,a), but the angle does not converge during training. @dselivanov

jamessmith90 commented 4 years ago

This has been pending for the last year. Any progress?

dselivanov commented 4 years ago

@jamessmith90 pending for whom? Feel free to contribute code, this is an open source project.

jamessmith90 commented 4 years ago

The fact that @BGA3 has not updated the code for rotation.

robosina commented 4 years ago

@AlexeyAB, @dselivanov I have added rotation support in detection mode and made a pull request at this link, but I would also like to implement this requested enhancement. As far as I can see from the comments, there is no fork that has made progress on it. If that is the case, I will implement rotated bounding boxes from scratch. (I think that in some tasks this makes convergence faster and may give better accuracy.)

It seems that we have to work on the labeling procedure too.
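As an illustration of what the labeling change could look like: Darknet label files hold one `class x_center y_center width height` line per object, with coordinates normalized to the image size. A minimal sketch of a parser for an extended line with a sixth field for the angle follows. The six-field format, the radian unit, and the names `RotatedLabel` and `parse_rotated_label` are assumptions made for this example; nothing in the thread fixes a format.

```python
from typing import NamedTuple


class RotatedLabel(NamedTuple):
    """One object from a hypothetical rotated-box label file."""
    class_id: int
    x: float      # box centre, normalized to image width
    y: float      # box centre, normalized to image height
    w: float      # box width, normalized
    h: float      # box height, normalized
    angle: float  # rotation (assumed here to be in radians)


def parse_rotated_label(line: str) -> RotatedLabel:
    """Parse 'class x y w h angle' (hypothetical six-field format)."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    class_id = int(fields[0])
    x, y, w, h, angle = (float(v) for v in fields[1:])
    return RotatedLabel(class_id, x, y, w, h, angle)


if __name__ == "__main__":
    print(parse_rotated_label("0 0.5 0.5 0.2 0.1 0.785"))
```

Rejecting lines with the wrong field count makes it harder to mix old five-field labels with the extended ones by accident.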

dselivanov commented 4 years ago

@robosina there is some relevant discussion here: https://github.com/AlexeyAB/darknet/issues/4360#issuecomment-561694190. In our case we have had great success with rotated boxes. We even submitted a paper to ECCV this year but got a "border reject" due to "not enough scientific novelty" (but at the same time I've seen so many bullshit papers accepted).

However, I haven't seen any serious attempts to implement it within the darknet framework. Initially (~1.5 years ago) I tried to do so but gave up, as calculations related to the b-boxes are hardcoded in quite a lot of places. I'm not that comfortable with the darknet codebase.

HFVladimir commented 3 years ago

@dselivanov Hi! I've read all the discussions here about rotated bboxes and still haven't found a working solution, so I am trying to implement it on my own. As I understand it, you have a working solution in PyTorch, as shown in #4360. Could you please share some code showing how you did that, or at least point to a public implementation I could use as a starting point? It would be very helpful for me: if I had it, I would match it against the current C code and make a PR.

dselivanov commented 3 years ago

@HFVladimir unfortunately I can't share the code as it belongs to the company. But the overall logic is described in #4360: you just need to add one extra parameter for the angle.

predicted_angle = sigmoid(raw_yolo_layer_output)
angle_loss = sin(predicted_angle - true_angle) ^ 2

We've tried both logistic and tanh as the sigmoid-type activation and haven't noticed any significant difference between them.
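A minimal runnable sketch of the two lines of pseudocode above, in plain Python. Two things here are assumptions and not stated in the thread: the sigmoid output (0, 1) is scaled to an angle range of (0, π) radians, and the helper names `predict_angle` and `angle_loss` are made up for this example. One property that does follow directly from the formula is that sin² has period π, so a box rotated by 180° gets the same loss as the unrotated box.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_angle(raw: float, angle_range: float = math.pi) -> float:
    """Squash the raw layer output and scale it to (0, angle_range).

    Assumption: the scaling to radians is not given in the thread;
    pi is chosen because a rectangle repeats after a half turn.
    """
    return sigmoid(raw) * angle_range


def angle_loss(predicted_angle: float, true_angle: float) -> float:
    """sin(predicted - true)^2: zero at a match, periodic with period pi."""
    return math.sin(predicted_angle - true_angle) ** 2


if __name__ == "__main__":
    pred = predict_angle(0.3)
    for true in (pred, pred + math.pi / 4, pred + math.pi / 2, pred + math.pi):
        print(f"true - pred = {true - pred:.3f} rad, loss = {angle_loss(pred, true):.3f}")
```

The loss is largest when the prediction is off by a quarter turn and returns to zero at a half turn, so angles near 0 and near π are not penalized as if they were far apart.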