XinzeLee / RotateObjectDetection

This repository is based on Ultralytics/yolov5, with adjustments to enable rotate prediction boxes.

Inconsistency with inference angles #13

Open heechunlee opened 2 years ago

heechunlee commented 2 years ago

Hi! I was using this project to train on a custom dataset and noticed inconsistencies between the angles implied by the cos theta and sin theta values in the output labels. In case I had made an error formatting my own labels, I ran rotate_detect.py on the UCAS-AOD images. Below is the output label for 1103.png:

[Screenshot: ScreenHunter_79, output label for 1103.png]

Taking the arccos of the cos theta value and the arcsin of the sin theta value still gives two different angles. Any idea why this is? Thanks

XinzeLee commented 2 years ago

I am not sure I get your point. The following points may help.

  1. arcsin and arccos return values in radians, not in degrees;
  2. To represent the angle (which lies within -90 to 90 degrees), two values are used, cos_theta and sin_theta. To convert them back with one function, you may consider arctan applied to the ratio sin_theta / cos_theta.
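
The conversion in point 2 can be sketched as follows. This is a minimal illustration, not code from the repository: `angle_from_cos_sin` is a hypothetical helper name, and it uses `math.atan2` (the two-argument form of arctan) so that a zero cos_theta does not cause a division by zero.

```python
import math

def angle_from_cos_sin(cos_theta: float, sin_theta: float) -> float:
    """Return the angle, in degrees, encoded by a (cos_theta, sin_theta) pair.

    Hypothetical helper for illustration; not part of the repository.
    """
    # atan2 takes the sine first and the cosine second. It uses the signs of
    # both inputs, so negative angles are preserved, and it depends only on
    # the ratio of the two values, not on their overall scale.
    return math.degrees(math.atan2(sin_theta, cos_theta))

# Round trip for a known angle of -30 degrees.
c = math.cos(math.radians(-30.0))
s = math.sin(math.radians(-30.0))
recovered = angle_from_cos_sin(c, s)
print(recovered)
```

Because only the ratio matters, the same angle comes back even if both values are scaled by a common positive factor.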

StarBlue98 commented 2 years ago

Thank you for your excellent work! I've learnt a lot from it. I want to use this model to train on my own dataset, but I'm not sure what the right label format is. You use the UCAS-AOD dataset, and the labels look like this, for example the first row of 22.txt:

0 0.255303 0.319417 0.093962 0.099477 0.493946 -0.869493

The first five numbers are class, x, y, w, h, respectively. What do the last two numbers represent? And when I train this model on my own data, should I format my labels as (x, y, w, h, re, im), or would a format like (cx, cy, w, h, angle) or (x1, y1, x2, y2, x3, y3, x4, y4) be OK?

XinzeLee commented 2 years ago

To clarify, the example label "0 0.255303 0.319417 0.093962 0.099477 0.493946 -0.869493" represents class, cx, cy, width, height, cosine_theta, sine_theta, respectively, so the last two numbers are cosine_theta and sine_theta.

For your own dataset, the label format should be consistent with the examples provided, i.e. class, cx, cy, width, height, cosine_theta, sine_theta.
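
For labels that start out as (cx, cy, w, h, angle), one way to produce a line in this layout might look like the sketch below. The helper name `make_label_line` is hypothetical, the six-decimal formatting simply mirrors the example label, and the rotation direction (clockwise or counterclockwise) is an assumption that the thread does not settle.

```python
import math

def make_label_line(cls: int, cx: float, cy: float, w: float, h: float,
                    angle_deg: float) -> str:
    """Build one label line: class cx cy width height cos_theta sin_theta.

    Hypothetical helper for illustration. cx, cy, w, h are assumed to be
    already normalized; angle_deg is assumed to lie within -90 to 90.
    """
    theta = math.radians(angle_deg)
    fields = [cx, cy, w, h, math.cos(theta), math.sin(theta)]
    # Six decimal places, space separated, as in the example label.
    return " ".join([str(cls)] + [f"{value:.6f}" for value in fields])

line = make_label_line(0, 0.5, 0.5, 0.2, 0.1, 0.0)
print(line)
```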

heechunlee commented 2 years ago

Sorry for not getting back on this sooner. What I meant to say was: since cos_theta and sin_theta are computed from the same theta value, wouldn't applying arccos to cos_theta and arcsin to sin_theta return the same angle? In the case of the fourth line of the 1103.png label ("0 0.108594 ... 0.989707 0.073235"), arccos(0.989707) = 0.1436 rad while arcsin(0.073235) = 0.0733 rad. This is a difference of a little over 4 degrees, so I was wondering where this inconsistency originates. Please let me know if there are problems with my understanding, thanks!
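
A quick check on the two quoted numbers shows that they do not lie exactly on the unit circle, which is enough on its own to make arccos and arcsin disagree. Whether the model predicts the two values independently without normalizing them is an assumption here; the thread does not confirm it. The sketch below only does the arithmetic, and all variable names are illustrative.

```python
import math

# Values quoted from the fourth line of the 1103.png label.
cos_theta, sin_theta = 0.989707, 0.073235

# For a true (cos, sin) pair this length would be exactly 1.
norm = math.hypot(cos_theta, sin_theta)
print("norm:", norm)

# On the raw values the two inverse functions disagree.
theta_acos = math.degrees(math.acos(cos_theta))
theta_asin = math.degrees(math.asin(sin_theta))
print("raw acos / asin (deg):", theta_acos, theta_asin)

# After rescaling the pair to unit length they agree with each other
# and with atan2, which is insensitive to the scale of its inputs.
cos_n, sin_n = cos_theta / norm, sin_theta / norm
theta_acos_n = math.degrees(math.acos(cos_n))
theta_asin_n = math.degrees(math.asin(sin_n))
theta_atan2 = math.degrees(math.atan2(sin_theta, cos_theta))
print("normalized acos / asin / atan2 (deg):",
      theta_acos_n, theta_asin_n, theta_atan2)
```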

StarBlue98 commented 2 years ago

@heechunlee

A few days ago, I had a similar confusion. I hope the following code helps.

theta = math.acos(cos) * 180 / np.pi

To compute theta: math.acos(cos) * 180 / np.pi

To compute sin: math.sin(theta * np.pi / 180)

By the way, I want to use my own labels such as (445.8136 222.3948 120.8183 123.9594), which are cx, cy, w, h in pixels. How should I normalize them?
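
The thread does not answer this question. Since the repository is based on YOLOv5, a reasonable guess is the usual YOLO convention: divide cx and w by the image width, and cy and h by the image height. The sketch below assumes that convention; `normalize_box` is a hypothetical helper, and the image size used in the example is made up because the thread does not give one.

```python
def normalize_box(cx: float, cy: float, w: float, h: float,
                  img_w: float, img_h: float) -> tuple:
    """Scale a pixel-space (cx, cy, w, h) box into the 0 to 1 range.

    Assumes YOLO-style normalization: horizontal quantities are divided
    by the image width, vertical quantities by the image height.
    """
    return (cx / img_w, cy / img_h, w / img_w, h / img_h)

# Box from the question above; the 1280 x 659 image size is an assumption.
box = normalize_box(445.8136, 222.3948, 120.8183, 123.9594, 1280, 659)
print(" ".join(f"{value:.6f}" for value in box))
```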