Open Rizhiy opened 6 years ago
Did you try it?
I haven't, since I don't quite understand the whole codebase and it appears that quite a bit would have to be changed. In particular, it appears that the Cython code expects boxes in the current format, and I don't have access to the Cython source to change it.
It appears that this format was chosen in the fast-rcnn pytorch implementation or maybe even before, so it would probably be difficult to change now. I don't think that training accuracy will be affected that much, but it may matter if you are trying to win a competition.
Yep, unfortunately this code is difficult to understand and modify. While I was looking for some localization models in pytorch, I found this repo: https://github.com/amdegroot/ssd.pytorch. The model works nicely and the codebase is way easier to understand.
It seems that the ssd.pytorch models use x1,y1,x2,y2 format as well: https://github.com/amdegroot/ssd.pytorch/blob/master/data/voc0712.py#L81
I found the Cython source, so I might try to change it later.
It seems that the network uses x1,y1,x2,y2 format for bounding boxes instead of the x,y,w,h format used in the paper. I think this is a pretty major difference that can affect training accuracy. In x,y,w,h format, two coordinates are used for centering and two for size, which presents clear separation and can be debugged easily. In the current format, all four coordinates are used for both centering and size, which makes it more difficult to debug.
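
For anyone who wants to compare the two formats while debugging, here is a minimal sketch of the conversion in plain Python. It is not code from this repo: the function names `corners_to_center` and `center_to_corners` are made up for illustration, and it assumes that x,y means the box center and that coordinates are continuous (no +1 pixel offset when computing width and height, which some codebases add).

```python
from typing import Tuple

# A box is four floats; which four depends on the format.
Box = Tuple[float, float, float, float]


def corners_to_center(box: Box) -> Box:
    """Convert (x1, y1, x2, y2) corners to (cx, cy, w, h).

    Assumption: continuous coordinates, so w = x2 - x1 with no +1 offset.
    """
    x1, y1, x2, y2 = box
    w = x2 - x1
    h = y2 - y1
    # The center is the top-left corner shifted by half the size.
    return (x1 + w / 2.0, y1 + h / 2.0, w, h)


def center_to_corners(box: Box) -> Box:
    """Convert (cx, cy, w, h) back to (x1, y1, x2, y2) corners."""
    cx, cy, w, h = box
    half_w = w / 2.0
    half_h = h / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


if __name__ == "__main__":
    corners = (10.0, 20.0, 50.0, 80.0)
    center = corners_to_center(corners)
    print(center)
    print(center_to_corners(center))
```

Since the two formats carry the same information, a conversion like this can sit at the input or output boundary of the model without touching the Cython code, at least for inspecting values.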