longcw / yolo2-pytorch

YOLOv2 in PyTorch

Coordinate system mismatch? #43

Open TrentWeiss opened 6 years ago

TrentWeiss commented 6 years ago

The instructions for using custom datasets say:

"The four values in each row should correspond to x_bottom_left, y_bottom_left, x_top_right, and y_top_right"

However, the tags in the VOC dataset appear to be

(xmin, ymin) -> top left of the object
(xmax, ymax) -> bottom right of the object

And the way this data is read in appears to leave that convention unchanged. https://github.com/longcw/yolo2-pytorch/blob/master/datasets/pascal_voc.py#L156
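To make the convention concrete, here is a minimal sketch that reads a box from a VOC-style annotation without reordering anything. The XML snippet and the `read_boxes` helper are made up for illustration and are not taken from the repository. It assumes the usual image coordinate system, with the origin at the top-left corner and y growing downward.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written annotation in the Pascal VOC XML layout.
SAMPLE_XML = """
<annotation>
  <size><width>500</width><height>375</height></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>
"""


def read_boxes(xml_text):
    """Return one [xmin, ymin, xmax, ymax] list per object, order unchanged."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        boxes.append(
            [int(bndbox.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")]
        )
    return boxes


boxes = read_boxes(SAMPLE_XML)
x1, y1, x2, y2 = boxes[0]

# With y growing downward, the smaller y value is the edge closer to the top
# of the image, so (x1, y1) is the top-left corner and (x2, y2) is the
# bottom-right corner.
print("top-left:", (x1, y1), "bottom-right:", (x2, y2))
```

Under that assumption, a loader that keeps the values in this order is producing top-left / bottom-right boxes, not bottom-left / top-right ones.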

Am I missing something?

ds2268 commented 6 years ago

I can say that you are correct :) I also strongly suggest using the PyTorch Dataset class for loading the data. It is easier and nicer; special care is needed only for the custom collate function, because the number of boxes per image varies from image to image and the default collate function cannot handle that.

[ left_top_x, left_top_y, right_bottom_x, right_bottom_y ]
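A rough sketch of the kind of custom collate function mentioned above, assuming each dataset item is an `(image, boxes)` pair with boxes in the order just listed. The name `detection_collate` and the toy samples are hypothetical; strings stand in for image tensors so the snippet runs without PyTorch installed.

```python
def detection_collate(batch):
    """Collate (image, boxes) samples without stacking the boxes.

    Each image can have a different number of boxes, so the boxes are kept
    as a plain list with one entry per image instead of being stacked into
    a single array.
    """
    images, boxes = zip(*batch)
    return list(images), list(boxes)


# Toy samples: the first image has one box, the second has two.
batch = [
    ("img0", [[48, 240, 195, 371]]),
    ("img1", [[8, 12, 352, 498], [10, 20, 30, 40]]),
]

images, boxes = detection_collate(batch)
print(len(images), [len(b) for b in boxes])
```

A function like this can be passed to `torch.utils.data.DataLoader` through its `collate_fn` argument. If all images share the same size, they could additionally be stacked into one tensor inside the function; the boxes would still stay a list.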