all x, y, w, h scales are within 0~1 range (normalized)
VOCDataset
VOCDataset (subclass of torch.utils.data.Dataset) overrides the __getitem__ method to convert the label.txt x, y, w, h values into each scale's cell-relative coordinates (e.g. [0.9320, 0.4223, 3.0680, 2.6239] is the coordinate relative to the cell holding the first anchor box in scale 0)
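The normalized-to-cell-relative conversion can be sketched as below. `to_cell_relative` is a hypothetical helper, not the repo's code; for scale 0 (S = 13) the x offset and scaled w it produces for the coor_from_txt box used later in these notes come out as 0.9320 and 3.0680, matching the example above.

```python
def to_cell_relative(box, S):
    """Convert a normalized (x, y, w, h) box to coordinates relative to
    the S x S grid cell that contains its centre."""
    x, y, w, h = box
    cell_x, cell_y = int(S * x), int(S * y)   # which cell the centre falls in
    x_cell = S * x - cell_x                   # offset inside that cell, in [0, 1)
    y_cell = S * y - cell_y
    w_cell, h_cell = S * w, S * h             # width/height measured in cell units
    return (cell_y, cell_x), [x_cell, y_cell, w_cell, h_cell]

coor_from_txt = [0.764, 0.6069277108433735, 0.23600000000000002, 0.3042168674698795]
cell, rel = to_cell_relative(coor_from_txt, S=13)
print(cell, [round(v, 4) for v in rel])
# x offset 0.9320 and w 3.0680 reproduce the scale-0 example values above
```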
loop over expected bboxes (from the label txt file):
coor_from_txt = [0.764, 0.6069277108433735, 0.23600000000000002, 0.3042168674698795]
IoU_wh(coor_from_txt[2:4], anchors)  # IoU computed from width and height only
IoU_arg_sorted = [0, 5, 1, 4, 3, 2, 8, 7, 6]  # coor_from_txt most likely matches the first anchor box in scale0
anchor_indices = IoU_arg_sorted
highest IoU anchor box ratio = (0.28, 0.22) <-- from index 0
index 0 means --> the bbox is assigned to the first prediction (3, 13, 13, 6) && the first anchor box out of three
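The width/height IoU matching step can be sketched as follows. The nine anchor ratios here are the widely used normalized YOLOv3 anchors (an assumption, not taken from the repo), but they reproduce both the argsort order and the (0.28, 0.22) top match shown above.

```python
# Assumed: the commonly used normalized YOLOv3 anchors, 3 per scale
# (scale0 13x13, scale1 26x26, scale2 52x52), flattened into one list.
ANCHORS = [(0.28, 0.22), (0.38, 0.48), (0.90, 0.78),   # scale0
           (0.07, 0.15), (0.15, 0.11), (0.14, 0.29),   # scale1
           (0.02, 0.03), (0.04, 0.07), (0.08, 0.06)]   # scale2

def iou_wh(box_wh, anchor_wh):
    """IoU of two boxes compared by width/height only (both centred at the origin)."""
    w1, h1 = box_wh
    w2, h2 = anchor_wh
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    return inter / union

coor_from_txt = [0.764, 0.6069277108433735, 0.23600000000000002, 0.3042168674698795]
ious = [iou_wh(coor_from_txt[2:4], a) for a in ANCHORS]
iou_arg_sorted = sorted(range(len(ious)), key=lambda i: ious[i], reverse=True)
print(iou_arg_sorted)        # [0, 5, 1, 4, 3, 2, 8, 7, 6]

best = iou_arg_sorted[0]
print(best // 3, best % 3)   # scale 0, anchor 0 within that scale
print(ANCHORS[best])         # (0.28, 0.22)
```

Dividing the flat index by 3 recovers the scale, and the remainder picks the anchor within that scale, which is how index 0 maps to the first prediction's first anchor box.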
now RESCALE the normalized label txt coordinates to the matched scale's cell-relative coordinates
(pre-defined anchors = common object w:h ratios found via K-means)
RETURN --> img: (C, W, H) && expected_bbox_info: ( (3, 13, 13, 6), (3, 26, 26, 6), (3, 52, 52, 6) )
there are 3 scales, S = (13, 26, 52)
scale0 = first prediction (3, 13, 13, 6)
scale1 = second prediction (3, 26, 26, 6)
scale2 = third prediction (3, 52, 52, 6)
3 anchor boxes per scale, each cell holding (obj_prob, x, y, w, h, class)
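A minimal sketch of how the three returned target tensors could be laid out, writing the matched bbox from the walkthrough into scale0 / anchor0. The cell indices (7, 9) and class index 11 are illustrative assumptions; the cell-relative values are the example coordinates from the top of these notes.

```python
import torch

# Assumed layout: one target tensor per scale,
# shape (num_anchors_per_scale, S, S, 6) with (obj_prob, x, y, w, h, class).
S = (13, 26, 52)
targets = [torch.zeros(3, s, s, 6) for s in S]

scale_idx, anchor_on_scale = 0, 0      # from the IoU argsort: scale 0, anchor 0
cell_y, cell_x = 7, 9                  # cell containing the bbox centre (illustrative)
x_cell, y_cell, w_cell, h_cell = 0.9320, 0.4223, 3.0680, 2.6239
class_label = 11                       # hypothetical class index

targets[scale_idx][anchor_on_scale, cell_y, cell_x, 0] = 1.0  # objectness
targets[scale_idx][anchor_on_scale, cell_y, cell_x, 1:5] = torch.tensor(
    [x_cell, y_cell, w_cell, h_cell])
targets[scale_idx][anchor_on_scale, cell_y, cell_x, 5] = class_label

print([t.shape for t in targets])
# [torch.Size([3, 13, 13, 6]), torch.Size([3, 26, 26, 6]), torch.Size([3, 52, 52, 6])]
```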
https://github.com/jl749/YOLOv3/blob/a59f9f79ab558c766a160b0b0660b0724ce015f0/yolov3/datasets/pascal_VOC.py#L69-L99