jl749 / YOLOv3

yolov3 implementation in pytorch (https://arxiv.org/pdf/1804.02767.pdf)

YOLO dataset #2

Open jl749 opened 2 years ago

jl749 commented 2 years ago

pre-defined anchors (common object width/height ratios found via K-means, normalized to the image size)

anchors = tensor([[0.2800, 0.2200],  # pre defined
                  [0.3800, 0.4800],
                  [0.9000, 0.7800],
                  [0.0700, 0.1500],
                  [0.1500, 0.1100],
                  [0.1400, 0.2900],
                  [0.0200, 0.0300],
                  [0.0400, 0.0700],
                  [0.0800, 0.0600]])

label txt file

# class, x, y, w, h
8 0.764 0.6069277108433735 0.23600000000000002 0.3042168674698795
8 0.594 0.6159638554216867 0.188 0.29819277108433734
14 0.229 0.6445783132530121 0.166 0.45180722891566266
14 0.39 0.6430722891566265 0.168 0.4307228915662651
14 0.5650000000000001 0.5918674698795181 0.154 0.41867469879518077
14 0.787 0.5963855421686747 0.166 0.3855421686746988

all x, y, w, h values are normalized to the 0~1 range
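A minimal sketch of reading such a label file, assuming one whitespace-separated "class x y w h" row per box. The Label tuple and parse_label_line helper are hypothetical names for illustration, not part of the repository.

```python
from typing import NamedTuple


class Label(NamedTuple):
    cls: int
    x: float  # box centre x, normalized
    y: float  # box centre y, normalized
    w: float  # box width, normalized
    h: float  # box height, normalized


def parse_label_line(line: str) -> Label:
    """Parse one 'class x y w h' row (hypothetical helper)."""
    cls, x, y, w, h = line.split()
    return Label(int(cls), float(x), float(y), float(w), float(h))


rows = [
    "8 0.764 0.6069277108433735 0.23600000000000002 0.3042168674698795",
    "14 0.229 0.6445783132530121 0.166 0.45180722891566266",
]
labels = [parse_label_line(r) for r in rows]

# every coordinate is expected to lie in the normalized 0~1 range
assert all(0.0 <= v <= 1.0 for lb in labels for v in lb[1:])
```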

VOCDataset

VOCDataset subclasses torch.utils.data.Dataset and overrides the __getitem__ method to convert the x, y, w, h values from label.txt into cell-relative coordinates at the appropriate scale (e.g. [0.9320, 0.4223, 3.0680, 2.6239] are coordinates relative to the first anchor box in scale 0)

RETURN --> img:(C, W, H) && expected_bbox_info:( (3, 13, 13, 6), (3, 26, 26, 6), (3, 52, 52, 6) )

loop expected bboxes (txt file):
    coor_from_txt = [0.764, 0.6069277108433735, 0.23600000000000002, 0.3042168674698795]
    IoU_wh(coor_from_txt[2:4], anchors)  # calculate IoU using width and height only
    IoU_arg_sorted = [0, 5, 1, 4, 3, 2, 8, 7, 6]  # coor_from_txt most likely to match the first anchor box in scale0
    anchor_indices = IoU_arg_sorted

    highest IoU anchor box ratio = (0.28, 0.22) <-- from index 0
    index 0 means --> anchor belongs to the first prediction (3, 13, 13, 6) && first anchor box out of three

    now RESCALE...
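The anchor-matching step in the loop above can be sketched in plain Python. This is an illustration under assumptions, not the repository's code: iou_wh is a hypothetical stand-in for IoU_wh that treats both boxes as sharing the same centre, and the nine anchors are assumed to be grouped three per scale in the order listed.

```python
ANCHORS = [
    (0.28, 0.22), (0.38, 0.48), (0.90, 0.78),  # scale 0 (13x13)
    (0.07, 0.15), (0.15, 0.11), (0.14, 0.29),  # scale 1 (26x26)
    (0.02, 0.03), (0.04, 0.07), (0.08, 0.06),  # scale 2 (52x52)
]


def iou_wh(box_wh, anchor_wh):
    """IoU of two boxes from width/height only (centres assumed aligned)."""
    inter = min(box_wh[0], anchor_wh[0]) * min(box_wh[1], anchor_wh[1])
    union = box_wh[0] * box_wh[1] + anchor_wh[0] * anchor_wh[1] - inter
    return inter / union


coor_from_txt = [0.764, 0.6069277108433735, 0.23600000000000002, 0.3042168674698795]
ious = [iou_wh(coor_from_txt[2:4], a) for a in ANCHORS]

# anchor indices sorted by descending IoU
anchor_indices = sorted(range(len(ANCHORS)), key=lambda k: ious[k], reverse=True)
print(anchor_indices)  # [0, 5, 1, 4, 3, 2, 8, 7, 6]

# map the flat anchor index to (scale, anchor within that scale)
scale_idx, anchor_on_scale = divmod(anchor_indices[0], 3)
```

Here the best match is index 0, so divmod gives scale 0 and anchor 0 on that scale, matching the walkthrough above.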

there are 3 scales, S=(13, 26, 52), with 3 anchor boxes each (obj_prob, x, y, w, h, class)
scale0 = first prediction (3, 13, 13, 6)
scale1 = second prediction (3, 26, 26, 6)
scale2 = third prediction (3, 52, 52, 6)

https://github.com/jl749/YOLOv3/blob/a59f9f79ab558c766a160b0b0660b0724ce015f0/yolov3/datasets/pascal_VOC.py#L69-L99
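A rough sketch of the rescale step for one scale, assuming the usual convention that x, y become offsets inside the grid cell containing the box centre, while w, h are expressed in cell units. The to_cell_coords helper is a hypothetical name; the linked lines may differ in detail.

```python
def to_cell_coords(x, y, w, h, S):
    """Convert normalized (x, y, w, h) to cell-relative values on an SxS grid."""
    i, j = int(S * y), int(S * x)  # row and column of the cell holding the centre
    x_cell = S * x - j             # offset inside the cell, in [0, 1)
    y_cell = S * y - i
    w_cell = w * S                 # width and height measured in cells
    h_cell = h * S
    return i, j, x_cell, y_cell, w_cell, h_cell


coor_from_txt = [0.764, 0.6069277108433735, 0.23600000000000002, 0.3042168674698795]
i, j, x_cell, y_cell, w_cell, h_cell = to_cell_coords(*coor_from_txt, S=13)
```

For this label on the 13x13 grid, the centre falls in row 7, column 9, with an x offset of about 0.932 and a width of about 3.068 cells.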

jl749 commented 2 years ago

https://github.com/jl749/YOLOv3/blob/852a785742dd6fbfad6d1d93775c119ae88c999a/yolov3/config.py#L34-L38

un-normalize the anchors by multiplying them by the grid size S of each scale (anchors --> scaled_anchors)
https://github.com/jl749/YOLOv3/blob/a59f9f79ab558c766a160b0b0660b0724ce015f0/yolov3/datasets/pascal_VOC.py#L121-L134
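A small sketch of that scaling in plain Python, assuming the nine anchors are grouped three per scale in the order S=(13, 26, 52). The repository presumably does this with tensors, so the list-based form here is only illustrative.

```python
S = (13, 26, 52)
ANCHORS = [
    (0.28, 0.22), (0.38, 0.48), (0.90, 0.78),
    (0.07, 0.15), (0.15, 0.11), (0.14, 0.29),
    (0.02, 0.03), (0.04, 0.07), (0.08, 0.06),
]

# scaled_anchors[k] holds the three (w, h) anchors of scale k, in grid-cell units
scaled_anchors = [
    [(w * s, h * s) for (w, h) in ANCHORS[3 * k: 3 * k + 3]]
    for k, s in enumerate(S)
]
```

After scaling, the first anchor of scale 0 is roughly 3.64 cells wide and 2.86 cells tall, which puts it in the same units as the cell-relative w, h of the labels.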