longcw / yolo2-pytorch

YOLOv2 in PyTorch

Box_Mask question #21

Open youssefzaky opened 7 years ago

youssefzaky commented 7 years ago

I have a question about how the targets are implemented in this project. Why do the default box targets have values of 0.5 for x and y, and 1 for width and height? And why does the box mask have a default weight of 0.01? Does this give predicted boxes that are not matched to any ground truth an incentive to simply predict the prior boxes? Is this in the original implementation?

cory8249 commented 7 years ago

I don't think this code is necessary if you follow #23. You can remove all of these default numbers and just initialize both arrays to zeros:

_boxes = np.zeros([hw, num_anchors, 4], dtype=np.float64)
_box_mask = np.zeros([hw, num_anchors, 1], dtype=np.float64)

(np.float64 is used here because the np.float alias has been removed from recent NumPy releases.)

In my experiments, with or without these default values, the mAP of the trained model is almost the same.
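To make the two variants concrete, here is a minimal sketch that builds the targets both ways. The helper name make_box_targets and the 13x13 grid with 5 anchors are assumptions for illustration, not code from this repo; the default values (0.5, 1, 0.01) are the ones described in the question above. The comment on width and height assumes the targets are scale factors relative to the anchor.

```python
import numpy as np

def make_box_targets(hw, num_anchors, use_defaults=False):
    """Hypothetical helper: build box targets and their per-box loss weights."""
    boxes = np.zeros([hw, num_anchors, 4], dtype=np.float64)
    box_mask = np.zeros([hw, num_anchors, 1], dtype=np.float64)
    if use_defaults:
        boxes[:, :, 0:2] = 0.5  # x, y target: centre of the grid cell
        boxes[:, :, 2:4] = 1.0  # w, h target: assumed to mean "anchor size unchanged"
        box_mask[:] = 0.01      # small loss weight for every box by default
    return boxes, box_mask

# Assumed sizes for illustration: a 13x13 grid with 5 anchors per cell.
zeros_boxes, zeros_mask = make_box_targets(hw=13 * 13, num_anchors=5)
default_boxes, default_mask = make_box_targets(hw=13 * 13, num_anchors=5, use_defaults=True)
```

With the zero mask, unmatched boxes contribute nothing to the box loss; with the defaults, they are weakly pulled toward the cell centre and the anchor shape.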

I think these numbers come from the original YOLO code: see delta_region_box and the scale coord_scale * (2 - truth.w*truth.h).

I'm not really sure why YOLO's author did it. net.seen is the number of images the network has seen during training, and 12800 is roughly 0.8 of an epoch (VOC07+12 has 16,651 images). Maybe it's a kind of curriculum learning, I guess.
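As a quick check of the arithmetic, and a sketch of what such a warm-up could look like: the function names below are hypothetical, and gating the default targets on seen < 12800 is an assumption about the intent, not a copy of the original code.

```python
def warmup_fraction_of_epoch(warmup_images, dataset_size):
    """Fraction of one epoch covered by the warm-up period."""
    return warmup_images / dataset_size

def use_prior_targets(seen, warmup_images=12800):
    """Assumed gating: apply the default (prior-like) targets only early in training."""
    return seen < warmup_images

frac = warmup_fraction_of_epoch(12800, 16651)
print(round(frac, 2))  # 0.77
```

Under this reading, the default targets would only matter for the first 12800 images, which would be consistent with the final mAP being almost the same with or without them.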