allanzelener / YAD2K

YAD2K: Yet Another Darknet 2 Keras

Why do you have to apply log to box_wh? #112

Open voqtuyen opened 6 years ago

voqtuyen commented 6 years ago

https://github.com/allanzelener/YAD2K/blob/a42c760ef868bc115e596b56863dc25624d2e756/yad2k/models/keras_yolo.py#L424

I see that in the yolo_head calculation, the width and height are calculated as:

    box_wh = K.exp(feats[..., 2:4])

    # Adjust predictions to each spatial grid point and anchor size.
    # Note: YOLO iterates over height index before width index.
    box_wh = box_wh * anchors_tensor / conv_dims

Because of the log, the width and height targets can become negative (whenever the true box is smaller than its anchor). But here, in preprocess_true_boxes, you apply a different formula to true_boxes than the one in yolo_head. Can you explain the reason why?
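
For anyone comparing the two transforms, here is a minimal numpy sketch showing that they are exact inverses. The anchor and grid values are made up for illustration; `anchor` and `conv_dims` are stand-ins, not the repo's exact variable names:

    import numpy as np

    # Made-up anchor and grid values, just to show the round trip.
    anchor = np.array([3.0, 4.0])        # anchor (w, h) in grid-cell units
    conv_dims = np.array([13.0, 13.0])   # grid width, height

    # preprocess_true_boxes direction: absolute (bw, bh) in [0, 1] -> (tw, th).
    # The log is negative whenever the true box is smaller than its anchor.
    true_wh = np.array([0.1, 0.2])
    t_wh = np.log(true_wh * conv_dims / anchor)

    # yolo_head direction: (tw, th) -> absolute (bw, bh); exactly undoes the log.
    box_wh = np.exp(t_wh) * anchor / conv_dims
    assert np.allclose(box_wh, true_wh)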

voqtuyen commented 6 years ago

@allanzelener, @shadySource

RRdmlearning commented 6 years ago

Did you get an answer to your question? I have the same question!

tianyu-tristan commented 6 years ago

@voqtuyen @RRdmlearning I may have an answer, but I'm still confused about the loss implementation.

For your question: assume the model's raw output has shape (batch, 13, 13, B, 4+1+C), where the 4 coordinate values are (tx, ty, tw, th). The purpose of yolo_head is to convert (tx, ty, tw, th) into absolute (bx, by, bw, bh), following the YOLOv2 paper. preprocess_true_boxes, on the contrary, converts (bx, by, bw, bh) into (tx, ty, tw, th) so that a square loss can be computed. If you look carefully, these are exactly inverse calculations.

I think the reason for doing this is that in yolo_loss https://github.com/allanzelener/YAD2K/blob/master/yad2k/models/keras_yolo.py#L282 a sum of squares is computed, so the two sides need to be comparable. https://github.com/allanzelener/YAD2K/blob/master/yad2k/models/keras_yolo.py#L209 indicates the square loss is calculated in the domain of [sigmoid(tx), sigmoid(ty), tw, th]. The two sides seem to be made technically comparable by that inverse calculation, but I don't know why it's done this way...
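
To make the domain matching concrete, here is a toy numpy sketch of the comparison those two lines set up. It uses made-up numbers and a single cell/anchor instead of the real (batch, 13, 13, B, 4) tensors, and paraphrases rather than copies the repo code:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # One cell, one anchor, made-up numbers.
    t_xywh = np.array([0.2, -0.1, 0.4, -0.3])       # raw network output (tx, ty, tw, th)
    true_box = np.array([0.55, 0.47, 0.35, -0.25])  # what preprocess_true_boxes would store

    # L209-style prediction: only x/y pass through sigmoid; w/h stay as raw tw, th.
    pred_box = np.concatenate([sigmoid(t_xywh[0:2]), t_xywh[2:4]])

    # Both sides now live in the (sigmoid(tx), sigmoid(ty), tw, th) domain,
    # so a plain sum of squares is meaningful.
    coordinates_loss = np.sum(np.square(true_box - pred_box))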

What I also don't understand is that YOLOv2 doesn't seem to introduce a new loss function, and in YOLOv1 the sum-of-squares loss on (w, h) operates in the square-root domain, but the implementation here does not...
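
For reference, the width/height term from the YOLOv1 paper is:

    \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}}
    \left[ \left( \sqrt{w_i} - \sqrt{\hat{w}_i} \right)^2
         + \left( \sqrt{h_i} - \sqrt{\hat{h}_i} \right)^2 \right]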

RRdmlearning commented 6 years ago

@tianyu-tristan Thanks for your tips. I checked the code of darkflow and yolo2-pytorch, and I found that it is unnecessary to use (tx, ty, tw, th) to calculate the loss.

About the square-root domain: darkflow uses it, but YAD2K does not... https://github.com/thtrieu/darkflow/blob/master/darkflow/net/yolo/train.py#L60 https://github.com/thtrieu/darkflow/blob/master/darkflow/net/yolov2/data.py#L40

You can check it yourself; I am also not sure.

voqtuyen commented 6 years ago

@RRdmlearning, the YOLO paper uses the square root of width/height for more stable loss optimization, but it seems the author did not use it here.

tianyu-tristan commented 6 years ago

@RRdmlearning @voqtuyen Thanks for sharing. I changed the loss function according to the YOLOv1 paper. It didn't make much difference, but it's still worth trying since it is supposed to be more stable. Here's what I did (a toy sketch of both changes follows below):

(1) change https://github.com/allanzelener/YAD2K/blob/master/yad2k/models/keras_yolo.py#L424 to np.sqrt(box[2]/conv_width) and np.sqrt(box[3]/conv_height)

(2) change https://github.com/allanzelener/YAD2K/blob/master/yad2k/models/keras_yolo.py#L209 to pred_boxes = K.concatenate((K.sigmoid(feats[..., 0:2]), K.sqrt(pred_wh)), axis=-1)
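
A toy numpy sketch of the effect of those two changes. The numbers are made up, and pred_wh here stands in for yolo_head's absolute w/h output; it is not the repo's tensor:

    import numpy as np

    conv_width, conv_height = 13.0, 13.0
    box = np.array([6.2, 7.1, 3.9, 5.2])  # true box already scaled to grid units, as at L424

    # change (1): store sqrt of the absolute w/h instead of log(w / anchor)
    true_wh = np.array([np.sqrt(box[2] / conv_width), np.sqrt(box[3] / conv_height)])

    # change (2): compare against sqrt of yolo_head's absolute w/h output
    pred_wh = np.array([0.31, 0.42])  # pretend yolo_head output, in [0, 1]
    loss_wh = np.sum(np.square(true_wh - np.sqrt(pred_wh)))

This puts both sides of the square loss in the sqrt-of-absolute-size domain, matching the YOLOv1 term above.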