Why do you say "K.binary_crossentropy is helpful to avoid exp overflow"?

https://github.com/qqwweee/keras-yolo3/blob/e6598d13c703029b2686bc2eb8d5c09badf42992/yolo3/model.py#L398

        # K.binary_crossentropy is helpful to avoid exp overflow.
        xy_loss = object_mask * box_loss_scale * K.binary_crossentropy(raw_true_xy, raw_pred[...,0:2], from_logits=True)
        wh_loss = object_mask * box_loss_scale * 0.5 * K.square(raw_true_wh-raw_pred[...,2:4])

You use a squared-error loss for wh, which I think is a good choice. But I don't understand why binary_crossentropy is also needed for the xy loss. Since xy = sigmoid(raw_xy) is bounded between 0 and 1, I don't think xy can overflow.
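To make the question concrete, here is a minimal sketch of what the comment might be referring to. It assumes that the `from_logits=True` path evaluates the numerically stable form documented for TensorFlow's `sigmoid_cross_entropy_with_logits`, i.e. `max(x, 0) - x*z + log(1 + exp(-|x|))`. The helper names `naive_bce` and `stable_bce_from_logits` are hypothetical and use plain Python floats rather than Keras tensors, so this illustrates the arithmetic only, not the repository's actual code path.

```python
import math

def naive_bce(target, logit):
    """Hypothetical helper: apply sigmoid explicitly, then take the logs."""
    # exp(-logit) grows without bound as logit becomes very negative.
    p = 1.0 / (1.0 + math.exp(-logit))
    # If p rounds to exactly 0.0 or 1.0, one of these logs is log(0).
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def stable_bce_from_logits(target, logit):
    """Hypothetical helper: max(x, 0) - x*z + log(1 + exp(-|x|))."""
    # The argument of exp is never positive, so exp cannot overflow here.
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

if __name__ == "__main__":
    # For moderate logits the two forms agree up to rounding error.
    print(naive_bce(0.3, 0.7), stable_bce_from_logits(0.3, 0.7))

    # For an extreme logit the stable form stays finite.
    print(stable_bce_from_logits(0.5, -1000.0))

    # The naive form fails because exp(1000) overflows a Python float.
    try:
        naive_bce(0.5, -1000.0)
    except OverflowError as err:
        print("naive form failed:", err)
```

So although the sigmoid output itself is bounded, the `exp` inside it and the `log` applied afterwards can misbehave for large-magnitude raw predictions. In plain Python this shows up as an exception; with float32 tensors it would more likely appear as `inf` or `nan` in the loss. Whether this is the reason the author had in mind is a guess on my part.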

Why do you say "K.binary_crossentropy is helpful to avoid exp overflow"? #748