Dividing loss by the number of true boxes in a batch

Looking at the loss code at https://github.com/longcw/yolo2-pytorch/blob/1b320fa/darknet.py#L243, I see that each loss term is divided by the number of ground-truth boxes in the batch.

Is this correct? I don't see anything corresponding to this logic in the original darknet code.

It seems like this would overly penalize examples with few boxes, since a smaller divisor scales their loss up. And if the number of boxes in a batch were zero, the division would produce an infinite (or undefined) loss.
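To make the concern concrete, here is a minimal sketch in plain Python of the scaling effect. It is not the repo's code: `per_box_loss`, `per_box_loss_guarded`, and `min_boxes` are hypothetical names I made up for illustration, and the loss values are arbitrary.

```python
def per_box_loss(loss_sum, num_boxes):
    # Mirrors the normalization in question: summed loss / ground-truth box count.
    return loss_sum / num_boxes


def per_box_loss_guarded(loss_sum, num_boxes, min_boxes=1):
    # Hypothetical guard (an assumption, not from the repo or darknet):
    # clamp the divisor so a batch with no boxes still gives a finite value.
    return loss_sum / max(num_boxes, min_boxes)


# The same summed loss is weighted very differently depending on box count.
few = per_box_loss(4.0, 2)
many = per_box_loss(4.0, 40)

# With zero boxes the unguarded version fails; the guarded one stays finite.
empty = per_box_loss_guarded(4.0, 0)
```

In plain Python the unguarded call with zero boxes raises `ZeroDivisionError`; with float tensors the same division would give `inf` or `nan` instead. Clamping the divisor is only one possible workaround, and it doesn't address whether the division should be there at all.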

I haven't verified that changing this improves the IoU, but it seems like one of the places that could be causing problems when trying to reproduce the original darknet result.