hizhangp / yolo_tensorflow

Tensorflow implementation of YOLO, including training and test phase.
MIT License

Can you tell me what the offset means in loss_layer? #53

Open LPaKing opened 6 years ago

LPaKing commented 6 years ago

offset = tf.reshape(
    tf.constant(self.offset, dtype=tf.float32),
    [1, self.cell_size, self.cell_size, self.boxes_per_cell])
offset = tf.tile(offset, [self.batch_size, 1, 1, 1])
offset_tran = tf.transpose(offset, (0, 2, 1, 3))
predict_boxes_tran = tf.stack(
    [(predict_boxes[..., 0] + offset) / self.cell_size,
     (predict_boxes[..., 1] + offset_tran) / self.cell_size,
     tf.square(predict_boxes[..., 2]),
     tf.square(predict_boxes[..., 3])], axis=-1)

lipanpeng commented 5 years ago

I also want to know!

lipanpeng commented 5 years ago

I get it. The network predicts each box's coordinates relative to a single grid cell. After reshaping the 49 cells to 7x7, we need the box's coordinates on the whole 7x7 feature map, so the offset adds each cell's position in the grid. Now I have a new question: why use tf.square?
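For anyone else reading, here is a rough NumPy sketch of the offset step only. It assumes that self.offset stores the column index of each cell, with shape (cell_size, cell_size, boxes_per_cell). That matches my reading of the snippet above, but I have not checked it against the repo. The example box values are made up.

```python
import numpy as np

cell_size = 7        # the grid is cell_size x cell_size
boxes_per_cell = 2   # boxes predicted per cell
batch_size = 1

# Assumption: self.offset[row, col, box] == col for every cell.
base = np.tile(np.arange(cell_size), cell_size * boxes_per_cell)
base = base.reshape(boxes_per_cell, cell_size, cell_size).transpose(1, 2, 0)

# Same steps as the snippet in the question, written with NumPy.
offset = base.reshape(1, cell_size, cell_size, boxes_per_cell).astype(np.float32)
offset = np.tile(offset, (batch_size, 1, 1, 1))      # column index of each cell
offset_tran = np.transpose(offset, (0, 2, 1, 3))     # row index of each cell

# Made-up prediction: one box in the cell at row 2, column 5.
# Layout of the last axis: x in cell, y in cell, sqrt(width), sqrt(height).
predict_boxes = np.zeros(
    (batch_size, cell_size, cell_size, boxes_per_cell, 4), dtype=np.float32)
predict_boxes[0, 2, 5, 0] = [0.5, 0.25, 0.5, 0.4]

predict_boxes_tran = np.stack(
    [(predict_boxes[..., 0] + offset) / cell_size,       # x relative to the image
     (predict_boxes[..., 1] + offset_tran) / cell_size,  # y relative to the image
     np.square(predict_boxes[..., 2]),                   # width
     np.square(predict_boxes[..., 3])],                  # height
    axis=-1)

box = predict_boxes_tran[0, 2, 5, 0]
print(box)  # x = (0.5 + 5) / 7, y = (0.25 + 2) / 7, w = 0.25, h = 0.16
```

So offset shifts x by the cell's column, offset_tran shifts y by the cell's row, and dividing by cell_size rescales both to the 0..1 range of the whole image.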

i4yyds commented 5 years ago

Hi. Suppose object A has size 100 and object B has size 10. If both predictions are off by 10 pixels, A and B get the same loss (110 - 100 and 20 - 10). But 110 is off by only 0.1 times the original size, whereas 20 is off by 1.0 times the original size (the prediction is double the true size). So the loss uses square roots instead: sqrt(110) - sqrt(100) ≈ 0.49, while sqrt(20) - sqrt(10) ≈ 1.31. The code converts labels to the prediction form in some places and predictions to the label form in others, which is why some places square and others take the square root.
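The arithmetic above is easy to check with a few lines of plain Python. This is only an illustration of the square-root idea, not the repo's loss code. The helper names abs_error and sqrt_error are made up for this example.

```python
import math

def abs_error(true_size, pred_size):
    """Plain difference between predicted and true size."""
    return abs(pred_size - true_size)

def sqrt_error(true_size, pred_size):
    """Difference between the square roots of the two sizes."""
    return abs(math.sqrt(pred_size) - math.sqrt(true_size))

large = (100.0, 110.0)  # object A: true size 100, predicted 110
small = (10.0, 20.0)    # object B: true size 10, predicted 20

# Both objects are off by the same 10 pixels.
print(abs_error(*large), abs_error(*small))

# With square roots, the small object is penalized more.
print(round(sqrt_error(*large), 2), round(sqrt_error(*small), 2))
```

The second line prints 0.49 and 1.31, so the same 10-pixel mistake costs the small box noticeably more than the large one.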