bubbliiiing / yolo3-pytorch

Source code for a PyTorch implementation of YOLOv3 that can be used to train your own models.

MIT License

2.01k stars, 585 forks

Why does the conf loss in yolo_training.py only take [noobj_mask.bool() | obj_mask]? #160

Closed luoh226 closed 2 years ago

luoh226 commented 2 years ago

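Question: why does the conf loss in yolo_training.py only take [noobj_mask.bool() | obj_mask]? Code: loss_conf = torch.mean(self.BCELoss(conf, obj_mask.type_as(conf))[noobj_mask.bool() | obj_mask]) What I'd like to know: what is the point of indexing this way? Is this how YOLOv3 was originally designed, or was it found by experiment to improve the score? My understanding: noobj_mask marks the negative samples and obj_mask marks the positive samples, so their union is the full set of training samples. What gets left out are the would-be negatives whose predictions are already good (IoU with a gt box above the threshold).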
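To make the indexing concrete, here is a minimal sketch with made-up values. The tensors, their shapes, and the use of `F.binary_cross_entropy(..., reduction="none")` as a stand-in for the repository's `self.BCELoss` are all assumptions for illustration. It shows that `noobj_mask.bool() | obj_mask` keeps positives and negatives and drops the cells where both masks are 0, so the ignored cells do not enter the mean.

```python
import torch
import torch.nn.functional as F

# Hypothetical objectness predictions for 5 anchor cells (already in (0, 1)).
conf = torch.tensor([0.9, 0.2, 0.7, 0.1, 0.6])

# obj_mask: True where the anchor is the positive sample for a gt box.
obj_mask = torch.tensor([True, False, False, False, False])

# noobj_mask: 1 for negatives, 0 for positives and for ignored cells.
# Cell 2 is "ignored": not the positive anchor, but treated as overlapping a gt box well.
noobj_mask = torch.tensor([0.0, 1.0, 0.0, 1.0, 1.0])

# Element-wise BCE against the objectness target (1 for positives, 0 elsewhere).
# Assumption: stands in for the repo's self.BCELoss, which is not shown in this thread.
bce = F.binary_cross_entropy(conf, obj_mask.type_as(conf), reduction="none")

# Union of negatives and positives; ignored cells fall outside the union.
selected = noobj_mask.bool() | obj_mask
loss_conf = torch.mean(bce[selected])

print(selected.tolist())  # [True, True, False, True, True]
```

Cell 2 has a target of 0 and a prediction of 0.7, so it would add a sizable penalty if it were counted as a negative; the mask is what keeps it out of the loss.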

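A second question: with the approach in the code, if an image has only one gt box, then only the anchor box with the highest IoU is taken as a positive sample (unlike Faster R-CNN, where any anchor with IoU above a threshold is positive), and all the others are negatives. Doesn't this cause a severe positive/negative imbalance? Why isn't there any positive/negative sample balancing as in Faster R-CNN?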
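The scale of the imbalance being asked about can be sketched in a few lines. This is an illustration under stated assumptions, not the repository's code: the anchor sizes are the standard YOLOv3 COCO anchors, the 13/26/52 grids assume a 416x416 input, the gt size is made up, and `wh_iou` is a hypothetical helper that compares shapes only (both boxes placed at the same center).

```python
def wh_iou(wh_a, wh_b):
    """IoU of two boxes that share a center, so only width and height matter."""
    inter = min(wh_a[0], wh_b[0]) * min(wh_a[1], wh_b[1])
    union = wh_a[0] * wh_a[1] + wh_b[0] * wh_b[1] - inter
    return inter / union

# Standard YOLOv3 COCO anchors as (width, height) in pixels.
anchors = [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),
           (59, 119), (116, 90), (156, 198), (373, 326)]

# Hypothetical ground-truth box size.
gt_wh = (100, 80)

# Exactly one anchor, the best shape match, becomes the positive sample.
ious = [wh_iou(gt_wh, a) for a in anchors]
best = max(range(len(anchors)), key=lambda i: ious[i])

# Total predictions: 3 anchors per cell on the 13x13, 26x26 and 52x52 grids.
total_predictions = 3 * (13 * 13 + 26 * 26 + 52 * 52)

print(best, anchors[best])   # 6 (116, 90)
print(total_predictions)     # 10647
```

With a single gt box, that is 1 positive among 10647 predictions, which is the imbalance the question refers to.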

bubbliiiing commented 2 years ago

1. I don't see any problem here; it is the usual loss over positive and negative samples. 2. That is how the network is designed.

luoh226 commented 2 years ago

First of all, thank you for your reply! The paper says: "YOLOv3 predicts an objectness score for each bounding box using logistic regression. This should be 1 if the bounding box prior overlaps a ground truth object by more than any other bounding box prior. If the bounding box prior is not the best but does overlap a ground truth object by more than some threshold we ignore the prediction, following [17]. We use the threshold of .5. Unlike [17] our system only assigns one bounding box prior for each ground truth object. If a bounding box prior is not assigned to a ground truth object it incurs no loss for coordinate or class predictions, only objectness." The paper says to ignore the non-best boxes with IoU > 0.5, and I take "boxes" there to mean the default boxes (the priors). Why are the boxes in the code the predicted boxes instead?

bubbliiiing commented 2 years ago

Uh. Isn't what gets ignored now exactly the non-best boxes with IoU > 0.5?

luoh226 commented 2 years ago

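> Uh. Isn't what gets ignored now exactly the non-best boxes with IoU > 0.5?

Yes, the code does ignore the non-best boxes with IoU > 0.5. My question is: does it ignore the initial boxes (before the network's output is applied as an offset) whose IoU with the GT box is > 0.5, or the boxes whose IoU with the GT box is > 0.5 after being refined by the network?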

bubbliiiing commented 2 years ago

It ignores the boxes whose IoU with the GT box is > 0.5 after being refined by the network.
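The behaviour described in this answer can be sketched as follows. This is a simplified illustration, not the repository's implementation: the box coordinates are made up, `box_iou` and `IGNORE_THRESHOLD` are hypothetical names, and the boxes are assumed to be already decoded into (x1, y1, x2, y2) form.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

IGNORE_THRESHOLD = 0.5  # the .5 threshold quoted from the paper

gt_boxes = [(2.0, 2.0, 6.0, 6.0)]

# Hypothetical decoded predictions (anchors after the network's offsets are applied).
pred_boxes = [
    (2.0, 2.0, 6.0, 6.0),      # IoU 1.0 with the gt box
    (3.0, 2.0, 7.0, 6.0),      # IoU 0.6
    (5.0, 5.0, 9.0, 9.0),      # IoU about 0.03
    (10.0, 10.0, 12.0, 12.0),  # no overlap
]

# Start with every prediction as a negative, then clear the well-overlapping ones.
noobj_mask = [1] * len(pred_boxes)
for i, pred in enumerate(pred_boxes):
    max_iou = max(box_iou(pred, gt) for gt in gt_boxes)
    if max_iou > IGNORE_THRESHOLD:
        noobj_mask[i] = 0  # dropped from the negative set

print(noobj_mask)  # [0, 0, 1, 1]
```

Because the test uses the decoded predictions rather than the fixed priors, which cells are ignored can change from one training step to the next as the predictions move.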

luoh226 commented 2 years ago

> It ignores the boxes whose IoU with the GT box is > 0.5 after being refined by the network.

Got it, thanks!