[code implementation] losses in yolof.py

Wei-i commented 3 years ago

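Hello, and thank you for your work. While debugging the YOLOF code I ran into a few things I don't understand, and I would appreciate your help.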

Wei-i commented 3 years ago

YOLOF/yolof/modeling/yolof.py:

        for i in range(N):
            src_idx, tgt_idx = indices[i] 
            #  pred iou
            iou = box_iou(predicted_boxes[i, ...],
                          gt_instances[i].gt_boxes.tensor) # [h x w x 5, num_gt]
            if iou.numel() == 0:
                max_iou = iou.new_full((iou.size(0),), 0)
            else:
                max_iou = iou.max(dim=1)[0] # [h x w x 5]  
            # anchor iou
            a_iou = box_iou(anchors[i].tensor,
                            gt_instances[i].gt_boxes.tensor) # [h x w x 5, num_gt] 
            if a_iou.numel() == 0:
                pos_iou = a_iou.new_full((0,), 0)
            else:
                pos_iou = a_iou[src_idx, tgt_idx] # [instances x 2 x 4] why? 
            ious.append(max_iou)
            pos_ious.append(pos_iou)

I really don't understand this step:

a_iou = box_iou(anchors[i].tensor, gt_instances[i].gt_boxes.tensor) # [h x w x 5, num_gt] 
pos_iou = a_iou[src_idx, tgt_idx] # [instances x 2 x 4]

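Why is this step needed after computing the anchors' IoU? src_idx and tgt_idx contain the index matches against the gt for both pred_box and anchor, yet the code applies all of these matches to the anchors. Why is pred_box ignored?

Meanwhile, for pred_box only a max_iou is computed: max_iou = iou.max(dim=1)[0] # [h x w x 5]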

chensnathan commented 3 years ago

pred_box is obtained by combining anchor and pred_logits. Thus, the indexes of both pred_box and anchor correspond to the indexes of anchors. E.g., the first pred_box corresponds to the first anchor.

For pred_box, we consider the negative ones and thus calculate max_iou. For positive anchors, as they are already matched with gt boxes, we only need their IoU with their matched gts.
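The two quantities described above can be sketched in plain Python, with nested lists standing in for tensors. The `box_iou` helper here is a hand-written stand-in, and the boxes and matched pairs are made up for illustration; none of it is taken from the repository. The point is only that row i of both IoU tables refers to the same anchor position.

```python
def box_iou(boxes_a, boxes_b):
    """Pairwise IoU between two lists of (x1, y1, x2, y2) boxes."""
    def area(b):
        return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])

    table = []
    for a in boxes_a:
        row = []
        for b in boxes_b:
            iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
            ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
            inter = iw * ih
            union = area(a) + area(b) - inter
            row.append(inter / union if union > 0 else 0.0)
        table.append(row)
    return table


# Hypothetical data: 3 anchors, the boxes decoded from them, and 2 gt boxes.
anchors = [(0, 0, 10, 10), (10, 0, 20, 10), (0, 10, 10, 20)]
predicted_boxes = [(1, 1, 11, 11), (10, 0, 20, 10), (0, 12, 8, 20)]
gt_boxes = [(0, 0, 10, 10), (10, 0, 20, 12)]

iou = box_iou(predicted_boxes, gt_boxes)  # [num_anchors, num_gt]
a_iou = box_iou(anchors, gt_boxes)        # [num_anchors, num_gt]

# Every prediction: best overlap with any gt (used for the negative-ignore test).
max_iou = [max(row) if row else 0.0 for row in iou]

# Matched positives only: anchor IoU with the gt each one was matched to.
src_idx, tgt_idx = [0, 1], [0, 1]  # assumed matcher output
pos_iou = [a_iou[s][t] for s, t in zip(src_idx, tgt_idx)]
```

`max_iou` has one entry per anchor position, while `pos_iou` has one entry per matched (position, gt) pair.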

Wei-i commented 3 years ago

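Thank you for your reply, but I still don't quite understand. pred_box is indeed computed from anchor and pred_logits, but src_idx comes from matching pred_box and anchor against the gt separately. For example, with src_idx, tgt_idx = [5,6,7,8], [0,1,0,1], [5,6] are indexes on pred_box while [7,8] are indexes on anchor, yet a_iou[src_idx, tgt_idx] takes all of a_iou[[5,6,7,8], [0,1,0,1]]. My understanding is that it should take a_iou[[7,8],[0,1]] for the anchors and pred_box[[5,0],[6,1]] for pred_box.

Sorry, my understanding is that even when the index is the same, the IoU between pred_box and gt differs from the IoU between anchor and gt? (If everything is taken from the anchors, I find that duplicates appear later.)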
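A small sketch of what the paired indexing in this example does, again with plain lists in place of tensors. The IoU values are invented so that each row is easy to recognise; only the index pairs [5,6,7,8] and [0,1,0,1] come from the comment above.

```python
# Hypothetical IoU tables for 9 anchor positions and 2 gt boxes.
# a_iou[r][c]: anchor r vs gt c; iou[r][c]: predicted box r vs gt c.
a_iou = [[(10 * r + c) / 100 for c in range(2)] for r in range(9)]
iou = [[(90 - 10 * r - c) / 100 for c in range(2)] for r in range(9)]

src_idx = [5, 6, 7, 8]  # positions 5, 6 from the pred_box match; 7, 8 from the anchor match
tgt_idx = [0, 1, 0, 1]

# Equivalent of a_iou[src_idx, tgt_idx]: one anchor IoU per (position, gt) pair,
# regardless of which of the two matchings produced the position.
pos_iou = [a_iou[s][t] for s, t in zip(src_idx, tgt_idx)]

# The predicted boxes at the same pairs generally have different IoU values.
pred_iou_same_pairs = [iou[s][t] for s, t in zip(src_idx, tgt_idx)]
```

With these made-up tables the gather returns the anchor values at rows 5 to 8, which differ from the prediction values at the same rows, so the quantity thresholded later is the anchor's overlap at every positive position.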

chensnathan commented 3 years ago

Yes, there may exist duplicate ones. You can check that by debugging.

Wei-i commented 3 years ago

Then why not use a_iou[[7,8],[0,1]] for the anchors and pred_box[[5,0],[6,1]] for pred_box, instead of insisting on a_iou[src_idx, tgt_idx]?

Wei-i commented 3 years ago

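It was through debugging that I found the redundant results make num_foreground smaller. I really don't understand: the indexes come from both pred and anchor, yet in the end they are all applied to the anchors. Half of indices is clearly obtained from pred_box via the uniform_match top-k, so how can it be treated as anchor indexes? Only because "pred_box is obtained by combining anchor and pred_logits"? But then the values at those anchors are different too, and the IoU computed with the gt is different as well?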

Wei-i commented 3 years ago

gt_classes[ignore_idx] = -1
target_classes_o = torch.cat(
    [t.gt_classes[J] for t, (_, J) in zip(gt_instances, indices)]) # 3 * 2 * 4 + 2 * 2 * 4 = 40
target_classes_o[pos_ignore_idx] = -1
gt_classes[src_idx] = target_classes_o

Through debugging I found that gt_classes[ignore_idx] = -1 marks x indexes as ignored. But where pos_ignore_idx = pos_ious < self.pos_ignore_thresh is False, if src_idx contains an index that is also in ignore_idx, then gt_classes at that index is no longer -1. Is that not a problem either?
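The overwrite described here follows from the order of the assignments in the snippet, and can be reproduced with a minimal sketch. Everything numeric below is an assumption for illustration: the background label `num_classes = 80`, both thresholds, the IoU values, and the matched classes. Plain lists stand in for tensors.

```python
num_classes = 80          # assumed background label
neg_ignore_thresh = 0.7   # assumed value
pos_ignore_thresh = 0.15  # assumed value

max_iou = [0.10, 0.80, 0.75, 0.20, 0.90]  # best pred-vs-gt IoU per position

# Step 1: start from background, then mark high-IoU positions as ignored (-1).
gt_classes = [num_classes] * len(max_iou)
ignore_idx = [i for i, v in enumerate(max_iou) if v > neg_ignore_thresh]
for i in ignore_idx:
    gt_classes[i] = -1

# Step 2: positives from the matcher, with their anchor-vs-gt IoU.
src_idx = [1, 3]           # position 1 is also in ignore_idx
matched_classes = [7, 3]   # stands in for t.gt_classes[J]
pos_ious = [0.40, 0.10]

# Low-quality positives become ignored; the rest keep their class.
target_classes_o = [
    -1 if p < pos_ignore_thresh else c
    for c, p in zip(matched_classes, pos_ious)
]

# Step 3: the positive assignment runs last, so it overwrites step 1.
for i, c in zip(src_idx, target_classes_o):
    gt_classes[i] = c
```

In this sketch position 1 ends up with class 7 even though step 1 had set it to -1, while position 3 ends up ignored because its anchor IoU is below the positive threshold. The sketch only reproduces the behaviour; it does not settle whether it is intended.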

chensnathan commented 3 years ago

On why we use all the anchors' indexes instead of a combination of anchors' indexes and predicted boxes' indexes: the underlying reason for ignoring some positive anchors is that low-quality matched positive anchors may harm model training, so we apply pos_ignore_threshold to the anchors' IoU.

On why we set both the top-k anchors and the top-k predicted boxes as positives: it is true that duplicates exist between the indexes of the top-k predictions and the indexes of the top-k anchors. But the non-duplicate indexes of the top-k predicted boxes serve as additional indexes, which helps the model train better. This implementation gives stable and slightly higher performance than using only the top-k anchors (37.7 vs. 37.1).
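The effect of duplicates on the number of positives can be illustrated with a minimal sketch. The index lists are invented, a single gt box is assumed, and the foreground count is assumed to be taken over distinct positions in the label vector; none of this is copied from the repository.

```python
# Hypothetical matcher output for one gt box with top-k = 4.
topk_pred_idx = [5, 6, 7, 9]     # positions whose predicted boxes are closest to the gt
topk_anchor_idx = [7, 8, 9, 10]  # positions whose anchors are closest to the gt

# Both sets are concatenated into one list of positive positions.
src_idx = topk_pred_idx + topk_anchor_idx

# Writing labels through src_idx touches each distinct position once,
# so duplicated positions do not add to the foreground count.
unique_positive_positions = sorted(set(src_idx))

# Positions contributed only by the prediction-based match.
extra_from_predictions = sorted(set(topk_pred_idx) - set(topk_anchor_idx))
```

Here eight matched pairs collapse to six distinct positive positions, two of which come only from the prediction-based match. This is consistent with a foreground count smaller than the length of src_idx.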

Wei-i commented 3 years ago

Thanks for your reply! But I am still confused by "we use all the anchors' indexes", because I think "all the anchors' indexes" are not only the anchors' indexes; they also contain half of the pred_box indexes, i.e. (idx, idx1), where idx is the index of pred_boxes and idx1 is the anchors'. If you want to use all the anchors' indexes instead of the combination of anchors' indexes and predicted boxes' indexes, why don't you use only a_iou[idx1, tgt1]? I think a_iou[src_idx, tgt_idx] may be .....

chensnathan commented 3 years ago

You can try that on your own.

Wei-i commented 3 years ago

Thank you for your help! I will have a try.

WuChannn commented 2 years ago

pred_box is obtained by combining anchor and pred_logits. Thus, the indexes of both pred_box and anchor correspond to the indexes of anchors. E.g, the 1-st pred_box corresponds to the 1-st anchor.

For pred_box, we consider negative ones, thus calculate max_iou. While for positive anchor, as they are already matched with gt boxes, we only need to get their matched iou with their gts.

@ytoon Hi, I would like to know why "for pred_box, consider negative ones", and why max_iou > self.neg_ignore_thresh corresponds to the "large IoU (>0.7) negative anchors" (a sentence in the YOLOF paper). When a pred_box has a large IoU with a gt_box, isn't it a good prediction?
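One possible reading, not confirmed by the maintainers in this thread, is that the threshold only affects positions that were not selected as positives: without it they would be labelled background and penalised, even though their boxes overlap a gt well. The sketch below assumes a background label of `num_classes = 80`, a threshold of 0.7, and that labels of -1 are filtered out before the classification loss; the IoU values and the match are made up.

```python
num_classes = 80         # assumed background label
neg_ignore_thresh = 0.7  # assumed value, following the ">0.7" quoted from the paper

max_iou = [0.05, 0.72, 0.95, 0.30]  # best pred-vs-gt IoU per position
src_idx = [2]                       # only position 2 was matched as a positive
matched_classes = [11]

# Unmatched positions default to background unless their prediction
# already overlaps some gt strongly, in which case they are ignored.
gt_classes = [-1 if v > neg_ignore_thresh else num_classes for v in max_iou]
for i, c in zip(src_idx, matched_classes):
    gt_classes[i] = c

# Assumed loss mask: ignored positions contribute nothing.
valid_positions = [i for i, c in enumerate(gt_classes) if c >= 0]
```

Under these assumptions position 1 (IoU 0.72, unmatched) is neither rewarded nor pushed toward background, while position 0 and position 3 are trained as background and position 2 as class 11.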

megvii-model / YOLOF

[code implementation] losses in yolof.py #16