facebookresearch / unbiased-teacher

PyTorch code for ICLR 2021 paper Unbiased Teacher for Semi-Supervised Object Detection
https://arxiv.org/abs/2102.09480
MIT License
409 stars · 84 forks

Question about pseudo-labels generation #69

Open vadimkantorov opened 2 years ago

vadimkantorov commented 2 years ago

Hello @ycliu93!

I'm trying to understand whether the code uses `joint_proposal_dict["proposals_pseudo_rpn"]` or `pesudo_proposals_rpn_unsup_k` anywhere. I couldn't grep any usage. Was it used for debugging? Can I safely delete the call `(pesudo_proposals_rpn_unsup_k, nun_pseudo_bbox_rpn,) = self.process_pseudo_label(proposals_rpn_unsup_k, cur_threshold, "rpn", "thresholding")`?

Thank you!

ycliu93 commented 2 years ago

Hi @vadimkantorov,

Yes, you can directly remove that line; it won't affect the framework.

Btw, the reason I had that line is that I tried using the predictions of the Teacher's RPN to supervise the Student's RPN, but it did not help.

Thanks!

vadimkantorov commented 2 years ago

Thanks for the clarification!

vadimkantorov commented 2 years ago

Do I understand correctly that, in theory, multiple class labels could be predicted for a single pseudo-box (if several class scores somehow erroneously exceed the threshold) in https://github.com/facebookresearch/unbiased-teacher/blob/main/ubteacher/engine/trainer.py#L422?

I.e. we rank the (box, class label) pairs independently, right?
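For concreteness, here is a minimal sketch of what independent per-(box, class) thresholding implies (the function and tensors are my own illustration, not the repo's code): the same box can survive under two classes.

```python
import torch

def threshold_pseudo_labels(boxes, scores, classes, thresh=0.7):
    # Each (box, class) prediction is filtered independently, so if two
    # class scores for the same box both exceed the threshold, both
    # pseudo-labels survive.
    keep = scores > thresh
    return boxes[keep], scores[keep], classes[keep]

# The same box predicted under two different classes, both above 0.7:
boxes = torch.tensor([[10., 10., 50., 50.], [10., 10., 50., 50.]])
scores = torch.tensor([0.9, 0.8])
classes = torch.tensor([3, 7])
b, s, c = threshold_pseudo_labels(boxes, scores, classes)
# -> both rows survive: one pseudo-box carrying two class labels
```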

vadimkantorov commented 2 years ago

The paper says:

> Similar to classification-based methods, to prevent the consecutively detrimental effect of noisy pseudo-labels (i.e., confirmation bias or error accumulation), we first set a confidence threshold δ of predicted bounding boxes to filter low-confidence predicted bounding boxes, which are more likely to be false positive samples. While the confidence threshold method have achieved tremendous success in the image classification, it is however not sufficient for object detection. This is because there also exist duplicated box predictions and imbalanced prediction issues in the SS-OD (we leave the discussion of the imbalanced prediction issue in Sec. 3.3). To address the duplicated boxes prediction issue, we remove the repetitive predictions by applying class-wise non-maximum suppression (NMS) before the use of confidence thresholding as performed in STAC (Sohn et al., 2020b).

It makes reference to STAC, which does use NMS to filter teacher-generated pseudo-labels. However, in the released code I couldn't find any calls to NMS: https://github.com/facebookresearch/unbiased-teacher/blob/main/ubteacher/engine/trainer.py#L537 .

@ycliu93 Could you please clarify if NMS is used (or not?) prior to confidence thresholding of teacher-generated pseudo-targets?

Thank you!
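To make sure I read the paper's pipeline correctly, here is how I picture it: class-wise NMS first, confidence thresholding (δ) second. This is a self-contained sketch with a toy NMS, not the repo's or detectron2's implementation:

```python
import torch

def iou(a, b):
    # a, b: [4] tensors in (x1, y1, x2, y2) format
    x1, y1 = torch.max(a[0], b[0]), torch.max(a[1], b[1])
    x2, y2 = torch.min(a[2], b[2]), torch.min(a[3], b[3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area = lambda t: (t[2] - t[0]) * (t[3] - t[1])
    return inter / (area(a) + area(b) - inter)

def classwise_nms_then_threshold(boxes, scores, classes,
                                 iou_thresh=0.5, score_thresh=0.7):
    # 1) Class-wise NMS: a box only suppresses lower-scored boxes of the
    #    SAME class that overlap it by more than iou_thresh.
    keep = []
    for i in scores.argsort(descending=True).tolist():
        if all(classes[i] != classes[j] or iou(boxes[i], boxes[j]) <= iou_thresh
               for j in keep):
            keep.append(i)
    keep = torch.tensor(keep)
    b, s, c = boxes[keep], scores[keep], classes[keep]
    # 2) Confidence thresholding (delta in the paper) on the survivors.
    conf = s > score_thresh
    return b[conf], s[conf], c[conf]

# Two overlapping same-class boxes (the duplicate is suppressed) plus one
# overlapping box of another class (kept by class-wise NMS):
boxes = torch.tensor([[0., 0., 10., 10.], [1., 1., 11., 11.], [0., 0., 10., 10.]])
scores = torch.tensor([0.9, 0.8, 0.95])
classes = torch.tensor([0, 0, 1])
b, s, c = classwise_nms_then_threshold(boxes, scores, classes)
# -> two pseudo-boxes remain: the 0.9 class-0 box and the 0.95 class-1 box
```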

vadimkantorov commented 2 years ago

Is NMS applied as part of self.proposal_generator or self.roi_heads at https://github.com/facebookresearch/unbiased-teacher/blob/ba543ed2b87446c79eaf41f93320de246c8eedc2/ubteacher/modeling/meta_arch/rcnn.py#L41?

What is the NMS threshold used?

Is NMS applied to filter student-produced detections as well? I'm asking because the code for `branch == "supervised"` and for `branch == "unsup_data_weak"` is very similar...

Thanks!

ycliu93 commented 2 years ago

Hi @vadimkantorov ,

Yes, we do NMS before the confidence thresholding, and NMS is performed in the inference mode of Faster R-CNN (we use the default implementation in detectron2).

Please check this

https://github.com/facebookresearch/unbiased-teacher/blob/ba543ed2b87446c79eaf41f93320de246c8eedc2/ubteacher/modeling/meta_arch/rcnn.py#L46-L53

`compute_loss` is `False`, so the call goes to

https://github.com/facebookresearch/unbiased-teacher/blob/ba543ed2b87446c79eaf41f93320de246c8eedc2/ubteacher/modeling/roi_heads/roi_heads.py#L137

`self.box_predictor.inference` refers to detectron2's implementation:

https://github.com/facebookresearch/detectron2/blob/7cad0a7d95cc8b0c7974cc19e50bded742183555/detectron2/modeling/roi_heads/fast_rcnn.py#L361-L382

and

https://github.com/facebookresearch/detectron2/blob/7cad0a7d95cc8b0c7974cc19e50bded742183555/detectron2/modeling/roi_heads/fast_rcnn.py#L45-L84

https://github.com/facebookresearch/detectron2/blob/7cad0a7d95cc8b0c7974cc19e50bded742183555/detectron2/modeling/roi_heads/fast_rcnn.py#L161

I understand that this is complicated in terms of code implementation, but we try to follow the default setup of the Detectron2 codebase.

Hope this can help you, and let me know if you have other questions. Thanks!

vadimkantorov commented 2 years ago

Thank you for these pointers!

In practice, what are the hyper-parameters used for the `fast_rcnn_inference(...)` function?

            self.test_score_thresh,
            self.test_nms_thresh,
            self.test_topk_per_image,

Are these populated from some config, or are the defaults used?

        test_score_thresh: float = 0.0,
        test_nms_thresh: float = 0.5,
        test_topk_per_image: int = 100,

Other parameters I found are in https://github.com/facebookresearch/detectron2/blob/e3053c14cfcce515de03d3093c0a7b4e734fa41f/detectron2/config/defaults.py#L278
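If I'm reading detectron2 right (please correct me if not), those dataclass defaults are overridden by config values via `from_config`; the usual keys are below, shown with detectron2's stock defaults (the unbiased-teacher configs may override them):

```
MODEL:
  ROI_HEADS:
    SCORE_THRESH_TEST: 0.05   # -> test_score_thresh
    NMS_THRESH_TEST: 0.5      # -> test_nms_thresh (class-wise NMS IoU)
TEST:
  DETECTIONS_PER_IMAGE: 100   # -> test_topk_per_image
```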

vadimkantorov commented 2 years ago

Do I understand correctly that pseudo-labels are provided only to the classification head?

Do you have any guard for the case where no pseudo-labels are left after the filtering?

Thanks!

vadimkantorov commented 2 years ago

Do I understand correctly that labeled images always take up 50% of the batch? (It seems to be 32 labeled and 32 unlabeled.)

vadimkantorov commented 2 years ago

I'm trying to understand how data loading / sampling is performed, in particular `AspectRatioGroupedSemiSupDatasetTwoCrop`. Do I understand correctly that it samples independently and endlessly over the labeled and unlabeled datasets and returns a batch once it has collected enough examples?

Are the underlying labeled and unlabeled datasets shuffled, or are they sampled with replacement? Where is the shuffling performed?

Is there a concept of epoch? Is the number of batches determined by the smaller dataset?

I'm asking because the labeled dataset is typically much smaller than the unlabeled one, so I wonder how it's oversampled / resampled.

Thanks!
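To make my mental model concrete, here is a sketch of the sampling behavior I'm assuming (names and details are mine, not the repo's): both streams are iterated endlessly and independently, each reshuffled when exhausted, with no shared notion of epoch, so the small labeled set is simply cycled (oversampled) many times per pass over the unlabeled set.

```python
import itertools
import random

def two_stream_batches(labeled, unlabeled, batch_per_source=2, seed=0):
    # Iterate both datasets endlessly and independently, reshuffling each
    # one whenever it is exhausted; emit a batch once both halves are full.
    rng = random.Random(seed)

    def endless(data):
        while True:
            order = list(data)
            rng.shuffle(order)  # reshuffle on every pass through the data
            yield from order

    lab, unlab = endless(labeled), endless(unlabeled)
    while True:
        yield ([next(lab) for _ in range(batch_per_source)],
               [next(unlab) for _ in range(batch_per_source)])

# A 2-image labeled set cycles 4 times while the 8-image unlabeled set
# completes a single pass:
gen = two_stream_batches(["L0", "L1"], [f"U{i}" for i in range(8)])
batches = list(itertools.islice(gen, 4))
```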