machengcheng2016 / CrossRectify-SSOD

Official code of "CrossRectify: Leveraging Disagreement for Semi-supervised Object Detection" (PR'2023)
https://arxiv.org/abs/2201.10734

Comparison with SoftTeacher on COCO 1% #5

Closed: vadimkantorov closed this issue 2 years ago

vadimkantorov commented 2 years ago

Is it fair to compare CT's Table 4 result (18.15 ± 0.13) with SoftTeacher's Table 3 result (20.46 ± 0.39)?

Are the hyperparameters / validation sets / training settings similar?

In this comparison it seems that SoftTeacher is more accurate than CT, but your Appendix D suggests otherwise.

Is CT better than SoftTeacher on 10% but worse on 1%?

Looking forward to your comments about CT vs SoftTeacher. Thank you!

machengcheng2016 commented 2 years ago

Hi, sorry for the late reply.

First, the data augmentation strategies of UBT and SoftTeacher are different, and SoftTeacher's are more complex and advanced (please check its config file). According to my tests, the augmentations in SoftTeacher boost the performance of both fully-supervised and semi-supervised training. For example, under COCO-10% fully-supervised training, UBT achieves 23.86 mAP while SoftTeacher achieves 26.94 mAP. As a result, it seems unfair to directly compare the performance of these two methods.

Second, the frameworks of UBT and SoftTeacher are the same, namely Teacher-Student Mutual Learning (proposed in Mean Teacher back in 2017). In contrast, our CT framework trains two detectors with cross correction, which is one of our main contributions.

Based on the above, we want to show the absolute improvement brought only by our CT framework, excluding the effect of irrelevant settings such as data augmentation, learning rate, batch size, unsupervised loss weight, etc. So we choose to compare only with UBT in Table 4 and only with SoftTeacher in Appendix D. Note that in each comparison, we keep all irrelevant settings the same as the compared method. As you can see, our CT always outperforms the other frameworks (besides Teacher-Student Mutual Learning, we also compare with Self-Labeling, TS-offline, and conventional mutual teaching in Table 5). I think this answers your questions.
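For readers following along, here is a minimal, hypothetical sketch of the cross-correction idea described above. It is not the repository's actual code: the `Pred` data structure, thresholds, and the `cross_rectify` function are assumptions made purely for illustration. The point is that a pseudo-box on which the two detectors disagree is relabeled by the more confident one before being used to supervise the other.

```python
# Hypothetical illustration of cross correction between two detectors.
# Data structures, thresholds, and function names are assumptions for this
# sketch, not the repository's actual interfaces.
from dataclasses import dataclass
from typing import List

@dataclass
class Pred:
    box: tuple          # (x1, y1, x2, y2)
    label: int          # predicted class id
    score: float        # confidence

def iou(b1, b2):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(b1[0], b2[0]), max(b1[1], b2[1])
    x2, y2 = min(b1[2], b2[2]), min(b1[3], b2[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    area2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    return inter / (area1 + area2 - inter + 1e-6)

def cross_rectify(preds_a: List[Pred], preds_b: List[Pred],
                  iou_thr: float = 0.5, score_thr: float = 0.7) -> List[Pred]:
    """Build pseudo-labels for one unlabeled image from two detectors.

    Agreeing pairs are kept as-is; for disagreeing pairs the label of the
    more confident detector is used ("cross correction").
    """
    pseudo = []
    for pa in preds_a:
        # find the best-matching prediction of the other detector
        match = max(preds_b, key=lambda pb: iou(pa.box, pb.box), default=None)
        if match is None or iou(pa.box, match.box) < iou_thr:
            # no counterpart: keep the box only if it is confident on its own
            if pa.score >= score_thr:
                pseudo.append(pa)
        elif pa.label == match.label:
            pseudo.append(pa)                 # agreement -> keep
        else:
            # disagreement -> trust the more confident detector's label
            pseudo.append(pa if pa.score >= match.score else match)
    return pseudo

# Example: the two detectors disagree on the class of roughly the same box.
a = [Pred((10, 10, 50, 50), label=1, score=0.9)]
b = [Pred((12, 11, 52, 49), label=2, score=0.6)]
print(cross_rectify(a, b))   # keeps label=1, the more confident detector's class
```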

vadimkantorov commented 2 years ago

Thanks for your comments!

Regarding augmentations, do you have a feeling for how much improvement can be obtained just by using SoftTeacher's augmentations? (i.e., UBT with SoftTeacher's augmentations, but without SoftTeacher's jitter-consistency-based filtering)

A separate question: I'm trying to make a teacher-student framework work with a Deformable-DETR-based detector architecture. Based on your experience, do you think it should work?
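(For context, the teacher-student framework discussed here centers on a Mean Teacher-style EMA update, sketched below. The function name and momentum value are illustrative, not taken from this repository or SoftTeacher; the update is architecture-agnostic, which is why it applies equally to a Faster R-CNN or a Deformable-DETR student.)

```python
# Minimal sketch of a Mean Teacher-style EMA update; names and the default
# momentum are assumptions, not this repository's code.
import torch

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module,
               momentum: float = 0.999) -> None:
    """teacher = momentum * teacher + (1 - momentum) * student, parameter-wise.

    Assumes teacher and student share the same architecture, so their
    parameters align one-to-one.
    """
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)
```

In practice the teacher's buffers (e.g., BatchNorm statistics) are usually copied or EMA-updated as well.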

machengcheng2016 commented 2 years ago

I remember the authors of SoftTeacher did an ablation study on augmentations in their paper; please check it out. As for the second question, I think only changing the backbone is not adequate for publishing a top-conference paper.

vadimkantorov commented 2 years ago

> I think only changing the backbone is not adequate for publishing a top-conference paper.

It's not the main contribution :) I am wondering how much teacher-student improvements depend on convergence speed; Deformable DETR models still converge more slowly than Faster R-CNN. Our early results are not very good, so I'm debugging and wondering whether you have tried any transformer-based detector architectures as well.

> I remember the authors of SoftTeacher did an ablation study on augmentations in their paper; please check it out.

Thanks! Will check that!