JCZ404 / Semi-DETR

[CVPR 2023] Official implementation of the paper "Semi-DETR: Semi-Supervised Object Detection with Detection Transformers"
https://arxiv.org/abs/2307.08095
MIT License
72 stars, 9 forks

Cost-based Pseudo Label Mining & cross-view consistency learning #8

Closed: SISTMrL closed this issue 8 months ago

SISTMrL commented 1 year ago

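Hello author, I don't quite understand how the Cost-based Pseudo Label Mining module actually works. Why does applying a Gaussian mixture model make it possible to pick out good pseudo labels, and what is the mathematical principle behind it?

Also, for consistency learning you generate cross-view queries, but they receive only the consistency supervision, while the standard queries are still decoded as usual and supervised with the pseudo boxes. In addition, the cross-view queries and the standard queries are separated by a mask, so I can't see how the cross-view queries help the standard queries learn consistency information.

Looking forward to your reply, thank you!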

JCZ404 commented 1 year ago

Hello, and thank you for your interest in our work.

(1) The primary objective of our consistency approach is to make the query-based decoder better at aggregating the relevant image features for a given input query, rather than to help the standard queries learn consistency features. Our scheme is similar in spirit to DINO. DINO focuses on making decoding more robust (i.e., reducing the variation in query-matching results across decoder layers). To achieve this, DINO introduces additional denoising queries and requires them to produce predictions consistent with the noise-free targets, which essentially makes them a form of consistency query. Unlike DINO, we force the decoded cross-view queries, which take differently augmented features of the same object as their prior, to be consistent with each other. Consequently, after consistency training, the decoder becomes more proficient at searching for and aggregating feature semantics that closely resemble the input query.

(2) Cost-based pseudo-label mining serves as a mechanism for selecting appropriate query pairs on which to compute the consistency loss. This selection is crucial because, after augmentation, some objects may become hard to detect (e.g., due to shear augmentation), and it would be incorrect to impose the consistency loss on such query pairs. These objects exhibit a high matching cost, so they can be filtered out effectively with the GMM-based threshold.

I hope this clarifies the key distinctions and motivations behind our approach. If you have any further questions or need additional details, please feel free to ask.
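To make the GMM-based filtering in (2) more concrete, here is a minimal sketch of the general idea, not the repository's actual code. It assumes the matching costs of candidate pairs are roughly bimodal (a low-cost group of reliable pairs and a high-cost group of unreliable ones), fits a two-component 1-D Gaussian mixture with plain EM, and keeps the pairs that are more likely to belong to the low-cost component. The function names `gaussian_pdf`, `fit_two_component_gmm` and `select_low_cost_pairs` and the example cost values are all hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_gmm(values, iters=50, var_floor=1e-6):
    """Fit a 1-D, two-component Gaussian mixture with plain EM.

    Expects at least two values. Returns (weights, means, variances),
    each a list of length 2.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    # Initialise one component on the lower half of the data, one on the upper half.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    grand_mean = sum(xs) / n
    var0 = max(sum((x - grand_mean) ** 2 for x in xs) / n, var_floor)
    variances = [var0, var0]
    weights = [0.5, 0.5]

    for _ in range(iters):
        # E-step: posterior probability of each component for each value.
        resp = []
        for x in values:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p] if total > 0 else [0.5, 0.5])
        # M-step: re-estimate the weight, mean and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp) + 1e-12
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk,
                var_floor,
            )
    return weights, means, variances


def select_low_cost_pairs(costs):
    """Return one boolean per cost: True if the low-cost component is more likely."""
    weights, means, variances = fit_two_component_gmm(costs)
    low = 0 if means[0] <= means[1] else 1
    high = 1 - low
    keep = []
    for c in costs:
        p_low = weights[low] * gaussian_pdf(c, means[low], variances[low])
        p_high = weights[high] * gaussian_pdf(c, means[high], variances[high])
        keep.append(p_low > p_high)
    return keep


# Hypothetical matching costs for ten candidate pairs (illustrative values only).
example_costs = [0.10, 0.12, 0.15, 0.11, 0.13, 0.90, 0.95, 1.05, 0.14, 1.00]
print(select_low_cost_pairs(example_costs))
```

The point of fitting a mixture instead of hard-coding a cutoff is that the decision boundary adapts to the cost distribution of the current batch or training stage. In a real pipeline the resulting mask would decide which pairs contribute to the consistency loss; how costs are grouped before fitting (per image, per class, per batch) is left open here.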

SISTMrL commented 1 year ago

Thank you for the reply! Could I add you on WeChat so it's easier to ask follow-up questions? My WeChat ID: 15256956711