Cc-Hy / CMKD

Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection (ECCV 2022 Oral)
Apache License 2.0

On the use of 'gt_mask' #33

Closed carlvinson70 closed 1 year ago

carlvinson70 commented 1 year ago

Hi, could you please explain how 'gt_mask' is meant to be used? It appears in the cmkd.py script but doesn't seem to be used during training. Many thanks.

Cc-Hy commented 1 year ago

Hi, @carlvinson70 The gt_mask was used in an earlier version of the framework, and we removed it later because we assume the framework is trained with unlabeled data, where ground-truth labels are not available. If you want to use it, you can only apply it to the labeled samples. Specifically, you can generate a binary mask in BEV space indicating whether each BEV position is occupied by a foreground object, and then apply a larger loss weight at those positions. As I remember, though, this did not improve performance; you can give it a try if you are interested.
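For anyone wanting to try this, here is a minimal sketch of what the described foreground mask and weighting could look like. The function names, grid/voxel sizes, and the axis-aligned (rotation-free) box handling are all illustrative assumptions, not code from this repo:

```python
import numpy as np

def bev_foreground_mask(gt_boxes_bev, grid_size=(200, 176),
                        voxel_size=(0.4, 0.4), pc_range_min=(0.0, -40.0)):
    """Binary BEV mask marking cells covered by ground-truth boxes.

    gt_boxes_bev: (N, 4) array of axis-aligned BEV boxes [x1, y1, x2, y2]
    in metric coordinates (box rotation is ignored in this sketch).
    """
    mask = np.zeros(grid_size, dtype=np.float32)
    for x1, y1, x2, y2 in gt_boxes_bev:
        # convert metric coordinates to BEV grid indices
        ix1 = max(int((x1 - pc_range_min[0]) / voxel_size[0]), 0)
        ix2 = min(int(np.ceil((x2 - pc_range_min[0]) / voxel_size[0])), grid_size[0])
        iy1 = max(int((y1 - pc_range_min[1]) / voxel_size[1]), 0)
        iy2 = min(int(np.ceil((y2 - pc_range_min[1]) / voxel_size[1])), grid_size[1])
        mask[ix1:ix2, iy1:iy2] = 1.0
    return mask

def weighted_feature_loss(per_cell_loss, fg_mask, fg_weight=2.0):
    """Upweight the per-cell distillation loss at foreground BEV positions."""
    weights = 1.0 + (fg_weight - 1.0) * fg_mask
    return (per_cell_loss * weights).mean()
```

Since the mask is built from ground-truth boxes, this term can only be computed on the labeled subset of the training data, which is exactly the limitation mentioned above.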

carlvinson70 commented 1 year ago

Thanks for the info. @Cc-Hy

Another quick question: is the IoU confidence score weighting (i.e., 's' in Equations 4 and 5) implemented in the current code?

Cc-Hy commented 1 year ago

@carlvinson70 Hi, 's' is implemented for the regression loss, but not the classification loss in the current code. See this line.

carlvinson70 commented 1 year ago

> @carlvinson70 Hi, 's' is implemented for the regression loss, but not the classification loss in the current code. See this line.

From what I understand of that line, it uses the sum of the "logits" over all 3 classes as the confidence score? That seems different from the description in the paper, if I understand it correctly. And how is this confidence IoU-related?

Cc-Hy commented 1 year ago

's' here is the predicted classification confidence score of the LiDAR-based teacher model. When training the teacher, we use quality labels such as IoU or centerness as the classification ground truth, so the confidence scores the teacher predicts for the soft labels are IoU-related or centerness-related, and we use them to weight the losses.
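Putting the two answers together, a minimal sketch of this weighting could look like the following. The function names and the smooth-L1 regression form are illustrative assumptions; per the discussion above, the per-anchor weight is the sum of the teacher's class scores:

```python
import numpy as np

def soft_label_weights(teacher_cls_logits):
    """Per-anchor weight s: sum of the teacher's sigmoid class scores.

    Because the teacher was trained with quality targets (IoU / centerness)
    as classification ground truth, these scores act as quality estimates.
    """
    scores = 1.0 / (1.0 + np.exp(-teacher_cls_logits))  # sigmoid per class
    return scores.sum(axis=-1)

def weighted_reg_loss(student_reg, teacher_reg, weights):
    """Smooth-L1-style regression distillation, weighted per anchor by s."""
    diff = np.abs(student_reg - teacher_reg)
    l1 = np.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5)
    # normalize by the total weight so low-quality anchors contribute less
    return (weights[:, None] * l1).sum() / (weights.sum() + 1e-6)
```

Anchors where the teacher is confident (high predicted quality) thus dominate the regression distillation loss, while uncertain soft labels are downweighted.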