Cc-Hy / CMKD

Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection (ECCV 2022 Oral)
Apache License 2.0

Training process #14

Closed ksh11023 closed 1 year ago

ksh11023 commented 1 year ago

Hello,

I want to ask some details about the training process.

CMKD uses pretrained SECOND-net for the Teacher Network.

When training CMKD's student network (CMKD-Mono): first, you use the ~bev.yaml file to train the model with the feature distillation loss.

Second, you use the ~V2.yaml file to train the model with the feature distillation loss + detection loss. Is that correct?

+) Also, do you freeze the teacher network (SECOND) and update only the student network (CMKD-Mono)?

Thank you!

Cc-Hy commented 1 year ago

Hi, for the first question, here is a subsection in our paper explaining this, which you may take a look at:

(image: screenshot of the relevant subsection from the paper)

Note that in this setting we load the weights from the teacher model, so it only needs to be fine-tuned, and 20 epochs are enough. For the second question, the answer is yes. We fix the teacher model as the feature extractor during training.
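
For anyone else reading along, the setup described above (frozen teacher, student updated with a distillation loss, detection loss added in the second stage) can be sketched roughly as below. This is not the repo's actual code; the module and function names are illustrative stand-ins, and the real CMKD pipeline is driven by the yaml configs.

```python
# Hedged sketch of the training scheme described in this thread:
# a frozen LiDAR teacher (SECOND) provides target BEV features, and
# only the camera student (CMKD-Mono) is updated. All names here are
# illustrative, not the repository's real classes.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyBEVNet(nn.Module):
    """Toy stand-in for a BEV feature extractor (teacher or student)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)

teacher = TinyBEVNet()  # pretrained SECOND in the real setup
student = TinyBEVNet()  # CMKD-Mono in the real setup

# Freeze the teacher: it acts only as a fixed feature extractor.
teacher.eval()
for p in teacher.parameters():
    p.requires_grad = False

opt = torch.optim.Adam(student.parameters(), lr=1e-3)

def train_step(student_in, teacher_in, det_loss=None):
    """One update of the student. Stage 1 (~bev.yaml): distillation
    loss only. Stage 2 (~V2.yaml): pass a detection loss as well."""
    with torch.no_grad():
        t_feat = teacher(teacher_in)      # target BEV features
    s_feat = student(student_in)          # student BEV features
    loss = F.mse_loss(s_feat, t_feat)     # feature distillation loss
    if det_loss is not None:              # stage 2 adds detection loss
        loss = loss + det_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

x = torch.randn(2, 3, 16, 16)
stage1_loss = train_step(x, x)                              # stage 1
stage2_loss = train_step(x, x, det_loss=torch.tensor(0.0))  # stage 2
```

The key point from the reply is captured by the `requires_grad = False` loop plus `torch.no_grad()` around the teacher forward pass: the teacher's weights never receive gradients, so only the student is updated.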