megvii-research / mdistiller

The official implementation of [CVPR2022] Decoupled Knowledge Distillation https://arxiv.org/abs/2203.08679 and [ICCV2023] DOT: A Distillation-Oriented Trainer https://openaccess.thecvf.com/content/ICCV2023/papers/Zhao_DOT_A_Distillation-Oriented_Trainer_ICCV_2023_paper.pdf

Can DKD be applied to other domains? #17

Closed DonMuv closed 2 years ago

DonMuv commented 2 years ago

DKD appears to distill the logits produced by the fully connected layer. Can it be applied to a network that has no fully connected layer? It looks like a channel-alignment operation would be needed in that case, is that right?
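
For context, a minimal sketch of why DKD operates on class logits rather than feature maps is shown below. This condenses the paper's formulation (target-class KD plus non-target-class KD); the default values of `alpha`, `beta`, and `temperature` here are illustrative, not necessarily the repo's exact settings.

```python
import torch
import torch.nn.functional as F

def dkd_loss(logits_student, logits_teacher, target,
             alpha=1.0, beta=8.0, temperature=4.0):
    # One-hot mask of the ground-truth class and its complement
    gt_mask = torch.zeros_like(logits_student).scatter_(1, target.unsqueeze(1), 1.0)
    other_mask = 1.0 - gt_mask

    # TCKD: KL divergence on the binary (target vs. non-target) probabilities
    p_s = F.softmax(logits_student / temperature, dim=1)
    p_t = F.softmax(logits_teacher / temperature, dim=1)
    p_s_bin = torch.stack([(p_s * gt_mask).sum(1), (p_s * other_mask).sum(1)], dim=1)
    p_t_bin = torch.stack([(p_t * gt_mask).sum(1), (p_t * other_mask).sum(1)], dim=1)
    tckd = F.kl_div(torch.log(p_s_bin + 1e-8), p_t_bin,
                    reduction="batchmean") * temperature ** 2

    # NCKD: KL divergence over non-target classes (target logit masked out)
    log_p_s_nc = F.log_softmax(logits_student / temperature - 1000.0 * gt_mask, dim=1)
    p_t_nc = F.softmax(logits_teacher / temperature - 1000.0 * gt_mask, dim=1)
    nckd = F.kl_div(log_p_s_nc, p_t_nc,
                    reduction="batchmean") * temperature ** 2

    return alpha * tckd + beta * nckd
```

Both terms are defined over per-class probabilities, so the loss needs some classification head that outputs logits; it does not consume intermediate feature maps directly.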

Zzzzz1 commented 2 years ago

Do you mean feature distillation? Feature-distillation methods are not a good fit for DKD. However, you could try using CAM (or similar techniques) to determine the channel attention corresponding to each class, and use that to help improve the results.
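
A hypothetical sketch of that suggestion is below; it is not part of mdistiller. It assumes a teacher that still has a linear classifier whose weights define CAM-style per-class channel importance, and that student and teacher features are already channel-aligned (e.g. via a 1x1 conv connector); the function names are made up for illustration.

```python
import torch
import torch.nn.functional as F

def class_channel_attention(teacher_fc_weight, target):
    """CAM-style channel attention for each sample's ground-truth class.

    teacher_fc_weight: (num_classes, C) weights of the teacher's classifier.
    target: (B,) ground-truth labels.
    Returns (B, C) attention over the C feature channels.
    """
    w = teacher_fc_weight[target]      # (B, C): weights of each sample's class
    return F.softmax(w, dim=1)         # normalize into a channel attention

def attention_weighted_feature_kd(feat_student, feat_teacher, attention):
    """Feature-distillation loss with channels re-weighted by the
    CAM-derived attention (a speculative combination, not DKD itself)."""
    # feat_*: (B, C, H, W) aligned feature maps; attention: (B, C)
    a = attention.unsqueeze(-1).unsqueeze(-1)          # (B, C, 1, 1)
    return F.mse_loss(feat_student * a, feat_teacher * a)
```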