megvii-research / mdistiller

The official implementation of [CVPR2022] Decoupled Knowledge Distillation https://arxiv.org/abs/2203.08679 and [ICCV2023] DOT: A Distillation-Oriented Trainer https://openaccess.thecvf.com/content/ICCV2023/papers/Zhao_DOT_A_Distillation-Oriented_Trainer_ICCV_2023_paper.pdf

Failed to reproduce DKD result with ResNet32x4 / ResNet8x4 on CIFAR100 #40

Closed cpsu00 closed 1 year ago

cpsu00 commented 1 year ago

Hello, I tried to use the default DKD settings (configs/cifar100/dkd/res32x4_res8x4.yaml) to reproduce the 76.32% accuracy reported in the paper, but I was only able to get 75.8–76.19%, as shown in the image below. Is there any possible reason for this? Thanks.

[image: screenshot of the reproduced accuracies]
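(For reference, spreads of a few tenths of a point like this usually come down to run-to-run randomness. A minimal seeding sketch that makes runs more comparable; this is generic PyTorch, not part of mdistiller's own CLI:)

```python
import random
import numpy as np
import torch

def set_seed(seed: int = 0):
    # Seed every RNG that affects training so repeated runs are comparable.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade some speed for deterministic cuDNN convolution kernels.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(0)
```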
Zzzzz1 commented 1 year ago

See issue #7

yjq404 commented 1 year ago

@Zzzzz1 Hello, I would like to add DKD to the single-stage YOLOv7. When computing dkd_loss, I need a logit_student (the student's per-class confidence) and a logit_teacher (the teacher's per-class confidence) that correspond to the number of target classes. Should I use the logits from pred = model(image), or the confidences after NMS? If the former, the number of predictions doesn't match the number of targets; if the latter, there is no post-processed logit_teacher. I'd appreciate your guidance. Also, do you think porting DKD to v7 is worthwhile? Wishing you success in your research.

Zzzzz1 commented 1 year ago

DKD has also been verified to be effective for detection distillation, so it can be used in YOLOv7. In the DKD paper we distill the proposals before NMS. The same applies to single-stage detectors like YOLO: use the teacher's pre-NMS logits to distill the student's pre-NMS logits.
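(To make the shapes concrete, here is a minimal sketch, assuming a PyTorch detector whose head exposes pre-NMS per-anchor class logits. The dkd_loss below follows the paper's TCKD/NCKD decomposition; the detection wrapper — dkd_on_pre_nms_logits, fg_mask, assigned_cls, and the idea of reusing the label assigner's matches — is a hypothetical adaptation, not code from this repo:)

```python
import torch
import torch.nn.functional as F

def dkd_loss(logits_student, logits_teacher, target, alpha, beta, temperature):
    # Decoupled KD (CVPR 2022): TCKD on the binary target/non-target split,
    # NCKD on the distribution over non-target classes only.
    # logits_*: (N, num_classes), target: (N,) class indices.
    gt_mask = torch.zeros_like(logits_student).scatter_(
        1, target.unsqueeze(1), 1
    ).bool()
    t = temperature
    p_s = F.softmax(logits_student / t, dim=1)
    p_t = F.softmax(logits_teacher / t, dim=1)
    # Binary probabilities (p_target, p_non_target) for TCKD.
    b_s = torch.stack([(p_s * gt_mask).sum(1), (p_s * ~gt_mask).sum(1)], dim=1)
    b_t = torch.stack([(p_t * gt_mask).sum(1), (p_t * ~gt_mask).sum(1)], dim=1)
    tckd = F.kl_div(b_s.log(), b_t, reduction="batchmean") * (t ** 2)
    # NCKD: mask out the target logit before softmax so only the
    # non-target distribution is matched.
    log_p_s_nt = F.log_softmax(logits_student / t - 1000.0 * gt_mask, dim=1)
    p_t_nt = F.softmax(logits_teacher / t - 1000.0 * gt_mask, dim=1)
    nckd = F.kl_div(log_p_s_nt, p_t_nt, reduction="batchmean") * (t ** 2)
    return alpha * tckd + beta * nckd

def dkd_on_pre_nms_logits(s_cls_logits, t_cls_logits, assigned_cls, fg_mask):
    # s_cls_logits, t_cls_logits: (num_anchors, num_classes) pre-NMS logits,
    # flattened over all grid cells / anchor boxes of both models.
    # assigned_cls: (num_anchors,) class index produced by the label assigner.
    # fg_mask: (num_anchors,) bool, True where an anchor matched a GT box.
    # alpha/beta/T below are the CIFAR-100 config values; they would need
    # retuning for detection.
    return dkd_loss(
        s_cls_logits[fg_mask], t_cls_logits[fg_mask], assigned_cls[fg_mask],
        alpha=1.0, beta=8.0, temperature=4.0,
    )
```

(The key point is that the "targets" here are the assigner's per-anchor class labels, not the ground-truth box list itself, so the counts line up by construction.)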

yjq404 commented 1 year ago

@Zzzzz1 Thanks for your reply. If I use the pre-NMS logits, then when separating target and non-target classes, the number of confidence groups YOLO predicts is far larger than the number of classes in the targets. I'm still learning; do you have any suggestions?

Zzzzz1 commented 1 year ago

I don't quite follow; what do you mean by the "number of confidence groups"?