Open vq12 opened 2 months ago
Hi, train_CSAKD.py is the default implementation of our CSAKD. CSAKD uses a feature map distillation loss to align the feature maps of the teacher and student models, so it is trained as online knowledge distillation, and it achieves better results.
This repository needs more maintenance. I will add more detailed information in a few days. Thanks!
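For readers who want a concrete picture of the idea, here is a minimal sketch. It is not the code from train_CSAKD.py. It assumes the feature map loss is a plain mean squared error over flattened feature maps, and it applies gradient steps directly to the features as a toy stand-in for model weights. The names `feature_distillation_loss` and `distillation_step`, and the `online` flag, are hypothetical. Letting the teacher receive gradient from this loss in the online case is also an assumption of the sketch.

```python
def feature_distillation_loss(student_feat, teacher_feat):
    """Mean squared error between two flattened feature maps (assumed loss form)."""
    if len(student_feat) != len(teacher_feat) or not student_feat:
        raise ValueError("feature maps must be non-empty and the same length")
    n = len(student_feat)
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / n


def distillation_step(student_feat, teacher_feat, lr=0.1, online=True):
    """One gradient step on the feature loss.

    online=True  -> teacher and student are both updated in the same step.
    online=False -> the teacher is frozen and only the student moves.
    """
    n = len(student_feat)
    # Gradient of the MSE loss with respect to the student features.
    grads = [2.0 * (s - t) / n for s, t in zip(student_feat, teacher_feat)]
    new_student = [s - lr * g for s, g in zip(student_feat, grads)]
    if online:
        # The gradient with respect to the teacher is the negative of the student's.
        new_teacher = [t + lr * g for t, g in zip(teacher_feat, grads)]
    else:
        new_teacher = list(teacher_feat)
    return new_student, new_teacher


# Toy usage: a student at 0.0 and a teacher at 1.0 on every feature.
student, teacher = [0.0] * 6, [1.0] * 6
print(feature_distillation_loss(student, teacher))  # 1.0
student, teacher = distillation_step(student, teacher, online=True)
print(feature_distillation_loss(student, teacher))  # smaller than 1.0
```

In a real training loop the same loss would be computed on intermediate activations of the two networks and added to each model's task loss; the sketch only shows how the alignment term behaves.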
Hello, I noticed there are two training scripts in the repository: train_CSAKD.py and train_CSAKD_offline_teacher_student.py. Could you please clarify the differences between these two scripts? I'm curious to know which training method is utilized in your implementation. Any insights would be greatly appreciated. Thank you!