[Open] LiuDongyang6 opened this issue 2 years ago
Thank you for the nice work! I wonder if you have tried to use ReviewKD loss and KL-divergence loss together? Will the combination further improve the performance? If yes, would you like to share the results or the hyperparameters?
Sorry for the late reply. I didn't try the KL loss. In my opinion, KL loss is good at handling 1-D logits rather than large 2-D feature maps.
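For anyone who wants to experiment with the combination anyway, here is a minimal sketch in PyTorch of how a logit-level KL distillation term could be added alongside a feature-level loss such as ReviewKD's. The weighting scheme (`lambda_review`, `lambda_kl`) and the temperature `T` are hypothetical hyperparameters, not values from the paper, and `reviewkd_loss` is assumed to come from the repo's own HCL/ABF implementation:

```python
import torch
import torch.nn.functional as F

def kd_kl_loss(student_logits: torch.Tensor,
               teacher_logits: torch.Tensor,
               T: float = 4.0) -> torch.Tensor:
    """Standard Hinton-style KL distillation on 1-D logits."""
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    # Multiply by T^2 to keep gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

def combined_loss(ce_loss: torch.Tensor,
                  reviewkd_loss: torch.Tensor,
                  kl_loss: torch.Tensor,
                  lambda_review: float = 1.0,
                  lambda_kl: float = 1.0) -> torch.Tensor:
    """Hypothetical weighted sum of cross-entropy, feature KD, and logit KD."""
    return ce_loss + lambda_review * reviewkd_loss + lambda_kl * kl_loss
```

Sanity check: when student and teacher logits are identical, `kd_kl_loss` should be (numerically) zero, so the KL term only penalizes disagreement at the logit level while the feature loss handles the 2-D maps.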