GzyAftermath / CAT-KD

CVPR 2023, Class Attention Transfer Based Knowledge Distillation

Beta Values #2

Open m-parchami opened 1 year ago

m-parchami commented 1 year ago

Hi,

Thanks for sharing the code and great work!

I have a question regarding the beta values (the coefficient for the CAT loss). The README states that they are not optimized and that one should tune them. However, in the configs I'm finding very different and seemingly precise betas, e.g. in the CAT_KD dir:

- wrn_40_2_wrn_40_1.yaml: 1.5
- wrn_40_2_wrn_16_2.yaml: 12
- resnet32x4_shuv2.yaml: 600
- ...
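For context, the beta coefficient presumably sits in those YAML configs roughly like this (the key names below are illustrative guesses, not necessarily the repo's exact schema):

```yaml
# Hypothetical sketch of a config such as wrn_40_2_wrn_40_1.yaml;
# the actual key names in the CAT-KD repo may differ.
DISTILLER:
  TYPE: CAT_KD
CAT_KD:
  LOSS:
    BETA: 1.5   # coefficient on the CAT loss (12, 600, ... in other configs)
```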

Could you please clarify whether these values were optimized on some data, or whether they are derived as a function of the settings?

In the former case, it would be great if you could also mention what data was used :)

Thanks a lot! Great work!

GzyAftermath commented 1 year ago

Hi, we optimized the value of beta carefully in the CAT-KD experiments, since we needed to compare performance against previous works. The README statement means that beta is not tuned in the CAT experiments, which are only used for the ablation study.