Can Adaptive-KD be used together with attentive sampling?

facebookresearch / AlphaNet

AlphaNet: Improved Training of Supernets with Alpha-Divergence

Can Adaptive-KD be used together with attentive sampling? #7

Closed. DehuaTang closed this issue 3 years ago.

DehuaTang commented 3 years ago

Hello,

Can Adaptive-KD be used together with attentive sampling? Are these two methods orthogonal?

Thank you

dilinwang820 commented 3 years ago

Yes, they're orthogonal techniques. In this work, we found that Adaptive-KD alone already gives very promising results, so for the sake of simplicity we did not explore adding attentive sampling.
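
For illustration only, here is a minimal pure-Python sketch of why the two can be combined: attentive sampling decides which sub-network is trained at each step, while Adaptive-KD decides which loss is applied to it. This is not the AlphaNet repository's code. It assumes the adaptive loss is the larger of two alpha-divergences between the supernet (teacher) and sub-network (student) outputs, and it omits any clipping of the ratio p/q. The function names, the toy search space, the toy predictor, and the alpha values are all hypothetical.

```python
import math
import random


def alpha_divergence(p, q, alpha):
    """Alpha-divergence D_alpha(p || q) between two discrete distributions.

    Uses D_alpha = 1 / (alpha * (alpha - 1)) * sum_i q_i * ((p_i / q_i) ** alpha - 1),
    with the KL limits at alpha = 1 and alpha = 0. Assumes all entries are > 0.
    """
    if alpha == 1.0:  # limit: KL(p || q)
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    if alpha == 0.0:  # limit: KL(q || p)
        return sum(qi * math.log(qi / pi) for pi, qi in zip(p, q))
    total = sum(qi * ((pi / qi) ** alpha - 1.0) for pi, qi in zip(p, q))
    return total / (alpha * (alpha - 1.0))


def adaptive_kd_loss(teacher_probs, student_probs, alpha_minus=-1.0, alpha_plus=1.0):
    """Adaptive KD (as assumed here): the larger of two alpha-divergences.

    The alpha values are illustrative defaults, not taken from the repository.
    """
    return max(
        alpha_divergence(teacher_probs, student_probs, alpha_minus),
        alpha_divergence(teacher_probs, student_probs, alpha_plus),
    )


def attentive_sample(sample_config, predict_score, k=4, mode="best"):
    """Draw k candidate sub-network configs and keep the best (or worst) one
    according to a hypothetical performance predictor."""
    candidates = [sample_config() for _ in range(k)]
    pick = max if mode == "best" else min
    return pick(candidates, key=predict_score)


rng = random.Random(0)


def sample_config():
    # Toy search space: a sub-network is just a (depth, width) pair.
    return (rng.choice([2, 3, 4]), rng.choice([16, 24, 32]))


def predict_score(cfg):
    # Toy stand-in for a learned accuracy predictor.
    return cfg[0] * cfg[1]


# One "training step": the sampler picks the sub-network, the KD loss scores it.
config = attentive_sample(sample_config, predict_score, k=4, mode="best")
teacher = [0.7, 0.2, 0.1]  # stand-in for the supernet's softmax output
student = [0.5, 0.3, 0.2]  # stand-in for the sampled sub-network's output
loss = adaptive_kd_loss(teacher, student)
```

Neither function depends on the other: swapping `attentive_sample` for uniform sampling, or `adaptive_kd_loss` for a plain KL loss, leaves the rest of the step unchanged.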

DehuaTang commented 3 years ago

Nice work! Thank you for your reply. The Adaptive-KD loss works well in my task!

fanliaveline commented 3 years ago

> Nice work! Thank you for your reply. The Adaptive-KD loss works well in my task!

Hello! Did you use the Adaptive-KD loss with attentive sampling in your task?