Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer

We show that activation-based attention transfer gives greater improvements than full activation transfer, and that it can be combined with knowledge distillation.
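As a rough illustration of what activation-based attention transfer combined with knowledge distillation looks like, here is a minimal PyTorch sketch. It assumes p = 2 attention maps (squared activations averaged over channels, then L2-normalized per sample) and a combined objective of cross-entropy, a temperature-scaled distillation term, and an attention-transfer penalty over matched layer pairs. The helper names and the hyperparameter values `beta`, `T`, and `alpha` are illustrative assumptions, not the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def attention_map(a: torch.Tensor) -> torch.Tensor:
    # Collapse (N, C, H, W) activations into an L2-normalized spatial
    # attention map by averaging squared values over the channel axis.
    return F.normalize(a.pow(2).mean(dim=1).flatten(start_dim=1), dim=1)

def at_loss(student_a: torch.Tensor, teacher_a: torch.Tensor) -> torch.Tensor:
    # Squared distance between student and teacher attention maps for
    # one matched layer pair (spatial sizes are assumed to agree).
    return (attention_map(student_a) - attention_map(teacher_a)).pow(2).mean()

def total_loss(logits_s, logits_t, targets, acts_s, acts_t,
               beta=1e3, T=4.0, alpha=0.9):
    # Hypothetical combined objective: cross-entropy on hard labels,
    # a temperature-scaled KD term, and attention-transfer penalties
    # summed over matched student/teacher activation pairs.
    ce = F.cross_entropy(logits_s, targets)
    kd = F.kl_div(F.log_softmax(logits_s / T, dim=1),
                  F.softmax(logits_t / T, dim=1),
                  reduction="batchmean") * (T * T)
    at = sum(at_loss(s, t) for s, t in zip(acts_s, acts_t))
    return (1 - alpha) * ce + alpha * kd + beta * at
```

Normalizing each attention map before comparison makes the penalty invariant to the overall activation scale of either network, so the student is pushed to match where the teacher looks rather than how strongly it activates.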