sseung0703 / KD_methods_with_TF

Knowledge distillation methods implemented with TensorFlow (currently 11 (+1) methods; more will be added).
MIT License
266 stars 61 forks

Can you consider adding the method of attention transfer and neuron-selectivity-transfer? #1

Closed: Xiaocong6 closed this issue 5 years ago

Xiaocong6 commented 5 years ago

Thanks a lot for the code! I read some papers about distillation and noticed that these two methods are not compared in your code. Could you consider adding them? My programming skills are poor, so I still need to learn from you.

- attention transfer: https://arxiv.org/abs/1612.03928
- neuron-selectivity-transfer: https://arxiv.org/abs/1707.01219

In addition, I think it would be better to specify the structures of the student network and the teacher network in the table. Thanks again.
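For reference, the activation-based attention-transfer loss from the first linked paper can be sketched in TensorFlow as below. This is a minimal illustration of the paper's formulation, not this repository's implementation; it assumes NHWC feature maps and matching spatial sizes (channel counts may differ):

```python
import tensorflow as tf

def attention_map(feat, p=2):
    """Spatial attention map: sum of |activation|^p over channels, then L2-normalized."""
    am = tf.reduce_sum(tf.pow(tf.abs(feat), p), axis=-1)   # [B, H, W]
    am = tf.reshape(am, [tf.shape(am)[0], -1])             # flatten to [B, H*W]
    return tf.nn.l2_normalize(am, axis=-1)                 # per-sample L2 normalization

def at_loss(student_feat, teacher_feat, p=2):
    """Attention-transfer loss for one student/teacher feature-map pair."""
    qs = attention_map(student_feat, p)
    qt = attention_map(teacher_feat, p)
    # squared L2 distance between normalized attention maps, averaged over the batch
    return tf.reduce_mean(tf.reduce_sum(tf.square(qs - qt), axis=-1))
```

In practice this loss is computed at several depths of the networks and summed, weighted against the usual cross-entropy term.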

Xiaocong6 commented 5 years ago

Sorry, I found the network structure; I just hadn't paid attention to the Korean part.

sseung0703 commented 5 years ago

I have a plan for attention transfer, but I'm not sure about neuron-selectivity-transfer. I will translate everything into English soon. :)
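For reference, the neuron-selectivity-transfer loss from the linked paper matches the distributions of per-channel activation patterns between teacher and student via maximum mean discrepancy (MMD). A minimal TensorFlow sketch follows; the degree-2 polynomial kernel is one of the choices discussed in the paper (it also considers linear and Gaussian kernels), and this is not this repository's code:

```python
import tensorflow as tf

def _channel_vectors(feat):
    """Turn an NHWC feature map into [B, C, H*W] with each channel's spatial pattern L2-normalized."""
    s = tf.shape(feat)
    f = tf.transpose(feat, [0, 3, 1, 2])                   # [B, C, H, W]
    f = tf.reshape(f, [s[0], s[3], s[1] * s[2]])           # [B, C, H*W]
    return tf.nn.l2_normalize(f, axis=-1)

def nst_loss(student_feat, teacher_feat):
    """Squared MMD between channel-pattern sets, using k(x, y) = (x . y)^2."""
    fs = _channel_vectors(student_feat)                    # [B, Cs, H*W]
    ft = _channel_vectors(teacher_feat)                    # [B, Ct, H*W]
    kss = tf.reduce_mean(tf.square(tf.matmul(fs, fs, transpose_b=True)), axis=[1, 2])
    ktt = tf.reduce_mean(tf.square(tf.matmul(ft, ft, transpose_b=True)), axis=[1, 2])
    kst = tf.reduce_mean(tf.square(tf.matmul(fs, ft, transpose_b=True)), axis=[1, 2])
    return tf.reduce_mean(kss + ktt - 2.0 * kst)           # MMD^2 per sample, batch-averaged
```

Because MMD compares two *sets* of channel vectors, the student and teacher may have different channel counts, as long as the spatial resolutions (and hence the `H*W` vector lengths) match.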

sseung0703 commented 5 years ago

I added attention transfer. :)

Xiaocong6 commented 5 years ago

Thanks a lot!