szagoruyko / attention-transfer

Improving Convolutional Networks via Attention Transfer (ICLR 2017)
https://arxiv.org/abs/1612.03928

Why not use bn for teacher net in imagenet.py #37

Open cheerss opened 4 years ago

cheerss commented 4 years ago

Thanks for your great work first!

I wonder why you do not use a BN layer when running inference with the teacher model here ( https://github.com/szagoruyko/attention-transfer/blob/master/imagenet.py#L117 )? Is it a typo?

Hope for your reply!

somone23412 commented 4 years ago

I have the same question 0.0

somone23412 commented 4 years ago

Hi, I think I know why there are no BN layers in the teacher structure:

"Folded Models below have batch_norm parameters and statistics folded into convolutional layers for speed. It is not recommended to use them for finetuning."

url here
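For reference, "folding" means pre-multiplying the conv weights by the BN scale and absorbing the BN shift into the conv bias, so that conv+BN collapses into a single conv at inference time. The sketch below is a minimal illustration of that math in PyTorch (the helper name `fold_bn_into_conv` is hypothetical; this is not the repo's actual folding script):

```python
# Minimal sketch of folding BatchNorm statistics into a preceding
# convolution for inference. Not the repo's actual folding code.
import torch
import torch.nn as nn

def fold_bn_into_conv(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Return a Conv2d equivalent to conv followed by bn (eval mode)."""
    fused = nn.Conv2d(conv.in_channels, conv.out_channels,
                      conv.kernel_size, conv.stride,
                      conv.padding, bias=True)
    # scale = gamma / sqrt(running_var + eps), one value per output channel
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.data = conv.weight.data * scale.view(-1, 1, 1, 1)
    bias = (conv.bias.data if conv.bias is not None
            else torch.zeros(conv.out_channels))
    # new bias = (old bias - running_mean) * scale + beta
    fused.bias.data = (bias - bn.running_mean) * scale + bn.bias.data
    return fused

# Sanity check: the fused conv matches conv -> BN in eval mode
conv = nn.Conv2d(3, 8, 3, padding=1, bias=False)
bn = nn.BatchNorm2d(8)
bn.running_mean.uniform_(-1, 1)
bn.running_var.uniform_(0.5, 2)
conv.eval(); bn.eval()
x = torch.randn(1, 3, 16, 16)
fused = fold_bn_into_conv(conv, bn)
print(torch.allclose(bn(conv(x)), fused(x), atol=1e-5))
```

This also explains the warning against finetuning the folded models: once the statistics are baked into the weights, the BN layers are gone, so further training would behave differently from the original network.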