mil-ad / snip

PyTorch implementation of the paper "SNIP: Single-shot Network Pruning based on Connection Sensitivity" by Lee et al.
MIT License
103 stars, 15 forks

Concern about the initialization in the SNIP function #3

Closed Shiweiliuiiiiiii closed 4 years ago

Shiweiliuiiiiiii commented 4 years ago

I noticed that in the SNIP function you copy the model and reinitialize it with "nn.init.xavier_normal_(layer.weight)" before calculating the gradients. So the weights used for pruning are not the same as the weights of the model that is actually trained. Am I right?

mil-ad commented 4 years ago

Yes. This is strange, but it is actually what the SNIP paper does if you read it carefully. There is a long discussion about this in the OpenReview comments.
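
For readers following the thread, here is a minimal sketch of the behaviour being discussed. It is not the repository's exact code. The function name `snip_scores`, the toy model, and the restriction to `nn.Linear` layers are assumptions made for illustration. It copies the network, re-initializes the copy with `nn.init.xavier_normal_`, and scores each connection by the normalized magnitude of the loss gradient with respect to a multiplicative mask of ones.

```python
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F


def snip_scores(net, inputs, targets):
    """Hypothetical helper: normalized |dL/dmask| per layer, on a re-initialized copy."""
    # Work on a copy so the model that will be trained is left untouched.
    net = copy.deepcopy(net)
    layers = [m for m in net.modules() if isinstance(m, nn.Linear)]

    masks = []
    for layer in layers:
        # Fresh Xavier-normal weights: these differ from the training model's weights.
        nn.init.xavier_normal_(layer.weight)
        layer.weight.requires_grad = False
        # Multiplicative mask of ones; its gradient is the sensitivity signal.
        layer.weight_mask = nn.Parameter(torch.ones_like(layer.weight))
        masks.append(layer.weight_mask)
        # Route this layer's forward pass through weight * mask.
        layer.forward = lambda x, l=layer: F.linear(x, l.weight * l.weight_mask, l.bias)

    # One batch, one backward pass with respect to the masks only.
    loss = F.cross_entropy(net(inputs), targets)
    grads = torch.autograd.grad(loss, masks)

    # Normalize so that all scores across layers sum to one.
    total = sum(g.abs().sum() for g in grads)
    return [g.abs() / total for g in grads]


# Toy usage (assumed shapes and data, for illustration only).
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
before = copy.deepcopy(model.state_dict())
x = torch.randn(32, 8)
y = torch.randint(0, 4, (32,))
scores = snip_scores(model, x, y)
```

In this sketch the original `model` keeps its own weights, so the scores come from a different random draw than the one training starts from, which is the point raised in the question.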

Shiweiliuiiiiiii commented 4 years ago

Thanks for your reply. It is true that SNIP can learn architecturally important weights.