aidevmin opened this issue 1 year ago
- In this link, you said that BNScalePruner and GroupNormPruner support sparse training. Does that mean we need to train the pretrained model again for at least 1 epoch, and that this changes the pretrained model's parameters? Is that right?
Yes, it forces some unimportant parameters to be 0.
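For context, here is a minimal sketch of what that sparse-training step can look like, based on the regularize-then-step pattern shown in the repo's examples. The model and data below are placeholders, and names such as `GroupNormImportance` and `ch_sparsity` may differ across Torch-Pruning versions:

```python
import torch
import torch.nn.functional as F
import torch_pruning as tp
from torchvision.models import resnet18

model = resnet18()                                  # stand-in for a pretrained model
example_inputs = torch.randn(1, 3, 224, 224)

pruner = tp.pruner.GroupNormPruner(
    model,
    example_inputs,
    importance=tp.importance.GroupNormImportance(p=2),
    ch_sparsity=0.5,                                # target channel sparsity
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Sparse training: even one epoch updates (and therefore changes) the pretrained
# weights, pushing unimportant grouped parameters toward 0.
dummy_batch = (torch.randn(4, 3, 224, 224), torch.randint(0, 1000, (4,)))
for inputs, targets in [dummy_batch]:               # replace with a real train loader
    optimizer.zero_grad()
    loss = F.cross_entropy(model(inputs), targets)
    loss.backward()
    pruner.regularize(model)                        # adds the sparsity gradient
    optimizer.step()

pruner.step()                                       # remove the sparsified groups afterwards
```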
- In this benchmark table https://github.com/VainF/Torch-Pruning/tree/master/benchmarks, I saw some methods implemented by you, such as Group-L1, Group-BN, Group-GReg, Ours w/o SL, and Ours. As I understand it, all of the above methods estimate the importance of parameters.
Yes.
- Are all the pruners in your repo group-level? I am confused because when I read the code, for example, for group-level L1 you used tp.pruner.MagnitudePruner and for group-level BN you used tp.pruner.BNScalePruner; these two pruner names do not contain "Group". But for the group-level Group pruner you used tp.pruner.GroupNormPruner, which does have "Group" in the API name. Please correct me if I am wrong.
Yes, all pruners are able to estimate group importance and remove grouped parameters by default.
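To illustrate the group-level behaviour, here is a minimal one-shot sketch with MagnitudePruner, loosely following the repo's quick-start; `ch_sparsity` and `ignored_layers` are assumed argument names and may vary by version:

```python
import torch
import torch_pruning as tp
from torchvision.models import resnet18

model = resnet18()
example_inputs = torch.randn(1, 3, 224, 224)

# Group-level L1: the pruner traces a dependency graph internally, so removing a
# conv output channel also removes the coupled BN channel and downstream in-channels.
pruner = tp.pruner.MagnitudePruner(
    model,
    example_inputs,
    importance=tp.importance.MagnitudeImportance(p=1),
    ch_sparsity=0.5,
    ignored_layers=[model.fc],      # keep the classifier output size unchanged
)
pruner.step()

print(model(example_inputs).shape)  # the pruned model still runs end to end
```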
- Your contributions are DepGraph and the new pruning method GroupNormPruner with sparse learning (based on the L2 norm)? Is that right? If so, is GroupNormPruner without sparse learning the same as tp.pruner.MagnitudePruner with L2 importance?
Right. Both GroupNormPruner and MagnitudePruner inherit from tp.pruner.MetaPruner. The only difference is that GroupNormPruner has an interface for sparse training.
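A small sketch of that relationship, assuming both classes accept the usual MetaPruner constructor arguments (names may vary by version): without any call to regularize during training, the two configurations below are meant to behave the same.

```python
import torch
import torch_pruning as tp
from torchvision.models import resnet18

example_inputs = torch.randn(1, 3, 224, 224)

# One-shot pruning, no sparse learning: MagnitudePruner with L2 importance...
pruner_a = tp.pruner.MagnitudePruner(
    resnet18(), example_inputs,
    importance=tp.importance.MagnitudeImportance(p=2),
    ch_sparsity=0.5,
)
# ...versus GroupNormPruner used without ever calling pruner_b.regularize(model).
pruner_b = tp.pruner.GroupNormPruner(
    resnet18(), example_inputs,
    importance=tp.importance.GroupNormImportance(p=2),
    ch_sparsity=0.5,
)
pruner_a.step()
pruner_b.step()
# Only pruner_b additionally exposes regularize(model) for sparse training.
```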
- As I understand it, tp.pruner.MagnitudePruner is group-level for Conv layers, tp.pruner.BNScalePruner is group-level for BN layers, and tp.pruner.GroupNormPruner is group-level for Conv, BN, and Linear layers. Is that right?
Yes.
@VainF Thank you so much for the quick response. I got it.
@VainF Do you have any recommendation for the number of epochs of sparse training? If it is large, it takes a lot of time for normal training + sparse training before pruning.
Thanks @VainF for the amazing repo.
I read your paper and looked at some of the pruner APIs, but a few things about the pruners confuse me.
- In this link, you said that BNScalePruner and GroupNormPruner support sparse training. Does that mean we need to train the pretrained model again for at least 1 epoch, and that this changes the pretrained model's parameters? Is that right?
- In this benchmark table https://github.com/VainF/Torch-Pruning/tree/master/benchmarks, I saw some methods implemented by you, such as Group-L1, Group-BN, Group-GReg, Ours w/o SL, and Ours. As I understand it, all of the above methods estimate the importance of parameters.
- Are all the pruners in your repo group-level? I am confused because when I read the code, for example, for group-level L1 you used tp.pruner.MagnitudePruner and for group-level BN you used tp.pruner.BNScalePruner; these two pruner names do not contain "Group". But for the group-level Group pruner you used tp.pruner.GroupNormPruner, which does have "Group" in the API name. Please correct me if I am wrong.
- Your contributions are DepGraph and the new pruning method GroupNormPruner with sparse learning (based on the L2 norm)? Is that right? If so, is GroupNormPruner without sparse learning the same as tp.pruner.MagnitudePruner with L2 importance?
- As I understand it, tp.pruner.MagnitudePruner is group-level for Conv layers, tp.pruner.BNScalePruner is group-level for BN layers, and tp.pruner.GroupNormPruner is group-level for Conv, BN, and Linear layers. Is that right?

Sorry, my English is not very good.