Hi, thanks for the great work. I have a question about the experiments on predefined structured pruning methods; I am not sure I understand the paper correctly.
For predefined structured pruning methods, given a pruning ratio (e.g., 50%), the only difference between methods is how they identify the "least important" channels to prune. After pruning, however, they all produce the same pruned architecture. According to the paper, all of these pruned models should reach the same performance, even when trained from scratch.

My question is: if this is true, doesn't it make predefined structured pruning pointless, since every method leads to the same pruned architecture with the same performance? One could simply construct a ResNet_0.5x and train it from scratch, and it would match the performance of the predefined structured pruning methods. I am looking forward to your reply.
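To make the question concrete, here is a minimal sketch (not the authors' code) of what I mean by "ResNet_0.5x": a CIFAR-style ResNet whose per-stage channel counts are all scaled by a uniform width multiplier and then trained from scratch, with no importance-based channel selection at all. Names such as `width_mult`, `ScaledResNet`, and `resnet_0_5x` are my own illustrative choices.

```python
import torch
import torch.nn as nn


class BasicBlock(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, 1, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        # Projection shortcut when the shape changes.
        self.shortcut = nn.Sequential()
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + self.shortcut(x))


class ScaledResNet(nn.Module):
    """CIFAR-style ResNet with every stage width scaled by `width_mult`."""

    def __init__(self, blocks_per_stage=(3, 3, 3), width_mult=1.0, num_classes=10):
        super().__init__()
        # Uniformly shrink the base widths (16, 32, 64) by the multiplier.
        widths = [max(1, int(round(w * width_mult))) for w in (16, 32, 64)]
        self.conv1 = nn.Conv2d(3, widths[0], 3, 1, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(widths[0])
        layers, in_ch = [], widths[0]
        for stage, (w, n) in enumerate(zip(widths, blocks_per_stage)):
            for i in range(n):
                # Downsample at the start of every stage after the first.
                stride = 2 if (stage > 0 and i == 0) else 1
                layers.append(BasicBlock(in_ch, w, stride))
                in_ch = w
        self.stages = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_ch, num_classes)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.stages(out)
        return self.fc(self.pool(out).flatten(1))


# "ResNet_0.5x": every layer keeps 50% of the channels and the model is
# trained from scratch, skipping any channel-importance ranking step.
resnet_0_5x = ScaledResNet(width_mult=0.5)
print(sum(p.numel() for p in resnet_0_5x.parameters()))
```

If the paper's conclusion holds, training this uniformly-slimmed model from scratch should match the accuracy of any predefined structured pruning method applied at the same 50% ratio, which is exactly what I would like to confirm.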