VainF / Torch-Pruning

[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs
https://arxiv.org/abs/2301.12900
MIT License

Feature Request - Grouped convolutions #9

Open denissuh opened 4 years ago

denissuh commented 4 years ago

Hi,

@VainF Thank you very much for this project, great work!

I was wondering if you are planning on adding support for conv layers with an arbitrary groups parameter (currently there is only support when groups=in_channels=out_channels - a known issue in the README)?

Thank you in advance!

VainF commented 4 years ago

Hi @denissuh, yes, I will try to add support for arbitrary groups. However, it may be a little difficult because:

in_channels and out_channels must both be divisible by groups.
# https://pytorch.org/docs/stable/nn.html#torch.nn.Conv2d

If we prune one channel in the output, then we may have to adjust all groups to maintain this constraint.
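To make the constraint concrete, here is a minimal PyTorch sketch (plain PyTorch, not Torch-Pruning code): a channel count that is not divisible by groups is rejected at layer construction time, which is exactly what naive single-channel pruning would produce.

```python
import torch.nn as nn

# Both in_channels and out_channels must be divisible by groups.
nn.Conv2d(in_channels=8, out_channels=8, kernel_size=3, groups=4)  # OK

# Naively pruning a single output channel (8 -> 7) violates the constraint:
try:
    nn.Conv2d(in_channels=8, out_channels=7, kernel_size=3, groups=4)
except ValueError as e:
    print(e)
```

So any pruning scheme for grouped convolutions has to remove channels in multiples that keep both counts divisible by groups.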

denissuh commented 4 years ago

Thank you very much! Looking forward to this feature.

Then, as far as I understand there are 2 scenarios:

  1. Pruning a filter (prune_filter), which removes an output channel of the group convolution

  2. Pruning a channel (prune_channel), which removes an input channel of the group convolution

What do you think ?

VainF commented 4 years ago

Nice explanation!

However, PyTorch does not support groups with different numbers of filters or channels. I don't know how to deal with this restriction.

In case 1, one filter in the second group (red) should also be pruned, because we have to make sure that out_channels stays divisible by 2. In case 2, similarly, we should also prune a red in_channel to ensure that in_channels stays divisible by 2.
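A small sketch of case 1 under this rule (a toy layer for illustration; the index choices are mine, not the project's API): dropping filter 0 from the first group forces us to also drop one filter from the second group, after which the smaller layer is valid and the surviving filters can be copied over.

```python
import torch
import torch.nn as nn

g = nn.Conv2d(in_channels=8, out_channels=8, kernel_size=3, groups=2)

# Pruning filter 0 alone would leave 7 outputs (not divisible by 2), so we
# also drop one filter from the second group, e.g. filter 4.
keep = [i for i in range(8) if i not in (0, 4)]  # 3 filters left per group
pruned = nn.Conv2d(8, len(keep), kernel_size=3, groups=2)

# Grouped weights are laid out as (out_channels, in_channels // groups, kH, kW),
# so surviving filters can be copied over by output index.
pruned.weight.data = g.weight.data[keep].clone()
pruned.bias.data = g.bias.data[keep].clone()

x = torch.randn(1, 8, 16, 16)
# The pruned layer reproduces the original outputs at the kept indices.
assert torch.allclose(pruned(x), g(x)[:, keep], atol=1e-6)
```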

denissuh commented 4 years ago

You are absolutely right, thanks for the explanation!

Following up on your suggestion, we could require the user to supply pruning indices that keep in_channels and out_channels divisible by the groups parameter. I'm wondering if that can be done in a user-friendly way?

Another idea is to convert the group convolution to a regular convolution. In the ResNeXt paper, Aggregated Residual Transformations for Deep Neural Networks, figure 3 (below) shows equivalent building blocks of ResNeXt:

Therefore we can convert the group convolution to a regular convolution with the following sequence of operations: split -> conv -> concat. Afterwards we can use the regular pruning mechanism that you've already implemented.
The concern with this idea is that it may be inefficient relative to the group convolution.
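A rough sketch of this conversion (ungroup_conv is a hypothetical helper for illustration, not part of Torch-Pruning): each group owns a contiguous slice of the grouped weight, so we can split the input into per-group chunks, run each chunk through a regular conv holding that slice, and concatenate the results.

```python
import torch
import torch.nn as nn

def ungroup_conv(gconv):
    """Replace a grouped conv with per-group regular convs (split -> conv -> concat).
    Hypothetical helper, not the project's API."""
    g = gconv.groups
    out_per = gconv.out_channels // g
    convs = nn.ModuleList()
    for i in range(g):
        c = nn.Conv2d(gconv.in_channels // g, out_per, gconv.kernel_size,
                      stride=gconv.stride, padding=gconv.padding,
                      bias=gconv.bias is not None)
        # Copy this group's contiguous slice of the grouped weight/bias.
        c.weight.data = gconv.weight.data[i * out_per:(i + 1) * out_per].clone()
        if gconv.bias is not None:
            c.bias.data = gconv.bias.data[i * out_per:(i + 1) * out_per].clone()
        convs.append(c)

    def forward(x):
        chunks = torch.chunk(x, g, dim=1)                # split
        outs = [c(ch) for c, ch in zip(convs, chunks)]   # conv
        return torch.cat(outs, dim=1)                    # concat
    return forward

gconv = nn.Conv2d(8, 8, kernel_size=3, groups=4, padding=1)
regular = ungroup_conv(gconv)
x = torch.randn(1, 8, 16, 16)
assert torch.allclose(gconv(x), regular(x), atol=1e-6)
```

After pruning the per-group convs independently, the split/concat form could be kept as-is, at some runtime cost compared to a single grouped kernel.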

What do you think ?

VainF commented 4 years ago

I like the second idea!

Maybe we can split the group convolution first and then merge the parts back after pruning, provided the pruned in_channels and out_channels are still divisible by the number of groups.

VainF commented 4 years ago

BTW, your first idea is also feasible if we apply the same pruning indices to each group. It can be implemented by adding a new index transform, which simply broadcasts the indices to the other groups.
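For example (a hypothetical sketch of such a transform, not the project's actual implementation): indices chosen within one group are offset into every other group, so each group loses the same channels and divisibility is preserved.

```python
def broadcast_group_indices(idxs, channels_per_group, groups):
    """Broadcast per-group pruning indices to every group.
    Hypothetical helper for illustration."""
    return [g * channels_per_group + i for g in range(groups) for i in idxs]

# Prune channel 1 of each group in an 8-channel, 4-group conv:
idxs = broadcast_group_indices([1], channels_per_group=2, groups=4)
print(idxs)  # [1, 3, 5, 7]
```

Pruning these four indices together takes the layer from 8 to 4 channels, still divisible by 4 groups.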

Garvit-32 commented 3 years ago

Hi @VainF @denissuh, can you provide an example of pruning ResNeXt or ResNeSt? It would be very helpful.

Thanks