Open denissuh opened 4 years ago
hi @denissuh ,
Yes, I will try to add support for arbitrary groups. However, it may be a little difficult because in_channels and out_channels must both be divisible by groups (see https://pytorch.org/docs/stable/nn.html#torch.nn.Conv2d).
If we prune one channel in the output, then we may have to adjust all groups to maintain this constraint.
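A minimal sketch of this constraint, assuming a recent PyTorch release: the constructor accepts the 4-in, 8-out layer with groups=2, but rejects the same layer once a single output channel is removed.

```python
import torch

# Valid: 4 % 2 == 0 and 8 % 2 == 0.
conv = torch.nn.Conv2d(in_channels=4, out_channels=8, kernel_size=3, groups=2)

# The weight has shape (out_channels, in_channels // groups, kH, kW).
weight_shape = tuple(conv.weight.shape)

# Removing one output channel leaves 7, which is not divisible by groups=2,
# so PyTorch refuses to build the layer.
try:
    torch.nn.Conv2d(in_channels=4, out_channels=7, kernel_size=3, groups=2)
    rejected = False
except ValueError:
    rejected = True
```

This is why pruning a single filter cannot be done in isolation: the remaining channel count has to be brought back to a multiple of groups.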
Thank you very much ! Looking forward to this feature.
c = torch.nn.Conv2d(in_channels=4, out_channels=8, kernel_size=(3, 3), groups=2)
I will use the image made by @rasbt from https://discuss.pytorch.org/t/conv2d-certain-values-for-groups-and-out-channels-dont-work/14228/2
Then, as far as I understand there are 2 scenarios:
1. Pruning Filter
2. Pruning Channel
What do you think ?
Nice explanation!
However, PyTorch does not support groups with different numbers of filters or channels. I don't know how to deal with this restriction.
In case 1, one filter in the second group (red) should also be pruned because we have to make sure that out_channels is divisible by 2. In case 2, similarly, we should also prune a red in_channel to ensure that in_channels is divisible by 2.
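Case 1 could look roughly like the sketch below. It assumes the 4-in, 8-out, groups=2 layer from the example above and picks filters 1 and 5 arbitrarily for illustration, one from each group, so that out_channels drops from 8 to 6 and stays divisible by 2.

```python
import torch

torch.manual_seed(0)
groups = 2
conv = torch.nn.Conv2d(in_channels=4, out_channels=8, kernel_size=3, groups=groups)

# Filters 0-3 belong to group 0 and filters 4-7 to group 1.
# Drop one filter from each group (indices chosen only for illustration).
prune_idxs = [1, 5]
keep_idxs = [i for i in range(conv.out_channels) if i not in prune_idxs]

# Build the smaller grouped layer and copy over the surviving filters.
pruned = torch.nn.Conv2d(4, len(keep_idxs), kernel_size=3, groups=groups)
with torch.no_grad():
    pruned.weight.copy_(conv.weight[keep_idxs])
    pruned.bias.copy_(conv.bias[keep_idxs])

# The pruned layer should reproduce the kept output channels of the original.
x = torch.randn(1, 4, 16, 16)
matches = torch.allclose(pruned(x), conv(x)[:, keep_idxs], atol=1e-5)
```

Because each group still holds three filters, the surviving filters keep reading from the same input channels as before.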
You are absolutely right, thanks for the explanation !
Further to your suggestion, we can request the user to give pruning indices so as to keep in_channels and out_channels divisible by the groups parameter. I'm wondering if it can be done in a user-friendly way ?
Another idea is to convert the group convolution to regular convolution. Figure 3 of the ResNeXt paper, "Aggregated Residual Transformations for Deep Neural Networks", shows equivalent building blocks of ResNeXt.
Therefore we can convert the group convolution to regular convolution with the following set of operations: split-> conv-> concat
Afterwards we can use the regular pruning mechanism that you've already implemented.
The concern with this idea is that it can be inefficient relative to the group convolution.
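A minimal sketch of the split -> conv -> concat conversion, assuming the same 4-in, 8-out, groups=2 layer: each group becomes an ordinary convolution holding its slice of the grouped weights, and the concatenated branch outputs should match the grouped layer.

```python
import torch

torch.manual_seed(0)
groups = 2
grouped = torch.nn.Conv2d(4, 8, kernel_size=3, groups=groups)

in_per_group = grouped.in_channels // groups
out_per_group = grouped.out_channels // groups

# One regular convolution per group, initialised from the grouped weights.
branches = []
for g in range(groups):
    branch = torch.nn.Conv2d(in_per_group, out_per_group, kernel_size=3)
    rows = slice(g * out_per_group, (g + 1) * out_per_group)
    with torch.no_grad():
        branch.weight.copy_(grouped.weight[rows])
        branch.bias.copy_(grouped.bias[rows])
    branches.append(branch)

# split -> conv -> concat along the channel dimension.
x = torch.randn(1, 4, 16, 16)
chunks = torch.split(x, in_per_group, dim=1)
y_split = torch.cat([b(c) for b, c in zip(branches, chunks)], dim=1)

same = torch.allclose(y_split, grouped(x), atol=1e-5)
```

Each branch is now a plain convolution that can be pruned independently, at the cost of launching one convolution per group.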
What do you think ?
I like the second idea!
Maybe we can split the group convolution first and then merge the branches back after pruning, if in_channels and out_channels are still divisible by groups.
BTW, your first idea is also feasible if we apply the same indices to each group. It can be implemented by adding a new index transform, which simply broadcasts the indices to the other groups.
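Such an index broadcast might look like the sketch below. The helper name `broadcast_indices` and its signature are hypothetical, not part of the library; it takes indices local to one group and repeats them at the same offset in every group.

```python
def broadcast_indices(local_idxs, num_channels, groups):
    """Repeat per-group channel indices across every group (hypothetical helper)."""
    if num_channels % groups != 0:
        raise ValueError("num_channels must be divisible by groups")
    per_group = num_channels // groups
    if any(not 0 <= i < per_group for i in local_idxs):
        raise ValueError("local indices must lie in [0, num_channels // groups)")
    unique = sorted(set(local_idxs))
    # Group g owns channels [g * per_group, (g + 1) * per_group).
    return [g * per_group + i for g in range(groups) for i in unique]


# Pruning local filter 1 in a layer with 8 output channels and 2 groups
# also selects the matching filter in the second group.
pruned_idxs = broadcast_indices([1], num_channels=8, groups=2)
```

Since every group loses the same number of channels, the remaining count is always a multiple of groups.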
Hi @VainF @denissuh, can you provide an example of pruning ResNeXt or ResNeSt? It would be very helpful.
Thanks
Hi,
@VainF Thank you very much for this project, great work !
I was wondering if you are planning on adding support for conv layers with an arbitrary groups parameter (currently there is only support when groups=in_channels=out_channels, a known issue in the README) ?
Thank you in advance !