d-li14 / involution

[CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator
https://arxiv.org/abs/2103.06255
MIT License

Why must the feature maps maintain the same size H*W? #38

Open amssljc opened 3 years ago

amssljc commented 3 years ago

If I have not misunderstood Involution, it always keeps the output the same size as the input, that is: input shape (B, C, H, W), output shape (B, C, H, W). I also confirmed this with Involution2d in your Involution.py. If I use dilation = k > 1 with kernel size = (1, 1), does that mean I have to use padding = 1 to keep the image (or feature map) the same size? In fact, in your code, that means there are H*W patches (kernels):

batch_size, in_channels, height, width = input.shape
# Unfold and reshape input tensor
input_unfolded = self.unfold(self.initial_mapping(input))
input_unfolded = input_unfolded.view(batch_size, self.groups, self.out_channels // self.groups,
                                     self.kernel_size[0] * self.kernel_size[1], height, width)
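# Generate the position-specific involution kernel (kernel-generation branch)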
kernel = self.span_mapping(self.sigma_mapping(self.reduce_mapping(self.o_mapping(input))))
kernel = kernel.view(batch_size, self.groups, self.kernel_size[0] * self.kernel_size[1],
                     height, width).unsqueeze(dim=2)
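
To double-check the "H*W patches" point, here is a minimal snippet (not from your repo; the sizes and the parameter loop are just for illustration). With stride 1, nn.Unfold returns exactly H*W sliding positions only when padding = dilation * (kernel_size - 1) // 2 (assuming an odd kernel size), which is why the view to (..., height, width) above goes through:

import torch
import torch.nn as nn

B, C, H, W = 2, 8, 14, 14
x = torch.randn(B, C, H, W)

for kernel_size, dilation in [(1, 1), (3, 1), (3, 2), (7, 1)]:
    # "Same" padding for stride 1 and an odd kernel size
    padding = dilation * (kernel_size - 1) // 2
    unfold = nn.Unfold(kernel_size, dilation=dilation, padding=padding, stride=1)
    patches = unfold(x)                # (B, C * K * K, L)
    assert patches.shape[-1] == H * W  # exactly H*W patches, one kernel each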

However, I think this does not quite make sense: Involution should keep the property of convolution that feature maps can shrink, i.e. the involution kernels should also be able to downsample. For example (hand-drawn sketch attached):
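
To make the idea concrete, here is a rough sketch of one way an involution could downsample (this is not your code; DownsamplingInvolution2d and its parameter names are made up for illustration): generate the kernels at the output resolution and give nn.Unfold the same stride, so the number of patches drops from H*W to roughly (H/s)*(W/s), just like a strided convolution.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DownsamplingInvolution2d(nn.Module):
    # Rough sketch: an involution whose output is smaller than its input by `stride`
    def __init__(self, channels, kernel_size=3, stride=2, groups=1, reduction=4):
        super().__init__()
        self.k, self.s, self.g = kernel_size, stride, groups
        self.pad = (kernel_size - 1) // 2
        self.unfold = nn.Unfold(kernel_size, padding=self.pad, stride=stride)
        # Kernel-generation branch: two 1x1 convs, run at the output resolution
        self.reduce = nn.Conv2d(channels, channels // reduction, 1)
        self.span = nn.Conv2d(channels // reduction, groups * kernel_size ** 2, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        out_h = (h + 2 * self.pad - self.k) // self.s + 1
        out_w = (w + 2 * self.pad - self.k) // self.s + 1
        # One K*K kernel per *output* position (and per group), generated from
        # the input pooled down to the output resolution
        ctx = F.adaptive_avg_pool2d(x, (out_h, out_w))
        kernel = self.span(F.relu(self.reduce(ctx)))
        kernel = kernel.view(b, self.g, 1, self.k ** 2, out_h, out_w)
        # The strided unfold yields out_h * out_w patches instead of H * W
        patches = self.unfold(x).view(b, self.g, c // self.g, self.k ** 2, out_h, out_w)
        out = (kernel * patches).sum(dim=3)  # weighted sum over each K*K window
        return out.reshape(b, c, out_h, out_w)

x = torch.randn(2, 16, 32, 32)
y = DownsamplingInvolution2d(16, kernel_size=3, stride=2)(x)
print(y.shape)  # torch.Size([2, 16, 16, 16])

With stride = 1 this reduces to the usual same-size case, so it would not break the existing behaviour.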

Sorry for my poor draft; thanks a lot if you can reply!