chengdazhi / Deformable-Convolution-V2-PyTorch

Deformable ConvNets V2 (DCNv2) in PyTorch
MIT License

Number of output channels of conv_offset_mask #65

Open bluesky314 opened 4 years ago

bluesky314 commented 4 years ago

The offset and modulation layer usually has 27 output channels, since the kernel size is 3. But according to my understanding of the paper, for each channel we make 3 predictions, i.e. the shift in x, the shift in y, and the modulation factor. So the number of output channels should be 3 times the number of input channels. However, that is not what is coded in:

self.conv_offset_mask = nn.Conv2d(self.in_channels,
                                  self.deformable_groups * 3 * self.kernel_size[0] * self.kernel_size[1],
                                  kernel_size=self.kernel_size,
                                  stride=(self.stride, self.stride),
                                  padding=(self.padding, self.padding),
                                  bias=True)

The paper says:

The output is of 3K channels, where the first 2K channels correspond to the learned offsets and the remaining K channels are further fed to a sigmoid layer to obtain the modulation scalars

Can someone clarify what I am missing? This layer outputs 27 channels regardless of the number of input channels, so how are these channels used for the x and y shifts and the modulation?
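For reference, here is a minimal sketch of the channel arithmetic in the quoted passage. It assumes that K in the paper denotes the number of kernel sampling locations (9 for a 3x3 kernel) rather than the number of input channels, and that each deformable group shares one set of offsets and modulation scalars across its input channels. The function name `offset_mask_channels` is hypothetical and is not part of this repository.

```python
def offset_mask_channels(kernel_size, deformable_groups=1):
    """Channel layout of a combined offset + modulation prediction.

    Assumption: K is the number of kernel sampling locations
    (kernel_h * kernel_w). Each location gets an x shift, a y shift,
    and one modulation scalar, so the total depends on the kernel size
    and the number of deformable groups, not on in_channels.
    """
    kernel_h, kernel_w = kernel_size
    k = kernel_h * kernel_w
    offsets = 2 * k * deformable_groups  # first 2K channels: (x, y) shifts
    masks = k * deformable_groups        # remaining K channels: fed to a sigmoid
    return {"offsets": offsets, "masks": masks, "total": offsets + masks}


layout = offset_mask_channels((3, 3))
print(layout)  # {'offsets': 18, 'masks': 9, 'total': 27}
```

Under these assumptions, a 3x3 kernel with one deformable group gives 18 offset channels plus 9 modulation channels, which matches the 27 output channels of `conv_offset_mask` for any value of `in_channels`.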