about the implementation.. view instead of permute ?

oeway / pytorch-deform-conv

PyTorch implementation of Deformable Convolution

MIT License

911 stars 151 forks

about the implementation.. view instead of permute ? #16

Open chenchr opened 6 years ago

chenchr commented 6 years ago

Hello. Thanks for sharing the code. I have a question about how the offsets are implemented. At [https://github.com/oeway/pytorch-deform-conv/blob/master/torch_deform_conv/deform_conv.py#L182] the code is:

offsets = offsets.view(batch_size, -1, 2)

After the first ordinary conv, the input tensor offsets has shape (b, 2*c, h, w). I think the deform-conv offsets live in the output channels, so shouldn't the code be:

offsets = offsets.view(b, 2*c, h, w)
offsets = offsets.permute(0, 2, 3, 1)
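To make the difference between the two variants concrete, here is a minimal sketch. It uses NumPy as a stand-in so it runs without PyTorch: on a contiguous array, `reshape` plays the role of `view` and `transpose` the role of `permute`. The tiny sizes and the value encoding (channel*100 + y*10 + x) are assumptions chosen only for readability.

```python
import numpy as np

b, c, h, w = 1, 1, 2, 2

# Label every entry as channel*100 + y*10 + x so its origin stays visible.
ch, y, x = np.meshgrid(np.arange(2 * c), np.arange(h), np.arange(w), indexing="ij")
offsets = (ch * 100 + y * 10 + x)[None]  # shape (b, 2*c, h, w)

# Variant 1: plain view/reshape. Each pair is two neighbouring memory entries,
# i.e. two adjacent x positions taken from the SAME channel.
viewed = offsets.reshape(b, -1, 2)

# Variant 2: move channels last first. Each pair is the two channel values
# at the SAME pixel.
permuted = offsets.transpose(0, 2, 3, 1).reshape(b, -1, 2)

print(viewed[0].tolist())
print(permuted[0].tolist())
```

With the plain reshape, no pair ever combines values from two different channels, which is the behaviour the question is about.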

FCInter commented 4 years ago

No. Go to torch_deform_conv/layers.py and have a look at the forward() function of the class ConvOffset2D. If you print the shape of every tensor, you'll find the implementation is right.

These are the shapes I printed:

def forward(self, x):
        """Return the deformed feature map"""
        x_shape = x.size() # [32, 32, 28, 28]
        offsets = super(ConvOffset2D, self).forward(x) # [32, 64, 28, 28]

        # offsets: (b*c, h, w, 2)
        # print('x1', x.shape)
        # print('offsets1', offsets.shape)
        offsets = self._to_bc_h_w_2(offsets, x_shape) # [1024, 28, 28, 2]
        # print('offsets2', offsets.shape)

        # x: (b*c, h, w)
        x = self._to_bc_h_w(x, x_shape) # [1024, 28, 28]
        # print('x2', x.shape)

        # x_offset: (b*c, h*w)
        x_offset = th_batch_map_offsets(x, offsets, grid=self._get_grid(self,x)) # [1024, 784]
        # print('x_offset1', x_offset.shape)

        # x_offset: (b, c, h, w)
        x_offset = self._to_b_c_h_w(x_offset, x_shape) # [32, 32, 28, 28]
        # print('x_offset2', x_offset.shape)

        return x_offset
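One caveat about checking shapes alone: two different reshaping routes can end at the same shape while holding different values. The sketch below illustrates this with the sizes from the trace above. It uses NumPy (`reshape`/`transpose`) as a stand-in for `view`/`permute`, and it assumes that `_to_bc_h_w_2` amounts to a plain view to (-1, h, w, 2); that assumption is not verified against the repository here.

```python
import numpy as np

b, c, h, w = 32, 32, 28, 28

# Stand-in for the offset conv output, shape (b, 2*c, h, w), unique value per entry.
offsets = np.arange(b * 2 * c * h * w, dtype=np.float32).reshape(b, 2 * c, h, w)

# Route A: plain reshape (assumed equivalent of _to_bc_h_w_2).
route_a = offsets.reshape(-1, h, w, 2)

# Route B: split channels into (c, 2) and move the pair axis to the end.
route_b = offsets.reshape(b, c, 2, h, w).transpose(0, 1, 3, 4, 2).reshape(-1, h, w, 2)

same_shape = route_a.shape == route_b.shape      # both are (1024, 28, 28, 2)
same_values = np.array_equal(route_a, route_b)   # the contents differ

print(route_a.shape, route_b.shape, same_shape, same_values)
```

So matching shapes are consistent with either reading; deciding which route is intended needs a look at the values, not just the sizes.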

giangbang commented 3 years ago

I think this is a bug, since it mixes up the channel dimension (2*c) with the spatial dimensions (h, w). The correct way should be:

# offset.shape = b , 2*c, h, w
offset = offset.view(b, c, 2, h, w).permute([0, 1, 3, 4, 2])
# offset.shape now is b, c, h, w, 2
# note: getting the correct shape does not by itself mean the values are used in the right order.
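As a quick check of what this reshaping does to individual entries, here is a small sketch (NumPy as a stand-in for the tensor ops; the sizes and the probe index are arbitrary assumptions). It also shows a second possible channel layout, (2, c) instead of (c, 2); which layout is meant is a convention this thread does not settle.

```python
import numpy as np

b, c, h, w = 2, 3, 4, 5
offset = np.arange(b * 2 * c * h * w).reshape(b, 2 * c, h, w)

# Layout (c, 2): channels 2k and 2k+1 form the offset pair of feature channel k.
pairs_c2 = offset.reshape(b, c, 2, h, w).transpose(0, 1, 3, 4, 2)

# Alternative layout (2, c): channels k and c+k form the pair of feature channel k.
pairs_2c = offset.reshape(b, 2, c, h, w).transpose(0, 2, 3, 4, 1)

n, k, y, x = 1, 2, 3, 4  # arbitrary probe position
expected_c2 = [offset[n, 2 * k, y, x], offset[n, 2 * k + 1, y, x]]
expected_2c = [offset[n, k, y, x], offset[n, c + k, y, x]]

print(pairs_c2.shape, pairs_c2[n, k, y, x].tolist() == [int(v) for v in expected_c2])
print(pairs_2c.shape, pairs_2c[n, k, y, x].tolist() == [int(v) for v in expected_2c])
```

In both layouts every pair comes from one pixel (y, x), so neither mixes spatial positions into the pair axis.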

giangbang commented 3 years ago

But it seems that this implementation is wrong, as pointed out in other issues.

oeway commented 3 years ago

Hi, thanks for looking into this.

This was a direct port from TensorFlow to PyTorch, made before the authors of deformable conv released their code, and I am aware that there are some issues with the original TensorFlow implementation. However, I haven't worked on this for a while, and it would be great if someone could submit a PR here.

giangbang commented 3 years ago

@oeway to my surprise, there is an official implementation of deformable conv for PyTorch, in torchvision. Check it out at https://pytorch.org/vision/stable/_modules/torchvision/ops/deform_conv.html
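For anyone switching over: torchvision's `deform_conv2d` documents its offset tensor as [batch_size, 2 * offset_groups * kernel_height * kernel_width, out_height, out_width], so the offset channel count depends on the kernel size rather than on the number of feature channels. The helper below is a hypothetical convenience function, not part of torchvision, that only computes this expected shape in plain Python using the standard convolution output-size formula.

```python
def deform_conv_offset_shape(batch, in_h, in_w, kernel_size,
                             stride=1, padding=0, dilation=1, offset_groups=1):
    """Hypothetical helper: expected offset shape for torchvision-style deform conv."""
    kh, kw = kernel_size
    # Standard convolution output size along each spatial axis.
    out_h = (in_h + 2 * padding - dilation * (kh - 1) - 1) // stride + 1
    out_w = (in_w + 2 * padding - dilation * (kw - 1) - 1) // stride + 1
    # One 2-D offset per kernel tap per offset group.
    return (batch, 2 * offset_groups * kh * kw, out_h, out_w)


# 3x3 kernel with padding 1 on the 28x28 feature maps discussed in this thread.
shape = deform_conv_offset_shape(32, 28, 28, (3, 3), padding=1)
print(shape)
```

For the 32 x 32 x 28 x 28 input from the earlier trace, this gives 18 offset channels, compared with the 64 (2*c) channels that the offset conv in this repository produces.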