group-wise offset learning when deformable_groups > 1

Hi, thanks for sharing this nice implementation!

I have a question about group-wise offset learning. In deform.py, when deformable_groups > 1, the offset-conv is also a grouped convolution with the same number of groups. However, in the forward function of DeformConv2d, the offset and mask are split like this:

offset = offset_mask[:, :offset_channel, :, :]
mask = offset_mask[:, offset_channel:, :, :]
mask = mask.sigmoid()  # [0, 1]

I think this split ignores the channel ordering of the offset-conv output. For example, when deformable_groups = 2, the offset-conv lays out its output channels sequentially along the channel dimension: first the group_0 output (computed from the first half of the input channels), then the group_1 output (computed from the second half). This is how PyTorch defines grouped convolution. With the split above, the mask for both group_0 and group_1 is taken entirely from the group_1 output, which doesn't seem to make sense.
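To make the concern concrete, here is a small sketch that only does channel bookkeeping (no tensors). It assumes the offset-conv has deformable_groups * 3 * kernelH * kernelW output channels and that offset_channel = deformable_groups * 2 * kernelH * kernelW; both are my reading of the code, and the helper names are made up for illustration.

```python
# Channel bookkeeping only; helper names are hypothetical, not from the repo.

def conv_group_of(channel, out_channels, groups):
    """Group that produces `channel` in a grouped convolution's output."""
    return channel // (out_channels // groups)


def flat_split_sources(deformable_groups, kernel_h, kernel_w):
    """Which conv groups feed the offset and the mask under the flat split."""
    k = kernel_h * kernel_w
    out_channels = deformable_groups * 3 * k   # assumed offset-conv width
    offset_channel = deformable_groups * 2 * k  # assumed split point
    groups_of = [
        conv_group_of(c, out_channels, deformable_groups)
        for c in range(out_channels)
    ]
    offset_sources = sorted(set(groups_of[:offset_channel]))
    mask_sources = sorted(set(groups_of[offset_channel:]))
    return offset_sources, mask_sources


offset_sources, mask_sources = flat_split_sources(2, 3, 3)
print(offset_sources)  # [0, 1]
print(mask_sources)    # [1]
```

Under these assumptions, with deformable_groups = 2 and a 3x3 kernel, the mask channels come only from group_1, while the offset channels mix all of group_0 with part of group_1.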

If group-wise offset learning is required, shouldn't this be like below?

# deformable-group-wise split of offset/mask
offset_mask = offset_mask.view(B, deformable_groups, 3 * kernelH * kernelW, outH, outW)
# reshape, not view: the sliced tensors are non-contiguous when deformable_groups > 1
offset = offset_mask[:, :, :2 * kernelH * kernelW, :, :].reshape(B, -1, outH, outW)
mask = offset_mask[:, :, 2 * kernelH * kernelW:, :, :].reshape(B, -1, outH, outW)
mask = mask.sigmoid()  # [0, 1]
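For comparison, the same kind of bookkeeping for the group-wise split proposed above. Again this is only a sketch under the assumption that each group owns 3 * kernelH * kernelW consecutive output channels; the function name is hypothetical, and it does not check what channel layout the deformable conv op itself expects.

```python
# Channel bookkeeping only; the function name is hypothetical.

def groupwise_split_sources(deformable_groups, kernel_h, kernel_w):
    """Source group of every offset/mask channel under the group-wise split."""
    k = kernel_h * kernel_w
    per_group = 3 * k
    # Emulate view(B, deformable_groups, 3 * k, ...) along the channel axis.
    grouped = [
        list(range(g * per_group, (g + 1) * per_group))
        for g in range(deformable_groups)
    ]
    # First 2 * k channels of each group are offsets, the remaining k are mask.
    offset_channels = [c for block in grouped for c in block[:2 * k]]
    mask_channels = [c for block in grouped for c in block[2 * k:]]
    offset_groups = [c // per_group for c in offset_channels]
    mask_groups = [c // per_group for c in mask_channels]
    return offset_groups, mask_groups


offset_groups, mask_groups = groupwise_split_sources(2, 3, 3)
print(offset_groups.count(0), offset_groups.count(1))  # 18 18
print(mask_groups.count(0), mask_groups.count(1))      # 9 9
```

Here each group contributes its own 2 * kernelH * kernelW offset channels and kernelH * kernelW mask channels, which is what I would expect from group-wise offset learning.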
