haofeixu / aanet

[CVPR'20] AANet: Adaptive Aggregation Network for Efficient Stereo Matching
Apache License 2.0

group-wise offset learning when deformable_groups > 1 #54

Closed: seheevic closed this issue 1 year ago

seheevic commented 3 years ago

Hi, thanks for sharing this nice implementation!

I have a question about group-wise offset learning. In deform.py, when deformable_groups > 1, the offset-conv is also a grouped convolution with the same number of groups. However, the DeformConv2d forward function splits its output into offset and mask like this:

offset = offset_mask[:, :offset_channel, :, :]
mask = offset_mask[:, offset_channel:, :, :]
mask = mask.sigmoid()  # [0, 1]

I think this split ignores the channel ordering of the offset-conv output. For example, when deformable_groups = 2, the offset-conv output is laid out sequentially along the channel dimension: first the group_0 output (computed from the first half of the input channels), then the group_1 output (computed from the second half). This is how grouped convolution is defined in PyTorch. With the split above, the masks for both group_0 and group_1 are taken entirely from the group_1 output, which doesn't seem to make sense.
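To make the channel bookkeeping concrete, here is a minimal pure-Python sketch (no PyTorch needed). It assumes a 3x3 kernel, that each conv group of the offset-conv emits 3 * kernelH * kernelW channels, and that the deformable op assigns the i-th contiguous block of offset (or mask) channels to deformable group i. The helper name flat_split_sources is hypothetical and is not part of deform.py.

```python
def flat_split_sources(deformable_groups, kernel_h=3, kernel_w=3):
    """For the flat split, report which conv group produced the channels
    that each deformable group ends up consuming."""
    k = kernel_h * kernel_w
    per_conv_group = 3 * k                      # channels per conv group (assumption)
    offset_channel = deformable_groups * 2 * k  # same split point as in forward()
    total = deformable_groups * 3 * k

    offset = list(range(offset_channel))        # offset_mask[:, :offset_channel]
    mask = list(range(offset_channel, total))   # offset_mask[:, offset_channel:]

    sources = {}
    for g in range(deformable_groups):
        off_g = offset[g * 2 * k:(g + 1) * 2 * k]  # consumed as offsets of group g
        mask_g = mask[g * k:(g + 1) * k]           # consumed as masks of group g
        sources[g] = {
            "offset": sorted({c // per_conv_group for c in off_g}),
            "mask": sorted({c // per_conv_group for c in mask_g}),
        }
    return sources


for group, src in flat_split_sources(deformable_groups=2).items():
    print(f"deformable group {group}: offsets from conv groups {src['offset']}, "
          f"masks from conv groups {src['mask']}")
```

Under these assumptions, with deformable_groups = 2 both masks are read from conv group 1, and the offsets of deformable group 1 mix outputs of conv groups 0 and 1.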

If group-wise offset learning is required, shouldn't this be like below?

# deformable-group-wise split of offset/mask
offset_mask = offset_mask.view(B, deformable_groups, 3 * kernelH * kernelW, outH, outW)
# the slices below are non-contiguous, so use reshape() rather than view()
offset = offset_mask[:, :, :2 * kernelH * kernelW, :, :].reshape(B, -1, outH, outW)
mask = offset_mask[:, :, 2 * kernelH * kernelW:, :, :].reshape(B, -1, outH, outW)
mask = mask.sigmoid()  # [0, 1]
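
The effect of this group-wise split on the channel axis can be traced with plain channel indices. The sketch below is pure Python and only mimics the reshape-and-slice pattern. It assumes a 3x3 kernel and 3 * kernelH * kernelW output channels per conv group, and the function name groupwise_split is hypothetical.

```python
def groupwise_split(deformable_groups, kernel_h=3, kernel_w=3):
    """Channel indices picked by the group-wise split, in the order they
    appear after flattening back to a single channel axis."""
    k = kernel_h * kernel_w
    channels = list(range(deformable_groups * 3 * k))

    # Channel-axis equivalent of view(B, deformable_groups, 3 * k, outH, outW)
    grouped = [channels[g * 3 * k:(g + 1) * 3 * k] for g in range(deformable_groups)]

    # Equivalent of [:, :, :2k] and [:, :, 2k:], each flattened to one axis
    offset = [c for block in grouped for c in block[:2 * k]]
    mask = [c for block in grouped for c in block[2 * k:]]
    return offset, mask


k = 9  # 3x3 kernel
offset, mask = groupwise_split(deformable_groups=2)
for g in range(2):
    off_src = sorted({c // (3 * k) for c in offset[g * 2 * k:(g + 1) * 2 * k]})
    mask_src = sorted({c // (3 * k) for c in mask[g * k:(g + 1) * k]})
    print(f"deformable group {g}: offsets from conv groups {off_src}, "
          f"masks from conv groups {mask_src}")
```

With this indexing, each deformable group reads its offsets and its mask only from the conv group with the same index.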

haofeixu commented 1 year ago

What you describe seems valid, but I suspect it would not make much difference, though I haven't compared.