Why can involution summarize the context in a wider spatial arrangement?

d-li14 / involution

[CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator

https://arxiv.org/abs/2103.06255

MIT License


Why can involution summarize the context in a wider spatial arrangement? #47

Open ddamddi opened 3 years ago

ddamddi commented 3 years ago

Nice work rethinking convolution modules.

My question is: why can involution summarize the context in a wider spatial arrangement?

In my view, the only factor behind the wider receptive field seems to be the replacement of ResNet's 3x3 convolutions with 7x7 involutions to create RedNet.
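To make that arithmetic concrete, here is a minimal sketch of how kernel size alone changes the receptive field of a stack of layers. The helper `receptive_field` and the four-layer, stride-1 stacks are hypothetical and chosen only for illustration; they are not taken from the RedNet code or its actual layer layout.

```python
def receptive_field(layers):
    """Receptive field (input pixels along one axis) of a stack of layers.

    `layers` is a list of (kernel_size, stride) pairs, first layer first.
    Hypothetical helper for illustration; not part of the involution repo.
    """
    rf = 1    # one output position covers one input pixel before any layer
    jump = 1  # spacing, in input pixels, between neighbouring positions at this depth
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# Assumed toy stacks: four stride-1 layers each, differing only in kernel size.
conv_stack = [(3, 1)] * 4  # 3x3 kernels, as in ResNet's convolutions
inv_stack = [(7, 1)] * 4   # 7x7 kernels, as in RedNet's involutions

print(receptive_field(conv_stack))  # 9
print(receptive_field(inv_stack))   # 25
```

Under these assumptions the larger receptive field follows purely from the 7x7 kernel size, and the same growth would occur for a 7x7 convolution.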

Is there any inherent property of involution that lets it summarize the context in a wider spatial arrangement?