Nice work on rethinking conv modules.
My question is: why can involution summarize the context over a wider spatial array?
As far as I can tell, the only factor that widens the receptive field is replacing the 3x3 convolutions of ResNet with 7x7 involutions to create RedNet.
Is there any inherent property of involution that lets it summarize the context over a wider spatial array?
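
To make the comparison concrete, here is a minimal sketch of the receptive-field arithmetic I have in mind. It counts only the nominal kernel footprint, so a convolution and an involution of the same kernel size give the same number. The helper `receptive_field` and the depth of four stride-1 layers are my own illustrative assumptions, not the actual ResNet or RedNet configuration.

```python
def receptive_field(layers):
    """Nominal receptive field of a stack of spatial layers.

    layers: list of (kernel_size, stride) tuples, ordered input to output.
    """
    rf, jump = 1, 1  # jump = spacing of adjacent outputs, in input pixels
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# Hypothetical stack of four stride-1 spatial layers (illustrative only).
depth = 4
rf_conv3 = receptive_field([(3, 1)] * depth)  # 3x3 kernels
rf_inv7 = receptive_field([(7, 1)] * depth)   # 7x7 kernels
print(rf_conv3, rf_inv7)  # 9 25
```

In this arithmetic the gain comes entirely from the kernel size, which is why I am asking whether anything beyond the 3x3 to 7x7 change contributes.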