Closed: alexandercantrell closed this issue 3 months ago
Thanks for your interest. A 1x1 convolution is a linear operation, so you can fuse its weight $W_{1\times1}$ into the weight of the 3x3 convolution $W_{3\times3}$ by $W_{new}=W_{1\times1} \times W_{3\times3}$, i.e. a contraction over the shared channel dimension. I suggest you check out more literature on re-parameterization design, such as VanillaNet and DBB.
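As a concrete sketch of that fusion (assuming stride 1, no bias terms, and a 1x1 conv feeding into the 3x3; the shapes and names below are illustrative, not taken from the repo), the merged weight is a contraction over the intermediate channel axis, which can be verified numerically against the sequential pair:

```python
import numpy as np

def conv2d(x, w, pad):
    """Naive stride-1 2D convolution; x: (C_in, H, W), w: (C_out, C_in, k, k)."""
    c_out, _, k, _ = w.shape
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h, wd = x.shape[1], x.shape[2]
    out = np.zeros((c_out, h, wd))
    for o in range(c_out):
        for i in range(h):
            for j in range(wd):
                out[o, i, j] = np.sum(w[o] * xp[:, i:i + k, j:j + k])
    return out

rng = np.random.default_rng(0)
w1 = rng.standard_normal((4, 3, 1, 1))  # 1x1: C_in=3 -> C_mid=4 (illustrative shapes)
w3 = rng.standard_normal((5, 4, 3, 3))  # 3x3: C_mid=4 -> C_out=5

# Fuse the preceding 1x1 into the 3x3:
# W_new[o, i, :, :] = sum_m W_3x3[o, m, :, :] * W_1x1[m, i]
w_new = np.einsum('omhw,mi->oihw', w3, w1[:, :, 0, 0])

x = rng.standard_normal((3, 6, 6))
y_seq = conv2d(conv2d(x, w1, 0), w3, 1)  # 1x1 then 3x3, "same" padding
y_fused = conv2d(x, w_new, 1)            # single fused 3x3
assert np.allclose(y_seq, y_fused)
```

With bias terms the 1x1 bias would also have to be pushed through the 3x3 kernel, which adds a constant term per output channel; the sketch above deliberately leaves that out.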
Thanks for the quick response and the information! After taking another shot at the math I think I finally got it.
I was hoping you might be able to explain a little more about how you re-parameterize your RepConv block at inference time. In your paper you reference Ding et al. 2021 (RepVGG: Making VGG-style ConvNets Great Again), but their block structure is significantly different from the one in your available code seen here:
The RepVGG block has its convolutions in parallel, whereas this block appears to be sequential, which makes it harder to re-parameterize. So far I've been able to partially re-parameterize the block by combining one 1x1 convolution with the 3x3 convolution, but I'm struggling to figure out how you combined that with the final 1x1 convolution. Thanks in advance for your help!
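For what it's worth, the trailing 1x1 folds in the same way: a 1x1 conv placed after the 3x3 is just a linear mix of its output channels, so the whole 1x1 → 3x3 → 1x1 chain collapses into one 3x3 kernel. Here is a small sketch under the same simplifying assumptions (stride 1, no bias; all shapes are illustrative, not from the repo):

```python
import numpy as np

def conv2d(x, w, pad):
    """Naive stride-1 2D convolution; x: (C_in, H, W), w: (C_out, C_in, k, k)."""
    c_out, _, k, _ = w.shape
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h, wd = x.shape[1], x.shape[2]
    out = np.zeros((c_out, h, wd))
    for o in range(c_out):
        for i in range(h):
            for j in range(wd):
                out[o, i, j] = np.sum(w[o] * xp[:, i:i + k, j:j + k])
    return out

rng = np.random.default_rng(1)
w_a = rng.standard_normal((4, 3, 1, 1))  # first 1x1:  C_in=3  -> 4
w_b = rng.standard_normal((5, 4, 3, 3))  # middle 3x3: 4       -> 5
w_c = rng.standard_normal((6, 5, 1, 1))  # final 1x1:  5       -> C_out=6

# Fold both 1x1s into the 3x3 in one contraction:
# W_new[p, i, :, :] = sum_{o, m} W_c[p, o] * W_b[o, m, :, :] * W_a[m, i]
w_new = np.einsum('po,omhw,mi->pihw', w_c[:, :, 0, 0], w_b, w_a[:, :, 0, 0])

x = rng.standard_normal((3, 6, 6))
y_seq = conv2d(conv2d(conv2d(x, w_a, 0), w_b, 1), w_c, 0)  # 1x1, 3x3, 1x1
y_fused = conv2d(x, w_new, 1)                              # one fused 3x3
assert np.allclose(y_seq, y_fused)
```

Note this only holds if the block has no nonlinearity or (unfolded) batch norm between the convolutions; any activation in between breaks the linearity the fusion relies on.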