How to understand the Structural re-parameterization of a RepVGG block

DingXiaoH / RepVGG

RepVGG: Making VGG-style ConvNets Great Again

MIT License


How to understand the Structural re-parameterization of a RepVGG block #78

Closed: jyang68sh closed this issue 2 years ago

jyang68sh commented 2 years ago

Hi! Really nice work!

I was trying to understand the re-parameterization part. (Attached screenshot: 2021-11-01_14h09_27)

But what I don't get is how the identity, drawn as a 3x3 kernel, becomes 2x1 in the end. I mean, why does it work without information loss?

Any answer is appreciated.

GiacomoPinardi commented 2 years ago

@jyang68sh First, consider that the identity branch (in yellow) does not have any parameters. If we want to represent this operation (the identity) as a convolution, we can construct a 3x3 kernel with a weight of 1 in the central cell, as shown in the figure. The 2x1 block that you see is not a kernel; it is the bias vector, with one entry per output channel.

Why do only two filters out of four have this weight? Because we are considering an example with C_in = 2 and C_out = 2.
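To make the construction above concrete, here is a minimal sketch in plain Python of the identity written as a 3x3 convolution for C_in = C_out = 2. The helper names `identity_kernel` and `conv3x3` are made up for this illustration and are not taken from the RepVGG repository. The sketch also assumes a bare identity with no batch normalization, so the bias is simply zero; in the figure the bias values come from fusing the branch's normalization, which is left out here.

```python
# Sketch only: helper names are illustrative, not from the RepVGG code base.

def identity_kernel(channels):
    """Return weights of shape [C_out][C_in][3][3] and a zero bias of length C_out."""
    weight = [[[[0.0] * 3 for _ in range(3)] for _ in range(channels)]
              for _ in range(channels)]
    for c in range(channels):
        # A 1 in the centre cell, only where output channel == input channel.
        weight[c][c][1][1] = 1.0
    bias = [0.0] * channels  # assumption: no batch norm, so nothing to fold into the bias
    return weight, bias


def conv3x3(x, weight, bias):
    """Naive 3x3 convolution, stride 1, zero padding 1. x has shape [C_in][H][W]."""
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    out = []
    for o in range(len(weight)):
        plane = [[bias[o]] * w for _ in range(h)]
        for i in range(c_in):
            for r in range(h):
                for c in range(w):
                    for dr in range(3):
                        for dc in range(3):
                            rr, cc = r + dr - 1, c + dc - 1
                            if 0 <= rr < h and 0 <= cc < w:  # skip the zero padding
                                plane[r][c] += weight[o][i][dr][dc] * x[i][rr][cc]
        out.append(plane)
    return out


# Two input channels, each 2x3.
x = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
     [[-1.0, 0.5, 2.5], [7.0, 8.0, 9.0]]]
weight, bias = identity_kernel(2)
y = conv3x3(x, weight, bias)

# Count how many of the C_out * C_in = 4 filters carry the centre weight.
nonzero = sum(1 for o in range(2) for i in range(2) if weight[o][i][1][1] != 0.0)
print(y == x, nonzero)
```

The output equals the input, so nothing is lost by writing the identity this way. Only the two filters on the "diagonal" (output channel 0 from input channel 0, output channel 1 from input channel 1) are non-zero, which matches the two highlighted filters out of four in the figure.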

jyang68sh commented 2 years ago

@GiacomoPinardi Hi, thanks for the reply.

Sorry for the late response. This answers my question.