According to the paper:

> We can also use a square matrix Ws in Eqn.(1). But we will show by experiments that the identity mapping is sufficient for addressing the degradation problem and is economical, and thus Ws is only used when matching dimensions.

In this implementation, a shortcut is added only when the dimensions do not match, and option A or B then decides whether that shortcut is an identity shortcut or a projection matrix.
https://github.com/akamaster/pytorch_resnet_cifar10/blob/d5489e8995e81e91ce6b1d69dcc98ad579b0b153/resnet.py#L65-L76
However, the correct implementation is to add the identity layer when option A is selected and the dimensions match, and to add a projection matrix otherwise.
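
To make the proposed rule concrete, here is a minimal sketch of the selection logic described above. It is not code from the repository: `make_shortcut` is a hypothetical helper, and the 1x1 convolution followed by batch norm is only an assumed form for the projection matrix.

```python
import torch
import torch.nn as nn


def make_shortcut(in_planes, planes, stride=1, option="A"):
    """Hypothetical helper that picks a shortcut following the rule proposed above."""
    # Dimensions match when neither the spatial size nor the channel count changes.
    dims_match = stride == 1 and in_planes == planes
    if option == "A" and dims_match:
        # Parameter-free identity shortcut.
        return nn.Identity()
    # Otherwise project the input (Ws in the paper's notation).
    # Assumption: the projection is a 1x1 convolution followed by batch norm.
    return nn.Sequential(
        nn.Conv2d(in_planes, planes, kernel_size=1, stride=stride, bias=False),
        nn.BatchNorm2d(planes),
    )


if __name__ == "__main__":
    x = torch.randn(2, 16, 8, 8)
    print(type(make_shortcut(16, 16)).__name__)         # identity case
    print(make_shortcut(16, 32, stride=2)(x).shape)     # projection case
```

With this rule, a block whose input and output shapes agree under option A adds its input unchanged, and every other case goes through the projection.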