Is weight decay applied only to the BN weight in the identity branch and the FC weight when using --custwd?

DingXiaoH / RepVGG

RepVGG: Making VGG-style ConvNets Great Again

Status: Closed (wangjinwei94 closed this issue 2 years ago)

wangjinwei94 commented 2 years ago

It seems that weight decay is skipped for all parameters other than rbr_identity.weight and linear.weight (https://github.com/DingXiaoH/RepVGG/blob/main/train.py#L86). Is this by design?

DingXiaoH commented 2 years ago

Yes, it is by design. We do not want to apply L2 regularization twice (once through weight_decay in the optimizer and again through the L2 loss added to the total loss).
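To make the "no double L2" reasoning concrete, here is a minimal sketch of how an optimizer's weight decay could be assigned per parameter. It is not the repository's code: the function name `split_weight_decay`, the example parameter names, and the substring rules (`rbr_dense`/`rbr_1x1` handled by a custom L2 term, bias and BN parameters never decayed) are assumptions chosen to reproduce the behavior described in this thread.

```python
def split_weight_decay(param_names, weight_decay, use_custom_l2):
    """Map each parameter name to the weight decay the optimizer should apply.

    Assumed rules (hypothetical, for illustration only):
      - conv-branch parameters ('rbr_dense', 'rbr_1x1') already receive an
        L2 penalty through a custom loss term, so the optimizer skips them;
      - bias and BN parameters are never decayed.
    """
    decay = {}
    for name in param_names:
        handled_by_custom_l2 = use_custom_l2 and (
            "rbr_dense" in name or "rbr_1x1" in name
        )
        is_bias_or_bn = "bias" in name or "bn" in name
        skip = handled_by_custom_l2 or is_bias_or_bn
        decay[name] = 0.0 if skip else weight_decay
    return decay


# Hypothetical parameter names in the style of a RepVGG block plus classifier.
names = [
    "stage1.0.rbr_dense.conv.weight",
    "stage1.0.rbr_dense.bn.weight",
    "stage1.0.rbr_1x1.conv.weight",
    "stage1.0.rbr_identity.weight",
    "stage1.0.rbr_identity.bias",
    "linear.weight",
    "linear.bias",
]

decay = split_weight_decay(names, weight_decay=1e-4, use_custom_l2=True)
decayed = [n for n, wd in decay.items() if wd > 0]
print(decayed)  # ['stage1.0.rbr_identity.weight', 'linear.weight']
```

Under these assumed rules, only rbr_identity.weight and linear.weight keep optimizer weight decay, which matches the observation in the question. The conv-branch weights are still regularized, but only once, through the custom L2 term in the loss.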

wangjinwei94 commented 2 years ago

Thanks! Sorry, I missed get_custom_L2.