facebookresearch / open_lth

A repository in preparation for open-sourcing lottery ticket hypothesis code.
MIT License

ResNet naming scheme #5

Closed mitchellnw closed 3 years ago

mitchellnw commented 3 years ago

Awesome code, enjoying it!

I'm probably missing something simple here, but I'm a bit confused about the ResNet naming.

I commonly see wide resnets referred to as WRN-40-2 or WRN-28-10 (e.g. in this repo https://github.com/kakaobrain/fast-autoaugment).

I believe the WRN-40-2 would be, in the nomenclature here, referred to as cifar_resnet_38_32.

However, the resulting network still has 40 conv/linear layers. Why is the depth referred to as 38 in open_lth? Are shortcuts not counted since they don't technically count towards "depth"? If so, why does everyone say WRN-40-2?

mitchellnw commented 3 years ago

Not urgent at all, just curious.

Thank you! Mitchell

jfrankle commented 3 years ago

Hi Mitchell -

The original ResNet paper and the WRN paper count layers in different ways, which has led to enormous confusion in the literature. Specifically, the original paper doesn't count skip connections that have weights attached to them, but the WRN paper does (e.g., ResNet-20 becomes WRN-22-N). Since I unified these networks into one implementation, I had to choose one scheme or the other, and I went with the original one. So subtract 2 from the depth of any CIFAR-10 WRN and you'll get the right name in this repository; WRN-40-2, for example, is cifar_resnet_38_32 here.

Jonathan
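
The renaming rule discussed above can be sketched as a small helper. This is only an illustration, not code from open_lth: the function name `wrn_to_open_lth_name` is hypothetical. The width mapping is an assumption, taken as 16 times the WRN widening factor because that is what the WRN-40-2 to cifar_resnet_38_32 example in this thread implies. The depth check relies on the WRN paper's convention that a depth d corresponds to (d - 4) / 6 blocks per stage.

```python
import re


def wrn_to_open_lth_name(wrn_name: str, base_width: int = 16) -> str:
    """Translate a name like 'WRN-40-2' into the naming used in this thread.

    Hypothetical helper: the depth is reduced by 2 (per the reply above) and
    the width is assumed to be base_width * widening factor.
    """
    match = re.fullmatch(r"WRN-(\d+)-(\d+)", wrn_name.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"Expected a name of the form WRN-<depth>-<widen>, got {wrn_name!r}")

    wrn_depth = int(match.group(1))
    widen = int(match.group(2))

    # WRN depths for CIFAR follow d = 6n + 4 for a whole number of blocks n >= 1.
    if wrn_depth < 10 or (wrn_depth - 4) % 6 != 0:
        raise ValueError(f"{wrn_depth} is not a valid CIFAR WRN depth (expected 6n + 4)")

    depth = wrn_depth - 2         # switch to the original ResNet counting (6n + 2)
    width = base_width * widen    # assumption: first-stage width scales with the widening factor
    return f"cifar_resnet_{depth}_{width}"


if __name__ == "__main__":
    for name in ("WRN-40-2", "WRN-28-10", "WRN-22-1"):
        print(name, "->", wrn_to_open_lth_name(name))
```

Under these assumptions, WRN-40-2 maps to cifar_resnet_38_32, matching the name given in the question.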