Why basewidth is divided by constant value of 64?

Res2Net / Res2Net-PretrainedModels

(ImageNet pretrained models) The official PyTorch implementation of the TPAMI paper "Res2Net: A New Multi-scale Backbone Architecture"

https://mmcheng.net/res2net/


Why basewidth is divided by constant value of 64? #41

Closed 009deep closed 4 years ago

009deep commented 4 years ago

This is about the following line in Bottle2neck:

width = int(math.floor(planes * (baseWidth/64.0)))

In the paper you mention n = w*s, but nothing about baseWidth.
The other approach is implemented here, where the planes size is always a multiple of w and s. But that is not the official version. I am wondering which one is correct, or which gives better results?

gasvn commented 4 years ago

As long as the model structure and the channel number of each split are the same, the code style makes no difference, and there should be no difference in performance. I didn't check those third-party implementations and cannot guarantee that their code is correct. width/baseWidth is just used to control the channel number in each split. We simply follow previous works such as ResNeXt in using this code style.
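
To make the arithmetic concrete, here is a minimal sketch of how the quoted line relates to n = w*s. It is not the official code: the helper names split_width and stage_channels are hypothetical, and the defaults baseWidth=26 and scale=4 are assumed from the "26w_4s" naming of the pretrained models. The reading that 64 is the planes value of the first ResNet stage, so that baseWidth is the per-split width at that stage and grows proportionally in later stages, is my interpretation rather than something stated in this thread.

```python
import math


def split_width(planes: int, base_width: int = 26) -> int:
    """Channels per split (w), using the formula quoted from Bottle2neck.

    base_width is the per-split width when planes == 64; for larger
    planes the width scales by planes / 64.
    """
    return int(math.floor(planes * (base_width / 64.0)))


def stage_channels(planes: int, base_width: int = 26, scale: int = 4):
    """Return (w, n), where n = w * s is the total width across all splits."""
    width = split_width(planes, base_width)
    return width, width * scale


if __name__ == "__main__":
    # planes values of the four stages in a standard ResNet-style backbone
    for planes in (64, 128, 256, 512):
        w, n = stage_channels(planes)
        print(f"planes={planes:3d}  w={w:3d}  n=w*s={n:3d}")
```

Under these assumptions the first stage gets w = 26 and n = 104, and each later stage doubles both, so n = w*s still holds at every stage. An implementation that passes w and s directly should produce the same network as long as it arrives at the same per-split channel counts.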