009deep closed this issue 4 years ago
As long as the model structure and the channel number of each split are the same, the code style makes no difference, and there should be no difference in performance. I didn't check those third-party implementations and cannot guarantee that their code is right. width/baseWidth is just used to control the channel number in each split. We simply follow previous works such as Res2NeXt in using this code style.
It's regarding the following implementation in Bottle2neck:
width = int(math.floor(planes * (baseWidth/64.0)))
In the paper, you mention n = w*s but nothing about baseWidth? The other approach is implemented here, where the plane size is always a multiple of w and s. But that's not the official version. I am wondering which one is correct or gives better results?
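To make the formula concrete, here is a minimal sketch of how the quoted line turns planes and baseWidth into a per-split channel count. The helper name `split_width` is hypothetical, and the defaults `base_width=26` and `scale=4` are assumptions chosen for illustration, not values confirmed in this thread. Reading n = w*s as "total channels across all splits" is also an assumption about the paper's notation.

```python
import math


def split_width(planes, base_width=26, scale=4):
    """Return (channels per split, total channels across all splits).

    Uses the formula quoted from Bottle2neck. The defaults are
    illustrative assumptions, not values taken from this thread.
    """
    # Same expression as the quoted line: scale planes by baseWidth / 64.
    width = int(math.floor(planes * (base_width / 64.0)))
    # Assumed reading of n = w * s: total channels over `scale` splits.
    return width, width * scale


for planes in (64, 128, 256, 512):
    width, total = split_width(planes)
    print(f"planes={planes}: width per split={width}, total={total}")
```

With these assumed defaults, planes=64 gives 26 channels per split, so baseWidth can be read as the split width at the first stage, which then grows in proportion to planes.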