Yes, it is correct, since these channel counts are divided among the groups of the convolution.
Oh, really? Thanks for the answer~
But, hmm, I also looked at another implementation: https://github.com/kuangliu/pytorch-cifar/blob/master/models/resnext.py
There, the number of channels in the 3×3 conv is smaller than in the final 1×1 conv, even though it is also divided into groups. I checked the network architecture with print(net). Maybe you want to compare the two implementations?
thanks again~
Please make sure that you are executing with the correct command-line parameters. For --cardinality 32 --widen_factor 4 --depth 50 --base_width 4
I get:
(stage_1): Sequential(
  (stage_1_bottleneck_0): ResNeXtBottleneck(
    (conv_reduce): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn_reduce): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (conv_conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
    (bn): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (conv_expand): Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn_expand): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (shortcut): Sequential(
      (shortcut_conv): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (shortcut_bn): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
This corresponds exactly to the paper (https://arxiv.org/pdf/1611.05431.pdf).
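For reference, here is a minimal runnable sketch of that block in PyTorch. The layer names and channel sizes come from the printout above; the default arguments and the forward pass are my reconstruction, not the repository's exact code:

```python
import torch
import torch.nn as nn

class ResNeXtBottleneck(nn.Module):
    """Reconstruction of the printed block: in=64, D=128, out=256, 32 groups."""

    def __init__(self, in_channels=64, D=128, out_channels=256, cardinality=32):
        super().__init__()
        self.conv_reduce = nn.Conv2d(in_channels, D, kernel_size=1, bias=False)
        self.bn_reduce = nn.BatchNorm2d(D)
        # Grouped 3x3 conv: D channels split into `cardinality` groups.
        self.conv_conv = nn.Conv2d(D, D, kernel_size=3, stride=1, padding=1,
                                   groups=cardinality, bias=False)
        self.bn = nn.BatchNorm2d(D)
        self.conv_expand = nn.Conv2d(D, out_channels, kernel_size=1, bias=False)
        self.bn_expand = nn.BatchNorm2d(out_channels)
        # Projection shortcut, since in_channels != out_channels.
        self.shortcut = nn.Sequential()
        self.shortcut.add_module('shortcut_conv',
                                 nn.Conv2d(in_channels, out_channels,
                                           kernel_size=1, bias=False))
        self.shortcut.add_module('shortcut_bn', nn.BatchNorm2d(out_channels))

    def forward(self, x):
        out = torch.relu(self.bn_reduce(self.conv_reduce(x)))
        out = torch.relu(self.bn(self.conv_conv(out)))
        out = self.bn_expand(self.conv_expand(out))
        return torch.relu(out + self.shortcut(x))

block = ResNeXtBottleneck()
print(block)                                    # matches the printout above
print(block(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 256, 32, 32])
```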
Yes, I think when --cardinality 32 --widen_factor 4 --depth 50 --base_width 4, it is right.
But with the parameters in your train.py, --cardinality 8 --widen_factor 4 --depth 29 --base_width 64, I got:
(stage_1): Sequential(
  (stage_1_bottleneck_0): ResNeXtBottleneck(
    (conv_reduce): Conv2d(64, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn_reduce): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (conv_conv): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=8, bias=False)
    (bn): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (conv_expand): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn_expand): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (shortcut): Sequential(
      (shortcut_conv): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (shortcut_bn): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
I think this is wrong: in a bottleneck, the channel count of the middle conv should be half that of the output conv, but under this parameter setting it is twice as large.
Sorry, I don't know how to insert line breaks here, but you can test it yourself.
ResNeXt bottlenecks are a bit different: if you ask for a base width of 64 and a cardinality of 8, that is 64 * 8 = 512 channels, which are then divided into 8 groups of 64 channels.
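To make the arithmetic concrete, here is a small sketch of the width computation. The width_ratio scaling is my reading of model.py around the linked line, so treat the exact formula as an assumption:

```python
def bottleneck_width(out_channels, cardinality, base_width, widen_factor=4):
    # Scale the base width by how much this stage's output exceeds the first
    # stage's output (64 * widen_factor), then multiply by the group count.
    width_ratio = out_channels / (64.0 * widen_factor)
    return cardinality * int(base_width * width_ratio)

# ImageNet-style 32x4d: 32 groups of 4 -> 128, half of the 256 output.
print(bottleneck_width(out_channels=256, cardinality=32, base_width=4))  # 128
# CIFAR-style 8x64d: 8 groups of 64 -> 512, twice the 256 output.
print(bottleneck_width(out_channels=256, cardinality=8, base_width=64))  # 512
```

Both printouts are therefore consistent with the same formula; the 8x64d configuration simply makes the grouped conv wider than the block output.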
Maybe I am wrong; could you run the original Torch code and compare it with mine to make sure?
Running the Torch code is a bit troublesome, but I think you are right.
I tested a total of three implementations:
yours: https://github.com/prlz77/ResNeXt.pytorch/blob/master/models/model.py
https://github.com/kuangliu/pytorch-cifar/blob/master/models/resnext.py
https://github.com/D-X-Y/ResNeXt-DenseNet/blob/master/models/resnext.py
Your network architecture is the same as the third one, but different from the second.
After I finish my work, I will check again. Really, thanks for your answer~
https://github.com/prlz77/ResNeXt.pytorch/blob/48c19fba72a0d3971ba9edd6c4e61f860c3df519/models/model.py#L39
Hi, this may be a stupid question. I have not read the original paper, but I think the 3×3 conv layer should have fewer channels than the 1×1 convs, to reduce the computational complexity.
I printed the channels after line 39:
print(widen_factor, in_channels, D, out_channels)
and the output:
4 64 512 256
4 256 512 256
4 256 512 256
4 256 1024 512
4 512 1024 512
4 512 1024 512
4 512 2048 1024
4 1024 2048 1024
4 1024 2048 1024
Is that right? Thanks for the answer.
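For reference, these tuples can be reproduced with the same width formula sketched earlier; the three-blocks-per-stage layout is inferred from the printout, so treat it as an assumption:

```python
cardinality, base_width, widen_factor = 8, 64, 4

in_channels = 64
for out_channels in (256, 512, 1024):  # three stages
    for _ in range(3):                 # depth 29 -> 3 bottlenecks per stage
        width_ratio = out_channels / (64.0 * widen_factor)
        D = cardinality * int(base_width * width_ratio)
        print(widen_factor, in_channels, D, out_channels)
        in_channels = out_channels
```

Note that D grows in lockstep with out_channels, so the ratio D/out_channels stays at 2 throughout this configuration, exactly as printed.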