kuangliu / pytorch-cifar

95.47% on CIFAR10 with PyTorch

This architecture is not following the paper for CIFAR-10 #57


PabloRR100 commented 6 years ago

Hi,

In the ResNet publication they propose a different architecture when it comes to CIFAR-10, right?

The plain/residual architectures follow the form in Fig. 3 (middle/right). The network inputs are 32×32 images, with the per-pixel mean subtracted. The first layer is 3×3 convolutions. Then we use a stack of 6n layers with 3×3 convolutions on the feature maps of sizes {32, 16, 8} respectively, with 2n layers for each feature map size. The numbers of filters are {16, 32, 64} respectively. The subsampling is performed by convolutions with a stride of 2. The network ends with a global average pooling, a 10-way fully-connected layer, and softmax. There are totally 6n+2 stacked weighted layers. The following table summarizes the architecture:

There are only 3 stages, and the numbers of filters are [16, 32, 64], not [64, 128, 256, 512] as for ImageNet.
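
For concreteness, here is a minimal sketch (not the repo's code; the class and function names are placeholders) of the 6n+2 architecture quoted above, with n = 3 giving ResNet-20. It uses a 1×1-conv shortcut for the subsampling blocks rather than the parameter-free option A padding used in the paper's CIFAR-10 experiments:

```python
# Sketch of the paper's CIFAR-10 ResNet: a first 3x3 conv, three stacks of n basic
# blocks with {16, 32, 64} filters on {32, 16, 8} feature maps, global average
# pooling, and a 10-way FC layer -- 6n + 2 weighted layers in total.
import torch.nn as nn
import torch.nn.functional as F

class BasicBlock(nn.Module):
    def __init__(self, in_planes, planes, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_planes, planes, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, 3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != planes:
            # 1x1-conv shortcut (option B); the paper's CIFAR runs use option A padding.
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_planes, planes, 1, stride=stride, bias=False),
                nn.BatchNorm2d(planes),
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + self.shortcut(x))

class CifarResNet(nn.Module):
    def __init__(self, n=3, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(16)
        self.layer1 = self._make_stack(16, 16, n, stride=1)  # 32x32 feature maps
        self.layer2 = self._make_stack(16, 32, n, stride=2)  # 16x16 feature maps
        self.layer3 = self._make_stack(32, 64, n, stride=2)  # 8x8 feature maps
        self.fc = nn.Linear(64, num_classes)

    @staticmethod
    def _make_stack(in_planes, planes, n, stride):
        blocks = [BasicBlock(in_planes, planes, stride)]
        blocks += [BasicBlock(planes, planes, 1) for _ in range(n - 1)]
        return nn.Sequential(*blocks)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.layer3(self.layer2(self.layer1(out)))
        out = F.adaptive_avg_pool2d(out, 1).flatten(1)  # global average pooling
        return self.fc(out)

# resnet20 = CifarResNet(n=3)  # 6*3 + 2 = 20 weighted layers
```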

kudkudak commented 6 years ago

I actually noticed the same thing here: https://github.com/facebookresearch/mixup-cifar10/issues/3. On top of that, it seems to me that the final BN layer is missing.
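
If this refers to the pre-activation variant that mixup-cifar10 builds on, the pre-activation paper places one more BN + ReLU after the last residual block, before pooling. A hedged sketch of that classifier head (illustrative names, standard PyTorch; not the repo's code):

```python
# Sketch only: in a pre-activation ResNet, the output of the last residual block is
# expected to pass through a final BN + ReLU before global average pooling.
import torch.nn as nn
import torch.nn.functional as F

class PreActHead(nn.Module):
    def __init__(self, planes=64, num_classes=10):
        super().__init__()
        self.bn_final = nn.BatchNorm2d(planes)  # the "final BN" in question
        self.fc = nn.Linear(planes, num_classes)

    def forward(self, out):                      # `out` = last residual block output
        out = F.relu(self.bn_final(out))         # final BN + ReLU
        out = F.adaptive_avg_pool2d(out, 1).flatten(1)
        return self.fc(out)
```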

akamaster commented 5 years ago

This implementation for PyTorch does correspond to the actual paper.

Lornatang commented 4 years ago


The CIFAR architecture from the original ResNet paper is reproduced in my repository; please see https://github.com/Lornatang/ResNet/tree/master/examples/cifar. Thank you, and please give me some suggestions. @PabloRR100