Hi, thank you for your code. I'm confused about why you use ShortcutB (1x1 conv) when the feature map size is halved. As far as I know, the original paper uses ShortcutA (zero padding) for the networks trained on CIFAR-10. I have tried training ResNet with ShortcutA in PyTorch on CIFAR-10, but I couldn't get results as good as the ones you report. I don't think the shortcut type is the problem, though, since I couldn't match the results of the original paper either. Have you tried ShortcutA? If so, what results did you get? Thank you.
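For reference, here is a minimal sketch of the two shortcut types as I understand them. The names `ShortcutA` and `shortcut_b` are only illustrative and are not taken from your repository. Padding the extra channels at the end of the channel dimension and adding batch norm after the 1x1 conv are my assumptions. Other implementations may differ on both points.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ShortcutA(nn.Module):
    """Parameter-free shortcut: subsample spatially, zero-pad the channels."""

    def __init__(self, in_channels, out_channels, stride=2):
        super().__init__()
        assert out_channels >= in_channels
        self.stride = stride
        self.extra = out_channels - in_channels

    def forward(self, x):
        # x has shape (N, C, H, W); keep every `stride`-th pixel.
        x = x[:, :, ::self.stride, ::self.stride]
        # F.pad takes pairs starting from the last dimension:
        # (W_left, W_right, H_top, H_bottom, C_front, C_back).
        # Assumption: the new channels are appended after the existing ones.
        return F.pad(x, (0, 0, 0, 0, 0, self.extra))


def shortcut_b(in_channels, out_channels, stride=2):
    """Projection shortcut: strided 1x1 conv (batch norm is an assumption)."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=1,
                  stride=stride, bias=False),
        nn.BatchNorm2d(out_channels),
    )


if __name__ == "__main__":
    x = torch.randn(2, 16, 32, 32)
    print(ShortcutA(16, 32)(x).shape)
    print(shortcut_b(16, 32)(x).shape)
```

Both variants produce the same output shape. The difference is that ShortcutA adds no learnable parameters, while ShortcutB learns a projection.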