gcr / torch-residual-networks

This is a Torch implementation of ["Deep Residual Learning for Image Recognition", Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun](http://arxiv.org/abs/1512.03385), the winners of the 2015 ILSVRC and COCO challenges.
zlib License

Do you do data augmentation on CIFAR during training? #7

Closed eriche2016 closed 8 years ago

gcr commented 8 years ago

Hello!

We used the same preprocessing strategy as in the paper by cropping 2px off all sides. See lines 60-71 of https://github.com/gcr/torch-residual-networks/blob/master/data/cifar-dataset.lua#L60-71
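
The linked Lua file is not reproduced in this thread, so the following is only a rough sketch of one common CIFAR augmentation scheme (random translation by zero-padding and cropping back, plus a horizontal flip), not the repository's code. The function names and the default `pad=2` are assumptions chosen to mirror the 2px figure mentioned above.

```python
import random


def pad_and_random_crop(image, pad=2, rng=random):
    """Zero-pad a 2-D image by `pad` pixels per side, then crop back to the
    original size at a random offset (a random shift of up to `pad` pixels).

    `image` is a list of rows; `pad` and `rng` are illustrative parameters.
    """
    h, w = len(image), len(image[0])
    padded_w = w + 2 * pad
    # Build the zero-padded image: blank rows on top, padded rows, blank rows below.
    padded = [[0] * padded_w for _ in range(pad)]
    padded += [[0] * pad + list(row) + [0] * pad for row in image]
    padded += [[0] * padded_w for _ in range(pad)]
    # Choose the top-left corner of the crop window inside the padded image.
    top = rng.randint(0, 2 * pad)
    left = rng.randint(0, 2 * pad)
    return [row[left:left + w] for row in padded[top:top + h]]


def horizontal_flip(image):
    """Mirror a 2-D image left to right."""
    return [row[::-1] for row in image]
```

In a training loop, each channel of each image would be passed through the same crop offset (and optionally the flip) once per epoch; test images are left untouched.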

eriche2016 commented 8 years ago

Hi, I have also implemented a residual net, but somehow my test error only drops to around 9 percent. The only difference is that, when the number of input feature maps differs from the number of output feature maps of a residual block, my skip path uses a convolution with kernel size 1. Is that a problem? Or is it just that my parameter initialization is not good, so that I need to wait more than 80 epochs before decreasing the learning rate by a factor of 10?
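
For reference, the paper describes two kinds of skip path for blocks whose input and output dimensions differ: a parameter-free identity shortcut padded with zero channels, and a projection shortcut implemented as a 1x1 convolution (the variant asked about here). The sketch below contrasts the two on plain nested lists; the function names, the `[channel][row][col]` layout, and the `stride` parameter are illustrative assumptions, not code from either repository.

```python
def zero_pad_shortcut(x, out_channels, stride=1):
    """Parameter-free shortcut: subsample spatially, keep the input channels,
    and append all-zero channels until there are `out_channels` of them."""
    h, w = len(x[0]), len(x[0][0])
    rows, cols = range(0, h, stride), range(0, w, stride)
    kept = [[[ch[r][c] for c in cols] for r in rows] for ch in x]
    zeros = [[[0.0 for _ in cols] for _ in rows]
             for _ in range(out_channels - len(x))]
    return kept + zeros


def projection_shortcut(x, weight, stride=1):
    """1x1 convolution shortcut: at every (subsampled) pixel, output channel o
    is the weighted sum of the input channels, using weight[o][i]."""
    h, w = len(x[0]), len(x[0][0])
    rows, cols = range(0, h, stride), range(0, w, stride)
    return [[[sum(weight[o][i] * x[i][r][c] for i in range(len(x)))
              for c in cols] for r in rows]
            for o in range(len(weight))]
```

The practical difference is that the projection adds learnable weights to the skip path, while the zero-padded identity adds none.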

eriche2016 commented 8 years ago

Here is my version: https://github.com/eriche2016/deep_residual_networks_cifar

gcr commented 8 years ago

Hm. In my case, the network begins making progress almost immediately. I found that weight initialization is important: the Torch default weights cause slower convergence.
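
The thread does not say which initialization was used instead of the default, so the sketch below only illustrates the size of the gap for one common alternative. It assumes that Torch's default convolution initialization draws weights uniformly from ±1/sqrt(fan_in), and compares that with He-style Gaussian initialization with standard deviation sqrt(2/fan_in), where fan_in = input channels × kernel height × kernel width. All function names are hypothetical.

```python
import math
import random


def torch_default_std(fan_in):
    """Std of a uniform(-a, a) draw with a = 1/sqrt(fan_in) (assumed default)."""
    bound = 1.0 / math.sqrt(fan_in)
    return bound / math.sqrt(3.0)  # std of uniform(-a, a) is a / sqrt(3)


def he_std(fan_in):
    """Std used by He-style initialization for ReLU networks (fan-in variant)."""
    return math.sqrt(2.0 / fan_in)


def sample_he_weights(n_out, n_in, k, rng=random):
    """Draw an (n_out x fan_in) weight matrix for a k x k convolution."""
    fan_in = n_in * k * k
    std = he_std(fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(n_out)]


# Under these assumptions the He-style std is sqrt(6) times the default std,
# regardless of layer size.
ratio = he_std(16 * 3 * 3) / torch_default_std(16 * 3 * 3)
```

Under these assumptions the default weights are about 2.4 times smaller in scale at every layer, which shrinks activations as depth grows and is one plausible reason for slower early progress.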