Hi, I have also implemented a residual net, but somehow my test error only drops to around 9 percent. The only difference is in my implementation of the skip path: when the number of input feature maps differs from the output of the residual block, I use a convolution with kernel size 1 (see the sketch below). Is that a problem? Or is it just that my parameter initialization is not good, so I need to wait more than 80 epochs before decreasing the learning rate by 10?
Here is my version: https://github.com/eriche2016/deep_residual_networks_cifar
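For clarity, this is roughly what I mean by the 1x1 projection shortcut. It is only a sketch written with standard Torch `nn` modules, not my exact code from the repo:

```lua
require 'nn'

-- Sketch only: project the skip path with a 1x1 convolution whenever the
-- channel count or spatial size changes, otherwise use a plain identity.
local function shortcut(nInput, nOutput, stride)
   if nInput == nOutput and stride == 1 then
      return nn.Identity()
   end
   -- 1x1 convolution; stride matches the block's downsampling
   return nn.SpatialConvolution(nInput, nOutput, 1, 1, stride, stride)
end

local function residualBlock(nInput, nOutput, stride)
   local path = nn.Sequential()
      :add(nn.SpatialConvolution(nInput, nOutput, 3, 3, stride, stride, 1, 1))
      :add(nn.SpatialBatchNormalization(nOutput))
      :add(nn.ReLU(true))
      :add(nn.SpatialConvolution(nOutput, nOutput, 3, 3, 1, 1, 1, 1))
      :add(nn.SpatialBatchNormalization(nOutput))
   return nn.Sequential()
      :add(nn.ConcatTable():add(path):add(shortcut(nInput, nOutput, stride)))
      :add(nn.CAddTable())
      :add(nn.ReLU(true))
end
```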
Hm. In my case, the network begins making progress almost immediately. I found that weight initialization is important: the Torch default weights cause slower convergence.
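For reference, here is a rough sketch of MSRA-style ("He") initialization for the convolution layers, which converges faster than the Torch default. This is illustrative only and not necessarily the exact code used in this repo:

```lua
require 'nn'

-- Illustrative sketch, not necessarily this repo's code: re-initialize all
-- SpatialConvolution weights as N(0, sqrt(2/n)) where n = kW*kH*nOutputPlane.
local function msraInit(model)
   for _, m in ipairs(model:findModules('nn.SpatialConvolution')) do
      local n = m.kW * m.kH * m.nOutputPlane
      m.weight:normal(0, math.sqrt(2 / n))
      if m.bias then m.bias:zero() end
   end
end
```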
Hello!
We used the same preprocessing strategy as in the paper by cropping 2px off all sides. See lines 60-71 of https://github.com/gcr/torch-residual-networks/blob/master/data/cifar-dataset.lua#L60-71
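As a hypothetical illustration of that cropping step (the real implementation is in the linked cifar-dataset.lua, which may differ in detail):

```lua
-- Hypothetical sketch: crop 2 px off every side of a 3x32x32 CIFAR image,
-- leaving a 3x28x28 region.
local function crop2px(img)
   local h, w = img:size(2), img:size(3)
   return img:narrow(2, 3, h - 4):narrow(3, 3, w - 4):contiguous()
end
```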