chenhsuanlin / inverse-compositional-STN

Inverse Compositional Spatial Transformer Networks :performing_arts: (CVPR 2017 oral)
MIT License

Large test error #4

Closed: moshanATucsd closed this issue 7 years ago

moshanATucsd commented 7 years ago

Hi,

Nice work! I am trying to reproduce the results shown in the paper, but with IC-STN at depth = 2 the test error after 100,000 iterations is 14.72%, and the loss is 0.5346. For IC-STN with depth = 4, the test error goes up dramatically toward the end, which looks like overfitting. I am running train.py without any modification.

I am wondering whether we have to change hyperparameters such as the learning rate or the number of training iterations in order to obtain results like those in Table 2 of the paper, where errors are around 1%.

chenhsuanlin commented 7 years ago

Sorry for the late reply! I'm not exactly sure what you mean by overfitting. If you pull up TensorBoard and see the error suddenly rise without coming back down, you may want to restart and resume from the last good checkpoint. We do find that IC-STNs can sometimes be unstable to train (something we are investigating further). Other than that, I'd suggest training the networks longer: my reported results were trained for 200K iterations. You could also try different initialization settings to see if that helps convergence.
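The "resume from the last good checkpoint" idea can be sketched as below. This is a toy illustration, not the repo's code: `step_fn`, `spike_factor`, and the rollback criterion are hypothetical stand-ins, and in practice the snapshot would be a TensorFlow checkpoint saved with `tf.train.Saver` rather than an in-memory dict.

```python
import math

def train_with_rollback(step_fn, params, num_steps, spike_factor=3.0):
    """Toy sketch: if the loss suddenly spikes, restore the parameters
    from the last known-good snapshot instead of continuing to train
    from the diverged state. (spike_factor is a hypothetical threshold;
    a real setup would restore a saved TensorFlow checkpoint.)"""
    best_loss = math.inf
    checkpoint = dict(params)   # last known-good snapshot
    rollbacks = 0
    for step in range(num_steps):
        loss = step_fn(params, step)        # one training step, returns loss
        if loss > spike_factor * best_loss:
            params.update(checkpoint)       # training blew up: roll back
            rollbacks += 1
        elif loss < best_loss:
            best_loss = loss
            checkpoint = dict(params)       # save a new snapshot
    return params, rollbacks
```

The same pattern works with periodic on-disk checkpoints: keep the checkpoint whose validation error was lowest, and restart from it when the curve diverges.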

moshanATucsd commented 7 years ago

I see, thanks! I tried training for 200K iterations, but with depth = 4 the error is still large in the end. I found that adding batch-norm layers helps improve stability and convergence, although I am not sure whether it is a good idea to add batch-norm layers in the recurrent version of the STN.
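For reference, the batch-norm forward pass being discussed can be sketched in a few lines of numpy (gamma and beta are fixed here rather than learned, purely for illustration). In the recurrent/unrolled IC-STN setting, the open design question is whether each warp iteration keeps its own normalization statistics or all iterations share one set; per-step statistics is the common choice for unrolled recurrences.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Plain batch normalization over the batch axis (axis 0).
    gamma/beta would normally be learned parameters; they are
    fixed scalars here for illustration only."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    # normalize each feature to zero mean / unit variance, then rescale
    return gamma * (x - mean) / np.sqrt(var + eps) + beta
```

In an unrolled network, applying one such normalization per warp iteration (with its own statistics) avoids forcing one set of statistics to fit activations from every step.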