BatchNormsync with Adam Optimizer

fyu / drn

Dilated Residual Networks

https://www.vis.xyz/pub/drn

BSD 3-Clause "New" or "Revised" License


BatchNormsync with Adam Optimizer #35

Open tarun005 opened 6 years ago

tarun005 commented 6 years ago

Is the bnsync code written specifically for the SGD optimizer? The loss does not converge when I train the model with the Adam optimizer.

d-li14 commented 5 years ago

@tarun005 Have you tested with SGD optimizer? Does it drive the training process to convergence?

tarun005 commented 5 years ago

Yes, the model converges with SGD, but the same model does not converge if I replace SGD with Adam.

d-li14 commented 5 years ago

@tarun005 I would expect BN to be independent of the optimization method. However, when I used syncbn by just adding the lib folder to $PATH, I got a 'segmentation fault' error. How are you using it?

tarun005 commented 5 years ago

I agree that BN shouldn't depend on the optimization method, but I have read somewhere that Adam requires global statistics at every iteration, so the BNsync implementation given here could be the issue.
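For context on what "global statistics" means here: synchronized BN normalizes with the mean and variance of the whole batch across all devices, while plain per-device BN uses only each device's own shard. Below is a minimal pure-Python sketch of that difference. The activation values and the helper names `batch_stats` and `synced_stats` are made up for illustration. This is not the repository's bnsync code, and it says nothing about how Adam behaves.

```python
from statistics import fmean, pvariance


def batch_stats(values):
    """Return (mean, population variance), the two statistics BN normalizes with."""
    return fmean(values), pvariance(values)


def synced_stats(shards):
    """Global (mean, variance) from per-device sums and sums of squares.

    This mirrors the usual reduction idea behind synchronized BN: each
    device contributes a count, a sum, and a sum of squares.
    """
    n = sum(len(s) for s in shards)
    total = sum(sum(s) for s in shards)
    total_sq = sum(sum(x * x for x in s) for s in shards)
    mean = total / n
    return mean, total_sq / n - mean * mean


# Hypothetical activations for one channel, split across two GPUs.
shard_gpu0 = [1.0, 2.0, 3.0, 4.0]
shard_gpu1 = [10.0, 12.0, 14.0, 16.0]

# Unsynchronized BN: each device sees only its own shard's statistics.
local_stats = [batch_stats(shard_gpu0), batch_stats(shard_gpu1)]

# Synchronized BN: one set of statistics for the whole batch.
global_stats = synced_stats([shard_gpu0, shard_gpu1])

print("per-device:", local_stats)
print("global:    ", global_stats)
```

When the shards differ as much as they do in this toy example, the per-device statistics are far from the global ones. That gap is what synchronization is meant to remove.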

jakubLangr commented 4 years ago

Hi, would either of you (@tarun005, @d-li14) be able to upload the models to e.g. Dropbox? They don't seem to be accessible on the Princeton site. That'd be awesome!