Open alashworth opened 5 years ago
Comment by bob-carpenter Friday Jul 31, 2015 at 19:59 GMT
I completely agree. Same for L-BFGS --- it should be abstracted from the application as much as possible for re-use.
Is there SGD in ADVI now? There shouldn't be given that there's no way for users to run it yet --- things should live on branches until they're ready to go.
We made the mistake of putting in higher-order autodiff and discrete sampling infrastructure before either was ready, and it's just been a huge burden.
Thanks for the refs!
On Jul 31, 2015, at 3:23 PM, Dustin Tran notifications@github.com wrote:
It would be worth having something generic for all things related to stochastic gradient descent, kept separate from variational inference itself. E.g., an SGD class offering different stochastic gradient methods, a learning rate class for testing various learning rates, a subsampling class, etc. This will eventually be necessary as we start working on more research tracks, e.g., Mandt and Blei (2014), Theis and Hoffman (2015), Tran et al. (2015).
It should also be applicable for computing the penalized MLE, so that the optimization interface of Stan also has SGD available for users.
Comment by syclik Thursday Dec 01, 2016 at 05:20 GMT
Branch from feature/issue-1751-service-methods if you're going to work on this soon.
Issue by dustinvtran Friday Jul 31, 2015 at 19:23 GMT Originally opened as https://github.com/stan-dev/stan/issues/1575
It would be worth having something generic for all things related to stochastic approximations, kept separate from variational inference itself. E.g., an SGD class offering different stochastic gradient methods, a learning rate class for testing various learning rates, a subsampling class, etc. This will eventually be necessary as we start working on more research tracks, e.g., Mandt and Blei (2014), Theis and Hoffman (2015), Tran et al. (2015).
It should also be applicable for computing the penalized MLE, so that the optimization interface of Stan also has SGD available for users.
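To make the proposed separation concrete, here is a rough sketch of how the three pieces (optimizer loop, learning rate, subsampling) could be decoupled. It is written in Python only for brevity and is not Stan code; every name in it (`InverseDecayRate`, `MinibatchSampler`, `sgd`, `make_mean_grad`) is hypothetical and made up for illustration, and the toy objective is an assumption chosen so that the answer is easy to check.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class InverseDecayRate:
    """Hypothetical learning-rate schedule: eta0 / (1 + decay * t)."""
    eta0: float = 0.5
    decay: float = 0.05

    def __call__(self, t: int) -> float:
        return self.eta0 / (1.0 + self.decay * t)


@dataclass
class MinibatchSampler:
    """Hypothetical subsampler: draws batch_size indices without replacement."""
    n: int
    batch_size: int
    rng: random.Random

    def __call__(self) -> List[int]:
        return self.rng.sample(range(self.n), self.batch_size)


def sgd(
    grad: Callable[[Sequence[float], Sequence[int]], List[float]],
    x0: Sequence[float],
    rate: Callable[[int], float],
    sampler: Callable[[], List[int]],
    num_iters: int,
) -> List[float]:
    """Plain SGD loop that knows nothing about the model, only about
    a gradient callback, a step-size schedule, and a subsampler."""
    x = list(x0)
    for t in range(num_iters):
        g = grad(x, sampler())   # noisy gradient on the sampled indices
        step = rate(t)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def make_mean_grad(data: Sequence[float]):
    """Toy objective (an assumption for this sketch): 0.5 * mean((x - y_i)^2),
    whose minimizer is the sample mean of the data."""
    def grad(x: Sequence[float], idx: Sequence[int]) -> List[float]:
        return [sum(x[0] - data[i] for i in idx) / len(idx)]
    return grad


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0]
    sampler = MinibatchSampler(n=len(data), batch_size=2, rng=random.Random(0))
    estimate = sgd(make_mean_grad(data), [0.0], InverseDecayRate(), sampler, 2000)
    print(estimate)  # should land near the sample mean, 3.0
```

Because `sgd` only sees callables, swapping the schedule or the subsampling scheme does not touch the loop, and the same loop could in principle serve both ADVI and a penalized-MLE objective, which is the point of the proposal.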