stan-dev / stan

Stan development repository. The master branch contains the current release. The develop branch contains the latest stable development. See the Developer Process Wiki for details.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License

Refactor stochastic gradient things in ADVI #1575

Open dustinvtran opened 9 years ago

dustinvtran commented 9 years ago

It would be worth having something generic for everything related to stochastic approximation, kept separate from variational inference itself. E.g., an SGD class that makes different stochastic gradient methods available, a learning rate class for testing various learning rates, a subsampling class, etc. This will eventually be necessary as we start working on more research tracks, e.g., Mandt and Blei (2014), Theis and Hoffman (2015), Tran et al. (2015).

It should also be applicable to computing the penalized MLE, so that Stan's optimization interface makes SGD available to users as well.
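To make the proposal concrete, here is a minimal sketch of how the pieces could be separated, assuming nothing about Stan's actual code. All names (`inverse_decay_rate`, `uniform_subsampler`, `sgd_minimize`, `penalized_mean`) are hypothetical and do not exist in Stan. The learning rate and the subsampler are independent policies passed to a generic SGD loop, and the same loop is then reused for a toy penalized MLE (a normal mean with unit variance and an L2 penalty) with no reference to variational inference.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical learning rate policy: eta_t = eta0 / (1 + decay * t).
class inverse_decay_rate {
 public:
  inverse_decay_rate(double eta0, double decay) : eta0_(eta0), decay_(decay) {}
  double operator()(std::size_t t) const {
    return eta0_ / (1.0 + decay_ * static_cast<double>(t));
  }

 private:
  double eta0_;
  double decay_;
};

// Hypothetical subsampler: draws minibatch indices uniformly with replacement.
class uniform_subsampler {
 public:
  uniform_subsampler(std::size_t n, std::size_t batch_size, unsigned seed)
      : dist_(0, n - 1), batch_size_(batch_size), rng_(seed) {
    assert(n > 0 && batch_size > 0);
  }
  std::vector<std::size_t> operator()() {
    std::vector<std::size_t> idx(batch_size_);
    for (auto& i : idx) i = dist_(rng_);
    return idx;
  }

 private:
  std::uniform_int_distribution<std::size_t> dist_;
  std::size_t batch_size_;
  std::mt19937 rng_;
};

// Generic SGD loop. grad(theta, batch) must return a noisy gradient of the
// objective to MINIMIZE; the loop knows nothing about where it comes from.
template <typename GradFn, typename Rate, typename Sampler>
std::vector<double> sgd_minimize(std::vector<double> theta, GradFn grad,
                                 Rate rate, Sampler sampler,
                                 std::size_t iterations) {
  for (std::size_t t = 0; t < iterations; ++t) {
    const std::vector<double> g = grad(theta, sampler());
    assert(g.size() == theta.size());
    const double eta = rate(t);
    for (std::size_t k = 0; k < theta.size(); ++k) theta[k] -= eta * g[k];
  }
  return theta;
}

// Toy penalized MLE: y_i ~ Normal(mu, 1) with penalty (lambda / 2) * mu^2.
// The objective is scaled by 1/n; its exact minimizer is sum(y) / (n + lambda).
double penalized_mean(const std::vector<double>& y, double lambda,
                      unsigned seed) {
  const double n = static_cast<double>(y.size());
  auto grad = [&y, lambda, n](const std::vector<double>& theta,
                              const std::vector<std::size_t>& batch) {
    double g = 0.0;
    for (std::size_t i : batch) g += theta[0] - y[i];  // minibatch likelihood term
    g /= static_cast<double>(batch.size());
    return std::vector<double>{g + lambda / n * theta[0]};  // plus penalty term
  };
  return sgd_minimize(std::vector<double>{0.0}, grad,
                      inverse_decay_rate(0.5, 0.01),
                      uniform_subsampler(y.size(), 2, seed), 20000)[0];
}
```

For example, `penalized_mean({1.0, 2.0, 3.0, 4.0, 5.0, 6.0}, 2.0, 42u)` should land near the closed-form answer 21 / 8 = 2.625, up to SGD noise. Swapping in a different step-size rule or sampling scheme only means passing a different callable, which is the kind of separation the issue asks for.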

bob-carpenter commented 9 years ago

I completely agree. Same for L-BFGS --- it should be abstracted from the application as much as possible for re-use.

Is there SGD in ADVI now? There shouldn't be, given that there's no way for users to run it yet --- things should live on branches until they're ready to go.

We made the mistake of putting in higher-order autodiff and discrete sampling infrastructure before either was ready, and it's just been a huge burden.

Thanks for the refs!

On Jul 31, 2015, at 3:23 PM, Dustin Tran notifications@github.com wrote:

It would be worth having something generic for all things related to stochastic gradient descent, to be separated from variational inference itself. E.g., an SGD class to have different stochastic gradient methods available, a learning rate class for testing various learning rates, a subsampling class, etc. This will eventually be necessary as we start working on more research tracks, e.g., Mandt and Blei (2014), Theis and Hoffman (2015), Tran et al. (2015).

It should also be applicable for computing the penalized MLE, so that the optimization interface of Stan also has SGD available for users.


syclik commented 7 years ago

Branch from feature/issue-1751-service-methods if you're going to work on this soon.