pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/

Stochastic Gradient Hamiltonian Monte Carlo #1958

Closed · shkr closed this 7 years ago

shkr commented 7 years ago

I was wondering if this is part of the roadmap, or if anyone is working on an implementation of this Hamiltonian Monte Carlo sampler?

If not, I can work on the implementation and submit a PR.

https://arxiv.org/pdf/1402.4102.pdf
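
For reference, a minimal sketch of a single SGHMC update as described in that paper (Chen, Fox & Guestrin, 2014); the function and variable names here are illustrative only, not existing PyMC3 API:

```python
import numpy as np

def sghmc_step(theta, r, stoch_grad_U, eps, M_inv, C, B_hat, rng=np.random):
    """One SGHMC update with friction (Chen et al. 2014, Eq. 15).

    theta, r     : current position and momentum (1-D arrays)
    stoch_grad_U : callable returning a minibatch estimate of grad U(theta)
    eps          : step size
    M_inv, C     : diagonal inverse mass matrix and friction term (arrays)
    B_hat        : estimate of the minibatch gradient noise (array)
    """
    theta = theta + eps * M_inv * r
    noise = rng.normal(size=r.shape) * np.sqrt(2 * (C - B_hat) * eps)
    r = r - eps * stoch_grad_U(theta) - eps * C * M_inv * r + noise
    return theta, r
```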

twiecki commented 7 years ago

@shkr Yes, that one has been of interest. @jsalvatier had some thoughts too, he might chime in. This paper was also relevant in that regard: http://aad.informatik.uni-freiburg.de/papers/16-NIPS-BOHamiANN.pdf

shkr commented 7 years ago

Thanks, that's very helpful! I skimmed through that paper; estimating the hyperparameters introduced in the original paper will make the sampler robust and user-friendly.

jsalvatier commented 7 years ago

It's important to note that Stochastic Gradient HMC unfortunately doesn't preserve detailed balance.

It may still be quite useful because it will be a scalable way of getting close to the region of high probability.

I suspect that Stochastic Gradient Langevin Dynamics does preserve detailed balance, though I haven't checked.

Langevin dynamics only has a minor scaling penalty relative to HMC (O(n^1.33) vs. O(n^1.25)).
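
For comparison, a minimal sketch of a single SGLD update (Welling & Teh, 2011): a noisy gradient step whose injected Gaussian noise is matched to the step size, with no momentum and, in the basic form, no accept/reject correction. Names here are illustrative, not PyMC3 API:

```python
import numpy as np

def sgld_step(theta, minibatch, grad_log_prior, grad_log_lik, eps, N, rng=np.random):
    """One SGLD update (Welling & Teh 2011).

    The minibatch gradient is rescaled by N / len(minibatch) so it is an
    unbiased estimate of the full-data gradient of the log posterior.
    """
    scale = N / len(minibatch)
    grad = grad_log_prior(theta) + scale * sum(grad_log_lik(theta, x) for x in minibatch)
    noise = rng.normal(size=theta.shape) * np.sqrt(eps)
    return theta + 0.5 * eps * grad + noise
```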

twiecki commented 7 years ago

@jsalvatier Good points.

@shkr Perhaps it's better to start with SGLD or SG Fisher Information (Max Welling).

shkr commented 7 years ago

This seems to be the latest (and most comprehensive) paper on SGLD: http://people.ee.duke.edu/~lcarin/782.pdf. I am now referring to this for the implementation instead of the earlier paper, given the evidence it presents.

asifzubair commented 7 years ago

Actually, this paper - https://arxiv.org/abs/1506.04696 - nicely summarizes all stochastic gradient based approaches.

shkr commented 7 years ago

I have read through the two papers. IMO the implementations of SGLD and SGFS have enough in common that they can share a base class. I have submitted a WIP PR (it is nowhere near completion).

Any pointers on how (and where) to handle the batching of the data from observed variables?

Currently, I am looking at:

https://github.com/pymc-devs/pymc3/blob/250e2f81a19c38a88b38be5cfef7a6c212890b1a/pymc3/tests/test_advi.py

and

https://github.com/pymc-devs/pymc3/blob/master/pymc3/variational/advi_minibatch.py

for reference

ferrine commented 7 years ago

Minibatches can be handled via a callback or the experimental pm.generator. You can use https://github.com/ferrine/pymc3/blob/master/pymc3/tests/test_variational_inference.py#L166 as a reference
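
A minimal sketch of the pm.generator approach, assuming (as in the linked test) that it accepts an infinite Python generator yielding minibatch arrays and that the distribution's total_size argument rescales the minibatch log-likelihood; the model and sizes below are made up for illustration:

```python
import numpy as np
import pymc3 as pm

data = np.random.randn(10000)  # full dataset (hypothetical)

def minibatches(batch_size=100):
    # infinite stream of random minibatches
    while True:
        idx = np.random.randint(0, len(data), batch_size)
        yield data[idx]

with pm.Model():
    mu = pm.Normal('mu', mu=0, sd=10)
    # each gradient evaluation draws a fresh minibatch from the generator;
    # total_size rescales the minibatch likelihood to the full dataset
    pm.Normal('obs', mu=mu, sd=1,
              observed=pm.generator(minibatches()),
              total_size=len(data))
```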

twiecki commented 7 years ago

I think pm.generator is the best API and we should target that.

asifzubair commented 7 years ago

Hi folks,

Some great discussion here. For completeness, I just wanted to add Betancourt's paper, http://proceedings.mlr.press/v37/betancourt15.pdf, which raises some concerns about stochastic gradient methods. I thought it would be nice to be mindful of it.

Thanks!

shkr commented 7 years ago

@asifzubair Thanks for the reference. I am running into a scaling issue with the number of parameters in the current SGFS implementation in PR #1977, using @twiecki's CNN example problem with Lasagne.

Do you have recommendations for other models with a smaller set of parameters to test it on?

twiecki commented 7 years ago

@shkr Maybe this one: http://pymc-devs.github.io/pymc3/notebooks/convolutional_vae_keras_advi.html Or http://pymc-devs.github.io/pymc3/notebooks/bayesian_neural_network_advi.html

shkr commented 7 years ago

@twiecki I will try those this weekend.

philipperemy commented 7 years ago

@shkr any updates?

junpenglao commented 7 years ago

@philipperemy SGMCMC is already implemented by @shkr in pymc3 ;-) You can have a look at an example here