pyro-ppl / pyro

Deep universal probabilistic programming with Python and PyTorch
http://pyro.ai
Apache License 2.0

Implement vectorized adaptive HMC #2902

Open eb8680 opened 3 years ago

eb8680 commented 3 years ago

Pyro's HMC and NUTS implementations are feature-complete and well-tested, but they are quite slow in models that operate on small tensors, like the one in our Bayesian regression tutorial. The reasons are largely beyond our control (mostly having to do with the design and implementation of torch.autograd). This is unfortunate because such models are often new users' first encounter with Pyro. Running multiple MCMC chains in parallel is one way of dealing with this problem, but Pyro's MCMC algorithms currently support only process-level parallelism with torch.multiprocessing, which is slower, memory-hungry, and error-prone on some platforms like Google Colab and Windows.

This issue proposes implementing the "ChEES-HMC" algorithm described in this paper as a new MCMC kernel where vectorization over individual MCMC chains happens via broadcasting in an additional plate context, similar to the vectorization over guide samples in our Trace*_ELBO implementations. While this algorithm is unlikely to replace NUTS in all contexts, vectorization over a large number of independent chains may be especially useful in alleviating PyTorch-related performance issues in small models.
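To make the idea of vectorizing over chains concrete, here is a minimal sketch in plain PyTorch. It is not Pyro's API and not the ChEES-HMC algorithm itself: it is ordinary HMC with a fixed step size and trajectory length shared by all chains, on an illustrative standard normal target. The names `log_prob`, `grad_log_prob`, and `batched_hmc_step` are hypothetical. The point is only that a leading chain dimension lets a single autograd call serve every chain at once.

```python
import torch


def log_prob(x):
    # Illustrative target: standard normal. x has shape (num_chains, dim).
    return -0.5 * (x ** 2).sum(-1)


def grad_log_prob(x):
    x = x.detach().requires_grad_(True)
    # Chains are independent, so the gradient of the summed log density
    # gives each chain's own gradient in a single backward pass.
    (grad,) = torch.autograd.grad(log_prob(x).sum(), x)
    return grad


def batched_hmc_step(x, step_size, num_steps):
    # One HMC transition applied to all chains simultaneously.
    p0 = torch.randn_like(x)
    x_new = x
    # Leapfrog integration: half momentum step, then alternating full steps,
    # finishing with a half momentum step.
    p = p0 + 0.5 * step_size * grad_log_prob(x_new)
    for i in range(num_steps):
        x_new = x_new + step_size * p
        scale = step_size if i < num_steps - 1 else 0.5 * step_size
        p = p + scale * grad_log_prob(x_new)
    # Per-chain Metropolis correction based on the change in total energy.
    h0 = -log_prob(x) + 0.5 * (p0 ** 2).sum(-1)
    h1 = -log_prob(x_new) + 0.5 * (p ** 2).sum(-1)
    accept = torch.rand(x.shape[0]) < torch.exp(h0 - h1)
    return torch.where(accept.unsqueeze(-1), x_new, x), accept


torch.manual_seed(0)
x = torch.zeros(256, 2)  # 256 chains, 2-dimensional target
for _ in range(200):
    x, accept = batched_hmc_step(x, step_size=0.3, num_steps=5)
```

Because every chain uses the same step size and the same number of leapfrog steps, the whole batch advances in lockstep with no per-chain control flow. A kernel along the lines proposed here would additionally need to adapt those shared quantities during warmup and to handle the chain dimension inside a Pyro model, neither of which this sketch attempts.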

Note that this proposal is more narrowly scoped than the general suggestion in #2539 to support broadcasting-based parallelization in our existing MCMC kernels, which as @fehiepsi said in #2539 is probably best deferred until better auto-vectorization functionality a la JAX's vmap is added to PyTorch.

This would be a great starting point for a contributor with some probabilistic ML expertise who is interested in adding a high-impact feature to Pyro while learning more about some of the internal inference APIs. If that sounds like you, please speak up! We're happy to help review draft code or discuss design issues.

karthikayan4u commented 3 years ago

Hello @eb8680, I am completely new to open source contribution. Although the proposal above contains a lot of terms that are unfamiliar to me, I am very open to learning and working on it. Please share some resources on the topic.

eb8680 commented 3 years ago

Hi @karthikayan4u, thanks for your interest in contributing to Pyro! The immediate background information necessary to understand this proposal is in the paper I linked to about ChEES-HMC.

Do you have much experience with applied Bayesian statistics or probabilistic programming, or have you perhaps taken a class based on either the Bishop or Murphy textbooks on probabilistic machine learning? If not, this might be a difficult place to start, and I would suggest making your first contribution an example or tutorial, which is the most valuable thing you can do for almost any open source project and also a great way to familiarize yourself with Pyro. One particularly easy and helpful task would be porting a version of the CVAE tutorial on the Pyro web page to NumPyro.

karthikayan4u commented 3 years ago

I have very little knowledge of the topics you mentioned, so I will start with the task you suggested. Meanwhile, I will try to take the classes you mentioned and come back to work on this proposal. Thank you, @eb8680.

Militeee commented 1 year ago

Hello @eb8680, did anybody implement it in the end? If not, I would be super interested in implementing some probabilistic ML algorithm like this one. I had a couple of courses in PML based on Bishop's book and I have been a Pyro user for quite a while now, so hopefully I should not be totally clueless. I will still need some help with the Pyro internals but I am super happy to learn more.

Cheers, S.

martinjankowiak commented 1 year ago

@Militeee No, this is still an open issue. Please take a crack at it if you're interested : ) One way to do that would be to open a new targeted issue where you describe what you plan to implement. That would also be a first opportunity for other developers to give you feedback before you write much code.

Militeee commented 1 year ago

Brilliant, I'll read the paper and open a new issue as soon as I have a plan for the implementation. Thanks!