blackjax-devs / blackjax

BlackJAX is a Bayesian Inference library designed for ease of use, speed and modularity.
https://blackjax-devs.github.io/blackjax/
Apache License 2.0
758 stars 97 forks source link

Implement multinomial HMC #383

Open rlouf opened 1 year ago

rlouf commented 1 year ago

As a step between HMC and NUTS it is a very good algorithm to have in Blackjax to demonstrate the flexibility of our internals. It also supposedly produces better samples that standard HMC.

GStechschulte commented 1 year ago

I have been thinking about working on this issue. However, after reviewing some of the literature on HMC, dynamic HMC, and NUTS, it is a bit unclear to me the explicit difference between dynamic HMC and NUTS.

NUTS versus HMC, at a high level, adaptively selects the number of steps such that the algorithm figures out when the path begins to turn around (so it does does turn back on itself; hence the U-turn), stopping the simulation at that point, and takes a sample to more effectively explore the target distribution. Ordinary HMC requires the user to manually tune the step size and number of steps. In Blackjax, the window adaption function adapts the values of the inverse mass matrix and step size parameters, leaving the number of steps adaptation to the NUTS algorithm.

From what I can gather, dynamic HMC and NUTS both "let go" of the notion of a fixed number of steps for the user to tune. However, their approaches for doing so are different. Dynamic HMC refers to a dynamic trajectory length (slide 21) using a growing tree to increase simulation trajectory until a no-U-turn criterion. But isn't NUTS also extending the simulated trajectory path (both forwards and backwards) until a condition is met to detect the U-turn?

If someone in the community could point me to a resource (or explain to me) with a clear difference between NUTS and dynamic HMC, that would be greatly appreciated.

Below is a list of resources I consulted for understanding this difference:

rlouf commented 1 year ago

I should have named this issue "multinomial HMC", I am sorry for the confusion. It is simply HMC but where you sample the trajectory instead of building it linearly, it is described at the end of Bettencourt's introduction to HMC.

GStechschulte commented 1 year ago

I should have named this issue "multinomial HMC", I am sorry for the confusion. It is simply HMC but where you sample the trajectory instead of building it linearly, it is described at the end of Bettencourt's introduction to HMC.

Ahhh that makes sense now. Haha no worries mate and thanks for the clarification. After a bit of researching I really began to question myself haha.

rlouf commented 1 year ago

Feel free to implement it, all the pieces are there 🙂