Gaussian kinetic energy centered at nonzero point

tpapp / DynamicHMC.jl

Implementation of robust dynamic Hamiltonian Monte Carlo methods (NUTS) in Julia.

Gaussian kinetic energy centered at nonzero point #100

Closed jzhan039 closed 3 years ago

jzhan039 commented 4 years ago

Hi Tamas,

If I wanted to construct a Gaussian kinetic energy centered at a nonzero point, how could I do it in this library? For example, suppose the distribution has 2 parameters, (x, y), and I somehow know that my problem has an approximate kinetic energy (|p| - p0)^2 / 2M, where |p| = (px^2 + py^2)^(1/2) and p0, M are constants. I think using a Gaussian kinetic energy centered at p0 would be more efficient, right? Note that it is the radial length of the momentum (px, py) that follows a Gaussian distribution centered at p0, not each component independently. I'm not sure if there is anything else I need to take care of other than redefining the kinetic energy.

Thanks.

Jin
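
A minimal sketch of the kinetic energy described above and its gradient, which gives the velocity dq/dt in Hamilton's equations. The function names are hypothetical and this is plain Python for illustration, not part of the DynamicHMC API.

```python
import math

def radial_kinetic_energy(p, p0, M):
    """K(p) = (|p| - p0)^2 / (2 M) for a momentum vector p (hypothetical helper)."""
    r = math.hypot(*p)
    return (r - p0) ** 2 / (2.0 * M)

def radial_kinetic_energy_gradient(p, p0, M):
    """Gradient of K with respect to p: ((|p| - p0) / (M |p|)) * p.

    The gradient is undefined at p = 0 when p0 != 0, so that case raises.
    """
    r = math.hypot(*p)
    if r == 0.0:
        raise ValueError("gradient is undefined at p = 0")
    scale = (r - p0) / (M * r)
    return tuple(scale * pi for pi in p)

p = (3.0, 4.0)                                         # |p| = 5
print(radial_kinetic_energy(p, p0=2.0, M=1.5))
print(radial_kinetic_energy_gradient(p, p0=2.0, M=1.5))
```

Because this K depends on p only through |p|, it satisfies K(p) = K(-p). One further point that may matter when drawing momenta: in two dimensions the density exp(-K(p)) implies a radial density proportional to |p| exp(-(|p| - p0)^2 / 2M), so |p| is only approximately Gaussian.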

tpapp commented 4 years ago

All the NUTS/HMC implementations I am familiar with use a symmetric KE specification, mostly Gaussian (I am aware of some experiments with fat-tailed KEs, but they didn't improve much in practice). I am not aware of any theoretical reasons for using non-symmetric KEs, but maybe I missed something.

Currently the symmetry of KE is very deeply ingrained in the implementation of DynamicHMC; see e.g. the docs of KineticEnergy. If you really want to experiment with this, you would need to define another type and reimplement pretty much all of src/hamiltonian.jl, and then add a reversibility correction somewhere (this I would have to look into, but I am willing to extend the interface if you really want to do this).

Unless you are an expert in differential geometry and you are convinced that this would help you, I would really advise against this; it is quite a bit of work with dubious benefits.
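
To illustrate why the symmetry matters for reversibility, here is a small sketch, assuming a one-dimensional standard normal target and a unit-mass Gaussian KE centered at p0. It is plain Python with hypothetical names, not DynamicHMC code. It runs leapfrog forward, transforms the momentum, runs forward again, and checks whether the position returns to its starting value.

```python
def leapfrog(q, p, grad_U, grad_K, step, n_steps):
    """Leapfrog integrator for a separable Hamiltonian H(q, p) = U(q) + K(p)."""
    for _ in range(n_steps):
        p = p - 0.5 * step * grad_U(q)   # half kick
        q = q + step * grad_K(p)         # full drift, velocity is dK/dp
        p = p - 0.5 * step * grad_U(q)   # half kick
    return q, p

def round_trip_error(p0, flip, q=0.3, p=1.1, step=0.1, n_steps=20):
    """Integrate, apply `flip` to the momentum, integrate again, and
    return the distance between the final and the initial position."""
    grad_U = lambda x: x         # U(q) = q^2 / 2 (assumed target)
    grad_K = lambda m: m - p0    # K(p) = (p - p0)^2 / 2 (assumed KE)
    q1, p1 = leapfrog(q, p, grad_U, grad_K, step, n_steps)
    q2, _ = leapfrog(q1, flip(p1, p0), grad_U, grad_K, step, n_steps)
    return abs(q2 - q)

negate = lambda m, p0: -m              # usual flip, relies on K(p) = K(-p)
reflect = lambda m, p0: 2.0 * p0 - m   # reflection about the center of K

print(round_trip_error(0.0, negate))   # centered KE: zero up to rounding
print(round_trip_error(1.0, negate))   # shifted KE, plain negation: far from zero
print(round_trip_error(1.0, reflect))  # shifted KE, reflection: zero up to rounding
```

In this toy setting plain negation stops retracing the trajectory once the KE is shifted, while reflecting about p0 restores it. What the corresponding correction would look like inside DynamicHMC is not something this sketch settles.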

jzhan039 commented 4 years ago

It's similar to Hamiltonian dynamics in a magnetic field. Hamilton's equations describe the time evolution of the coordinates and the canonical momentum, but the kinetic momentum is shifted relative to the canonical momentum. I just found a reference, https://arxiv.org/abs/1607.02738v2, where they introduce magnetic HMC and give examples showing improvements from the method.
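
For reference, the textbook form of this shift for a particle of mass m and charge e in a vector potential A(q) is written out below. The symbols e, m, A, and U are the standard physics notation and are assumptions here, not taken from the linked paper.

```latex
H(q, p) = \frac{\lVert p - e A(q) \rVert^{2}}{2m} + U(q),
\qquad
\dot{q} = \frac{\partial H}{\partial p} = \frac{p - e A(q)}{m}
```

Here p is the canonical momentum and m \dot{q} = p - e A(q) is the kinetic momentum. When A is constant, the kinetic term reduces to a Gaussian kinetic energy centered at the fixed point e A.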

tpapp commented 4 years ago

I agree that it is very interesting. Please take a look at the existing code and let me know if you want to experiment with this. I am very happy to help by making generalizations to the API to allow this; as I said above I would need to do momentum flipping for the Metropolis steps, but there may be something else.

jzhan039 commented 4 years ago

I realized that it could be mathematically equivalent to a special case of Riemann HMC, or at least comparably efficient. Maybe it's not worth the time. I'll think about it more deeply at some point.

tpapp commented 4 years ago

OK, let's leave this issue open in the meantime.

Incidentally, the tests already contain mixtures like the one mentioned in the paper; they just fail with NUTS if the "valley" is too low (as expected). This can happen a lot in real-life models, so I am interested in fixing it. I am holding off on RHMC since second derivatives are not really practical with AD at the moment.
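
As a rough illustration of the "valley" problem, the sketch below computes the potential energy barrier between the two components of an equal-weight, one-dimensional Gaussian mixture. This mixture and the function names are assumptions chosen for illustration, not the mixtures used in the DynamicHMC tests.

```python
import math

def mixture_potential(x, separation, sigma=1.0):
    """U(x) = -log density, up to an additive constant, of an equal-weight
    mixture of N(-separation/2, sigma^2) and N(+separation/2, sigma^2)."""
    a = -0.5 * ((x + separation / 2.0) / sigma) ** 2
    b = -0.5 * ((x - separation / 2.0) / sigma) ** 2
    m = max(a, b)  # log-sum-exp shift for numerical stability
    return -(m + math.log(0.5 * math.exp(a - m) + 0.5 * math.exp(b - m)))

def valley_barrier(separation, sigma=1.0):
    """Potential at the midpoint minus the potential at a component mean."""
    at_midpoint = mixture_potential(0.0, separation, sigma)
    at_mean = mixture_potential(separation / 2.0, separation, sigma)
    return at_midpoint - at_mean

for d in (2.0, 4.0, 8.0):
    print(d, valley_barrier(d))
```

Since leapfrog approximately conserves the total energy, a trajectory that starts near one component needs roughly this much kinetic energy to cross the midpoint. With a unit Gaussian KE in one dimension the mean kinetic energy is 1/2, so crossings become rare as the barrier grows with the separation.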

tpapp commented 3 years ago

I am closing this because of lack of activity; feel free to ping here if you are still interested and I will reopen.