aesara-devs / aehmc

An HMC/NUTS implementation in Aesara
MIT License
33 stars 6 forks source link

Incorrect momentum samples generated when inverse mass matrix is 2D. #92

Closed zoj613 closed 1 year ago

zoj613 commented 1 year ago

Description of your problem or feature request

The aim is to compute the lower triangular factor of the inverse mass matrix (lets call this $L$) such that the inverse mass matrix can be factored as $M^{-1} = LL^T$. Next we want to use $L$ in order to generate a sample from $\mathcal{N}(0, M)$ using the formula $(L^T)^{-1} z$ where $z$ is a standard normal sample. This is because the actual distribution's covariance ($M$) is factored as $(L^T)^{-1}L^{-1}$. From looking at https://github.com/aesara-devs/aehmc/blob/d54e2d05512d8d3d4aea92b8732854e6794296e8/aehmc/metrics.py#L54-L57, it seems like the call to cholesky returns $L$ (since by default the lower triangular matrix is returned). Then we try to invert this lower triangular to obtain the actual factor to use for generating a sample from this distribution (using my notation above, this is $(L^T)^{-1}$. But the call to solve_triangular appears to be solving $LX = I$ which returns $L^{-1}$, instead of $(L^T)^{-1}$.

Please provide any additional information below. To fix this we need to either make sure cholesky returns $L^T$ or make sure solve_triangular call solves the right equation by specifying trans=True.

Versions and main components