Closed by giacomodecarlo 2 years ago
Thanks @giacomodecarlo! The code looks good to me. I would generalise it a little further and add the following functions to the distribution.h file:
// here wood_factor is L^{-1} * Lambda^T * Psi^{-1}, with L the lower Cholesky
// factor of (I + Lambda^T * Psi^{-1} * Lambda) and Psi = diag(sigma_diag);
// cov_logdet is the precomputed log-determinant of Psi + Lambda * Lambda^T
double multi_normal_lpdf_woodbury_chol(Eigen::VectorXd y, Eigen::VectorXd mean,
                                       Eigen::VectorXd sigma_diag,
                                       Eigen::MatrixXd wood_factor,
                                       double cov_logdet) {
  double exponent =
      -0.5 * ((y - mean).dot((y - mean).cwiseQuotient(sigma_diag)) -
              (wood_factor * (y - mean)).squaredNorm());
  double base = -0.5 * cov_logdet + NEG_LOG_SQRT_TWO_PI * y.size();
  return base + exponent;
}
double multi_normal_lpdf_woodbury(Eigen::VectorXd y, Eigen::VectorXd mean,
                                  Eigen::VectorXd sigma_diag,
                                  Eigen::MatrixXd lambda) {
  Eigen::MatrixXd wood_factor = ...  // build L^{-1} * Lambda^T * Psi^{-1} from sigma_diag and lambda
  return multi_normal_lpdf_woodbury_chol(...);
}
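For completeness, here is one way those ellipses could be filled in. This is only a sketch assuming Psi = diag(sigma_diag) and the five-argument helper sketched above; the names and exact signature may differ in the final distribution.h:

```cpp
#include <Eigen/Dense>

// Sketch only: Sigma = diag(sigma_diag) + lambda * lambda^T, with q = lambda.cols().
double multi_normal_lpdf_woodbury(Eigen::VectorXd y, Eigen::VectorXd mean,
                                  Eigen::VectorXd sigma_diag,
                                  Eigen::MatrixXd lambda) {
  Eigen::VectorXd psi_inverse = sigma_diag.cwiseInverse();
  // Lambda^T * Psi^{-1}  (q x p, computed without forming any p x p matrix)
  Eigen::MatrixXd lt_psi_inv = lambda.transpose() * psi_inverse.asDiagonal();
  // q x q matrix I + Lambda^T * Psi^{-1} * Lambda and its lower Cholesky factor L
  Eigen::MatrixXd inner =
      Eigen::MatrixXd::Identity(lambda.cols(), lambda.cols()) + lt_psi_inv * lambda;
  Eigen::MatrixXd L = Eigen::LLT<Eigen::MatrixXd>(inner).matrixL();
  // wood_factor = L^{-1} * Lambda^T * Psi^{-1}
  Eigen::MatrixXd wood_factor = L.triangularView<Eigen::Lower>().solve(lt_psi_inv);
  // matrix determinant lemma: log|Sigma| = log|Psi| + 2 * sum(log(diag(L)))
  double cov_logdet = sigma_diag.array().log().sum() +
                      2.0 * L.diagonal().array().log().sum();
  return multi_normal_lpdf_woodbury_chol(y, mean, sigma_diag, wood_factor,
                                         cov_logdet);
}
```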
Then it becomes very easy to test that multi_normal_lpdf_woodbury returns the same value as stan::math::multi_normal_lpdf.
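Something along these lines should do it; a sketch only, assuming GoogleTest and the generic signature suggested above (the header name and the tolerance are illustrative):

```cpp
#include <Eigen/Dense>
#include <gtest/gtest.h>
#include <stan/math.hpp>

#include "distribution.h"  // assumed header for the new helpers, as referenced above

TEST(distribution, multi_normal_lpdf_woodbury_matches_stan) {
  int p = 20, q = 3;
  Eigen::VectorXd y = Eigen::VectorXd::Random(p);
  Eigen::VectorXd mean = Eigen::VectorXd::Random(p);
  Eigen::VectorXd sigma_diag =
      (Eigen::VectorXd::Random(p).array().abs() + 1.0).matrix();
  Eigen::MatrixXd lambda = Eigen::MatrixXd::Random(p, q);
  // Dense covariance Psi + Lambda * Lambda^T for the reference Stan evaluation
  Eigen::MatrixXd sigma = lambda * lambda.transpose();
  sigma.diagonal() += sigma_diag;
  double lpdf_wood = multi_normal_lpdf_woodbury(y, mean, sigma_diag, lambda);
  double lpdf_stan = stan::math::multi_normal_lpdf(y, mean, sigma);
  ASSERT_NEAR(lpdf_wood, lpdf_stan, 1e-8);
}
```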
One last question: is your implementation scaling linearly in p? I know that naively one would still scale quadratically in p.
I would say it scales linearly, but that would require a more careful complexity analysis; I don't think there is a single term that clearly dominates. My guess would be something like O(p q^2 n g), though I'm not sure about the n and g factors now that the likelihood evaluation is fast. Empirically, with p=5000 it takes about 10 times as long as with p=500. Using the cov_wood parametrization in the code there is no (p x q)(q x p) matrix multiplication (only the diagonal one, which is clearly faster), so p^2 should never appear in the complexity, but I should think about it more carefully to check the other parts of the algorithm.
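For what it's worth, a rough per-step cost sketch (only the Woodbury part, not the whole sampler), under the assumption that wood_factor and the log-determinant are computed once per state update and reused across data points:

- Lambda^T * Psi^{-1} * Lambda: O(p q^2)
- Cholesky factor L of the q x q matrix: O(q^3)
- wood_factor = L^{-1} * Lambda^T * Psi^{-1}: O(p q^2)
- log-determinant via the matrix determinant lemma: O(p + q)
- each call to multi_normal_lpdf_woodbury_chol: O(p q)

No p x p matrix is ever formed, so nothing scales as p^2, which is consistent with the roughly 10x time observed when going from p=500 to p=5000.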
I'm now going to implement these functions and test them against the stan implementation.
I implemented a test to check that multi_normal_lpdf_woodbury returns the same value as stan::math::multi_normal_lpdf, and I also manually checked, with some prints inside the likelihood evaluation in the hierarchy, that the function is working properly.
Just one last minor comment for the documentation, then it's good to go!
I just added the comment, thanks!
Thanks @giacomodecarlo for the great effort! Very nice contribution!
Good afternoon everyone, I just changed the code to get a much faster likelihood evaluation, as discussed in this post.
I attach a small PDF (Woodbury.pdf) describing what is now kept in the state. The speedup with p=500 (number of features) and q=5 (number of latent factors) is about 100x, and it grows as p grows. I still need to double-check the computations in the code, but it seems to be working already.
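For readers following along, a minimal sketch of the per-cluster quantities this reparametrization keeps in the state, using the names that appear in the snippets earlier in the thread (the struct itself is illustrative, not the actual state class; see Woodbury.pdf for the derivation):

```cpp
#include <Eigen/Dense>

// Illustrative only: caching these quantities makes each likelihood evaluation
// O(p*q) instead of the O(p^2) cost of working with the dense p x p covariance.
struct WoodburyStateCache {
  Eigen::VectorXd mu;           // cluster mean, length p
  Eigen::VectorXd psi_inverse;  // elementwise inverse of the diagonal noise Psi, length p
  Eigen::MatrixXd wood_factor;  // L^{-1} * Lambda^T * Psi^{-1}, size q x p
  double cov_logdet;            // log det(Psi + Lambda * Lambda^T)
};
```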