Closed TTitscher closed 2 years ago
If you apply the transformation Z = Q sqrt(e) X, where COV = Q e Q^T is the eigendecomposition, your likelihood term becomes (Q sqrt(e) X)^T COV^(-1) (Q sqrt(e) X) = X^T X (also zero mean, unit std and independent).
So the rotation with Q makes the variables independent (not required if they are already independent) and the scaling with the sqrt of the eigenvalues makes them unit std.
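A quick numpy sketch of this (my own notation, not code from the implementation): build T = Q sqrt(e) from the eigendecomposition of a correlated COV, then check that applying T^(-1) to N(0, COV) samples yields unit-std, independent variables.

```python
import numpy as np

rng = np.random.default_rng(0)

# A correlated covariance matrix and its eigendecomposition COV = Q diag(e) Q^T
COV = np.array([[4.0, 1.5],
                [1.5, 2.0]])
e, Q = np.linalg.eigh(COV)

# T = Q sqrt(e): samples X ~ N(0, I) mapped through T have covariance
# T T^T = Q diag(e) Q^T = COV, so T^(-1) whitens N(0, COV) samples.
T = Q @ np.diag(np.sqrt(e))

samples = rng.multivariate_normal(np.zeros(2), COV, size=200_000)
white = samples @ np.linalg.inv(T).T

print(np.cov(white.T))  # close to the identity: unit std and independent
```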
To deal with numerically large parameters, the current VB implementation scales the provided prior by its mean and infers these scaled parameters. The main benefit is that the entries of the Jacobians and of the precision are much closer to one.
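A minimal sketch of that mean-scaling (hypothetical names, just to illustrate the idea, not the actual API):

```python
import numpy as np

# Hypothetical prior with numerically large / small entries
prior_mean = np.array([2.0e8, 5.0e-3])
prior_std = np.array([2.0e7, 1.0e-3])

def to_scaled(theta):
    # parameters the inference actually sees
    return theta / prior_mean

def from_scaled(theta_scaled):
    # map the inferred result back to the physical scale
    return theta_scaled * prior_mean

# The scaled prior has mean one and an O(1) relative standard deviation,
# so Jacobian and precision entries stay close to one.
print(to_scaled(prior_mean))   # [1. 1.]
print(prior_std / prior_mean)  # [0.1 0.2]
```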
This is not done for numerically small parameters, as there has to be some kind of `eps` to avoid division by zero (for zero-mean parameters). Most of my problems are then solved for `eps = 1.e-20`, but the proper way would be to use some kind of scaling of the precision. For normal distributions, there is the zero-mean unit-variance transformation. How would that work for an MVN? I read something here where an eigenvalue decomposition of the precision/covariance is used,
COV = Q . EVs . Q^T, where the scaling with Q is performed. However, in our case COV is diagonal, so Q is the identity and there will be no scaling...?
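For what it's worth, a small numpy check of the diagonal case (my own sketch): `eigh` of a diagonal COV does return Q as (a permutation of) the identity, so the rotation part does nothing, but the sqrt-eigenvalue factor is still a nontrivial per-parameter scaling.

```python
import numpy as np

COV = np.diag([4.0, 0.25, 1.0e-8])  # diagonal: already independent parameters
e, Q = np.linalg.eigh(COV)

print(np.abs(Q))   # a (signed) permutation of the identity: no rotation
print(np.sqrt(e))  # the per-parameter standard deviations: still a scaling
```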