ratschlab / GP-VAE

TensorFlow implementation for the GP-VAE model described in https://arxiv.org/abs/1907.04155
MIT License

Questions about the learned covariance matrix #15

Closed: WeihanLikk closed this issue 2 years ago

WeihanLikk commented 2 years ago

Hi, I used the code to train GP-VAE on the PhysioNet dataset, but the learned covariance matrix looks like this (first batch, first latent dimension):

[image: learned posterior covariance matrix, first batch, first latent dimension]

where only the diagonal contains non-negligible values; all off-diagonal entries are close to zero, which differs from the prior matrix:

[image: prior covariance matrix]

My question is: can this covariance matrix actually smooth the latent dimensions over time? I randomly extracted some latent variables, and they do not look very smooth:

[image: randomly selected latent trajectories over time]
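For intuition, here is a small standalone illustration (not from the repo; all names are made up for this sketch) of why off-diagonal covariance matters for temporal smoothness: samples drawn with a squared-exponential covariance vary smoothly over time, while samples drawn with a diagonal covariance are uncorrelated and jagged:

```python
import numpy as np

T = 50
t = np.arange(T, dtype=float)

# squared-exponential (RBF) covariance vs. a purely diagonal covariance;
# small jitter keeps the RBF kernel numerically positive-definite
K_rbf = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2 / 5.0 ** 2) + 1e-8 * np.eye(T)
K_diag = np.eye(T)

rng = np.random.default_rng(1)
z_rbf = rng.multivariate_normal(np.zeros(T), K_rbf)
z_diag = rng.multivariate_normal(np.zeros(T), K_diag)

# lag-1 autocorrelation as a crude smoothness measure
for name, z in [("RBF", z_rbf), ("diagonal", z_diag)]:
    r = np.corrcoef(z[:-1], z[1:])[0, 1]
    print(f"{name} covariance: lag-1 autocorrelation = {r:.2f}")
```

A posterior whose covariance is effectively diagonal therefore cannot impose this kind of temporal correlation on z by itself.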

Besides, I managed to trace this behavior to line 117 of ./lib/models.py:

```python
prec_tril = prec_tril + eye  # add the identity to the lower-triangular precision factor for a stable inverse
```

I think the above code is there to keep the matrix inversion numerically stable, i.e. A^-1 ≈ (A + eps * I)^-1. But eps is effectively set to 1 here, which may be too large; could that be what causes the learned covariance matrix to have only diagonal values? However, when I change eps to a smaller value, the prec_tril matrix can sometimes become non-invertible.
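To see the effect, here is a minimal NumPy sketch (illustrative, not the repo's actual code) of how the size of eps in L + eps * I changes the resulting covariance Sigma = (L L^T)^-1: with eps = 1 the identity dominates the factor and Sigma comes out nearly diagonal, while a small eps preserves off-diagonal structure but risks an ill-conditioned inverse:

```python
import numpy as np

T = 8
rng = np.random.default_rng(0)

# bidiagonal lower-triangular factor with small entries, roughly what an
# encoder might output early in training (illustrative, not the repo's code)
band = np.eye(T) + np.eye(T, k=-1)
L = 0.1 * rng.normal(size=(T, T)) * band

for eps in (1.0, 1e-3):
    L_stab = L + eps * np.eye(T)            # the `prec_tril = prec_tril + eye` step, with eps in place of 1
    cov = np.linalg.inv(L_stab @ L_stab.T)  # Sigma = (L L^T)^{-1}
    ratio = np.abs(cov - np.diag(np.diag(cov))).max() / np.abs(np.diag(cov)).max()
    print(f"eps={eps:g}: max |off-diag| / max |diag| = {ratio:.3f}")
```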

So my other question is: is it possible to learn a covariance matrix that smooths/denoises the latent variables (i.e. has significant off-diagonal elements) using the precision-matrix parameterization presented in the paper?

Thanks!

dbaranchuk commented 2 years ago

Hi,

Yes, indeed, numerical stability is a problem there, and adding a diagonal matrix helps deal with it. However, in my opinion, it shouldn't be responsible for the near-zero off-diagonal values in the posterior covariance matrix.

To make z smoother, I would consider increasing the coefficient in front of the prior term (β) in the ELBO. A higher β should force the posterior distribution closer to the prior. However, as far as I remember, this led to worse validation performance.
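For reference, a minimal sketch of the β-weighting idea (illustrative names, not the repo's exact training code):

```python
import tensorflow as tf

# Sketch of a beta-weighted ELBO (hypothetical function, not the repo's API):
# beta scales the KL term between q(z|x) and the GP prior.
def beta_elbo(log_likelihood, kl_divergence, beta=1.0):
    # beta > 1 pushes q(z|x) toward the prior (smoother z), typically at
    # some cost in reconstruction / validation performance
    return log_likelihood - beta * kl_divergence

# toy usage with dummy per-sample values
ll = tf.constant([-10.0, -12.5])
kl = tf.constant([2.0, 3.0])
loss = -tf.reduce_mean(beta_elbo(ll, kl, beta=5.0))
print(float(loss))
```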

WeihanLikk commented 2 years ago

OK, thanks for your reply