Decomposing a MVN - Githubissues

kaitejohnson commented 4 months ago

Goal

I think (but could be way off base here) that the best way to do inference on the spatial(site-level) deviations $\eta$ from the log of the state $R(t)$ drawn from a MVN with covariance matrix $\Sigma$

$$\epsilon \sim MVN(0, \Sigma)$$

is to decompose the parts that are a function of the parameters and the data ($\Sigma) from the realized "noise" drawn from iid standard normal variables (e.g. $X \sim MVN(0, \mathbb{I})$ ).

This might be because I have done too much of this with vectors and not enough with matrices, but I think we want the analogous of the centered --> non-centered parameterization of:

$$\eta \sim N(0, \sigma)$$

which can be written as:

$$\eta = \sigma N(0,1)$$

So

$$\epsilon \sim MVN(0, \Sigma)$$

rewritten as:

$$ \epsilon =f(\Sigma) MVN(0, \mathbb{I}) $$

And based on no linear algebra skills of my own, @cbernalz has that:

$$X \sim MVN(0, \mathbb{I})$$

then

$$\sqrt(\Sigma)X \sim MVN(0, \Sigma)$$

So I think we can write:

$$ \epsilon =f(\Sigma) MVN(0, \mathbb{I}) = \sqrt(\Sigma)MVN(0, \mathbb{I}) $$

Let me know if this seems completely off base @dylanhmorris. Part of me is thinking that we should be using the Cholesky factorbut I think that would only be relevant if we didn't have $\Sigma$ as a direct function of scalar parameters and data, and instead were estimating elements of the covariance matrix directly.

damonbayer commented 4 months ago

You're on the right track. There is a relevant example in the Stan User Guide https://mc-stan.org/docs/stan-users-guide/efficiency-tuning.html#multivariate-reparameterizations

kaitejohnson commented 4 months ago

Ok TY @damonbayer that was exactly what I needed.

I think what we will want then is if $\epsilon_t$ are the drawn from an MVN with mean 0 and covariance $\Sigma$

$$\epsilon_t \sim MVN(\mu, \Sigma)$$

We can write this in terms of $\alpha_t$, where $\alpha_t \sim N(0,1)$ and has the same dimensions (number of sites) as $\epsilon_t$ then:

$\epsilon_t = \mu + L \alpha$

where $LL^T = \Sigma$

And since $\mu = 0$ because we are estimating the deviations between state and site $R(t)$ s we can just say:

$\epsilon_t = L \alpha$ $LL^T = \Sigma$

In stan, this will look like:

data{
matrix[n_sites, n_sites] = distances
}

parameters{
matrix [n_sites, n_times] alpha;
+ {corr_func_params};
}

transformed_parameters{
matrix [n_sites, n_times] epsilon;
matrix[n_sites, n_sites] L ;

Sigma = some_function_of_params_and_data(corr_func_params, distances);
L = choleskly_decompose(Sigma);
for(i in 1:n_times){
  epsilon[,i] = L*alpha[,i]
}

model{
to_vector(alpha )~ std_normal()
corr_func_params_priors ~ some distribs()
}

damonbayer commented 4 months ago

@kaitejohnson Having a hard time tracking the dimensionality of $\epsilon$ and $\alpha$. Could you check your writeup again?

cbernalz commented 4 months ago

@damonbayer They should have the same dimensionality. We give that $\alpha_{t}\sim\text{MVN}(0,I)$, for each time point. Then we can use the transformation from the link you provided to get $\epsilon_{t}\sim\text{MVN}(0,\Sigma)$. In the model block, to_vector(alpha)~std_normal(), I believe what @kaitejohnson is doing is letting every $\alpha_{i,j}\sim\text{Normal}(0,1)$, which should be the same as doing $\alpha_{t}\sim\text{MVN}(0,I)$. @kaitejohnson feel free to correct me in any way.

CDCgov / ww-inference-model

Decomposing a MVN #41

Goal