Open b-remy opened 2 years ago
Good news: I reproduced the experiment of Toy model 1 from Schneider et al. (2015), and both the MAP and Variational Inference are working :-) The only difference is that I use a fixed variance for the ellipticity prior, while they use a hierarchical prior. I will investigate this later.
Using a `MultivariateNormalTriL` for 100 Gaussian galaxies, constant shear = [0.05, -0.05] and sigma_pix = 0.003, here are some contours:
Note: `log_10(hlr)` is represented here.
Shear marginals:
To further validate this result, I would like to compare to HMC contours.
This issue is to track the development of a VI procedure, which is composed of two steps: (1) MAP estimation and (2) Variational Inference.
In order to validate the procedure, I will use the same model to generate the simulations and for inference. For the results, I will compare to both the expected values used to generate the sims and the posterior contours from an HMC procedure.
1. MAP
25 Gaussian light profile galaxies, constant shear.
Params: `hlr`, `e1`, `e2`, `gamma1`, `gamma2`
I aimed to get the MAP by running gradient descent on `- log p(hlr, e, gamma, obs)`, which equals the negative log posterior up to an additive constant. Since this objective is non-Gaussian, reaching the global minimum seems non-trivial, even with `Adam`. So my first strategy was to run 100 chains in parallel for 100 iterations of `Adam(lr=0.1)` and then 100 iterations of `Adam(lr=0.01)`.
[Figures: hlr from the ground-truth simulations, and hlr at the MAP]
Increasing the number of iterations didn't change how often the global optimum is reached...
Overall, the hlr parameters are always well fitted. However, since the shear and ellipticities are degenerate, convergence to the expected value is hard to obtain (see the two examples at the end of this post). I would say that maybe this is not a big issue, since we can continue to fit the mean during the VI part. Any other idea on this @EiffL?
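The multi-start strategy described above can be sketched in a framework-free way. This is only an illustration, not the code used for the results in this issue: the 1D `loss` is a hypothetical non-convex stand-in for `- log p(hlr, e, gamma, obs)`, the starting range is arbitrary, and resetting the Adam state between the two learning-rate phases is an assumption.

```python
import math
import random

def loss(x):
    # Hypothetical non-convex stand-in for the negative log joint:
    # one global minimum near x = -1.30 and one local minimum near x = 1.13.
    return x**4 - 3.0 * x**2 + x

def grad(x):
    # Analytic gradient of the toy loss above.
    return 4.0 * x**3 - 6.0 * x + 1.0

def adam(x, lr, n_steps, b1=0.9, b2=0.999, eps=1e-8):
    # Plain scalar Adam with bias correction; the state starts from zero.
    m = 0.0
    v = 0.0
    for t in range(1, n_steps + 1):
        g = grad(x)
        m = b1 * m + (1.0 - b1) * g
        v = b2 * v + (1.0 - b2) * g * g
        m_hat = m / (1.0 - b1**t)
        v_hat = v / (1.0 - b2**t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

def multi_start_map(n_chains=100, seed=0):
    # Run independent chains from random starts, then keep the lowest loss.
    rng = random.Random(seed)
    finals = []
    for _ in range(n_chains):
        x = rng.uniform(-2.0, 2.0)        # arbitrary starting range (assumption)
        x = adam(x, lr=0.1, n_steps=100)  # coarse phase
        x = adam(x, lr=0.01, n_steps=100) # fine phase, fresh Adam state (assumption)
        finals.append(x)
    best = min(finals, key=loss)
    return best, finals

best, finals = multi_start_map()
# Number of chains that ended in the basin of the global minimum (x < 0).
n_global = sum(1 for x in finals if x < 0)
print(f"best x = {best:.3f}, chains in global basin: {n_global}/{len(finals)}")
```

Selecting the chain with the lowest final loss only helps if at least one chain starts in the right basin, which is why more iterations per chain do not necessarily change how often the global optimum is found.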
2. Variational Inference
Once the MAP was obtained, I tried to fit the mean and covariance matrix of a multivariate normal distribution to the posterior (using `tfd.MultivariateNormalTriL`).
Optimization was done by maximizing the ELBO, i.e. `E[log p(z,x)] - E[log q(z)]`, where `q` is my surrogate posterior. Here is how I wrote the ELBO in practice:
Initialization:
`loc = z_MAP`, `scale = 0.1 * I_d` (`scale @ scale.T = cov`, so for this diagonal initialization `scale` is equivalent to the standard deviation).
Optimized with `Adam(lr=1e-3)` for 200 iterations, I get:
But what I find weird is the size of the shear contours around the MAP. The error bars are one order of magnitude too large.
However, the scale matrix for the shear seems one order of magnitude too low.
The posterior mean for the shear is `[0.06719618 0.01234444]`.
Here is another run of the MAP and VI computation, with a bigger error for the MAP.
So I don't get why I obtain such wide contours with the corresponding scale matrix, and why the scale matrix seems to be underestimated.
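One thing that may be worth checking when comparing contour widths to the scale matrix: `tfd.MultivariateNormalTriL` uses `cov = scale @ scale.T`, so the marginal standard deviation of parameter `i` is the norm of row `i` of `scale`, not its diagonal entry alone. The sketch below illustrates this with a made-up 3x3 lower-triangular matrix (the numbers are hypothetical and not taken from the runs above); whether this explains the discrepancy seen here is not established.

```python
import math

def covariance_from_tril(L):
    # cov = L @ L.T for a square lower-triangular matrix given as nested lists.
    n = len(L)
    return [[sum(L[i][k] * L[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

def marginal_std(L):
    # sqrt(diag(L @ L.T)): the Euclidean norm of each row of L.
    return [math.sqrt(sum(x * x for x in row)) for row in L]

# Hypothetical scale matrix: the last row has a small diagonal entry
# but sizeable off-diagonal entries coupling it to earlier parameters.
L = [
    [0.10,  0.00, 0.000],
    [0.08,  0.02, 0.000],
    [0.09, -0.03, 0.005],
]

cov = covariance_from_tril(L)
stds = marginal_std(L)
for i, row in enumerate(L):
    print(f"param {i}: diag(scale) = {row[i]:.3f}, marginal std = {stds[i]:.3f}")
```

In this toy case the last diagonal entry is 0.005 while the marginal standard deviation is 0.095, so reading the diagonal of `scale` alone would understate the marginal width by more than an order of magnitude.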
Is the multivariate normal distribution inappropriate? The HMC contours seemed well suited to a Gaussian approximation.
Or am I making a mistake in the VI optimization?
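As a reference point for checking the optimization, here is a minimal sketch of a Monte Carlo ELBO estimate, `E_q[log p(z,x) - log q(z)]`, using reparameterized samples from a 1D Gaussian `q`. It is not the implementation from the notebook: `log_joint` is a hypothetical Gaussian stand-in for `log p(z,x)`, and the numbers (mean 0.05, std 0.02) are chosen only for illustration.

```python
import math
import random

def log_normal_pdf(z, mean, std):
    # Log density of a 1D normal distribution.
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - 0.5 * ((z - mean) / std) ** 2)

def log_joint(z):
    # Hypothetical stand-in for log p(z, x): a normalised Gaussian in z,
    # so the ELBO equals -KL(q || p) and is at most 0.
    return log_normal_pdf(z, 0.05, 0.02)

def elbo_estimate(loc, scale, n_samples=1000, seed=0):
    # Monte Carlo estimate of E_q[log p(z, x) - log q(z)].
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        z = loc + scale * eps  # reparameterised draw z ~ q = N(loc, scale^2)
        total += log_joint(z) - log_normal_pdf(z, loc, scale)
    return total / n_samples

elbo_at_target = elbo_estimate(loc=0.05, scale=0.02)  # q matches the target
elbo_too_wide = elbo_estimate(loc=0.05, scale=0.2)    # q ten times too wide
print(f"ELBO (q = target): {elbo_at_target:.3f}")
print(f"ELBO (q too wide): {elbo_too_wide:.3f}")
```

With a correctly written objective, a surrogate that is ten times too wide scores a much lower ELBO than one matching the target, so persistently over-wide contours after optimization may point to the objective, the sample count, or the number of iterations rather than to the Gaussian family itself.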
Here is a full notebook for more details.